Re: [dev] The mysterious 31

From: Kris Maglione <maglione.k_AT_gmail.com>
Date: Tue, 3 Aug 2010 23:03:50 -0400

On Tue, Aug 03, 2010 at 09:53:27PM -0400, Jacob Todd wrote:
>In K&R, chapter 6, section 6, there is a funtion called hash that hashes a
>string, which will be stored in a linked list. The function in question is on
>page 144, but here it is for those of you who don't have K&R handy.
>
> /* hash: form hash value for string s */
> unsigned
> hash(char *s)
> {
> unsigned hashval;
>
> for(hashval = 0; *s != '\0'; s++)
> hashval = *s + 31 * hashval;
> return hashval % HASHSIZE;
> }
>
>So what is the purpose of the magic 31? The only thing that I think might be a
>reference to what it may be for is the paragraph prior, which states
>
> The hashing function, ..., adds each character value in the string to a
> scrambled combination of the previous ones and returns the remainder
> modulo the array size.
>
>Does the magic 31 have to do with scrambling?

It's just an arbitrary number that happens to perform well in
that particular hash function. There isn't much science to hash
functions, but some of them prove to have good performance and
good distribution, others, not so. I tend to use Dan Bernstein's
hash function, which uses 33. It's not magic, it's just been
shown to work well. Although, in this case, I suspect that it
was chosen because it optimizes to a particularly fast
addition/binary shift, but so do a lot of other choices.

static ulong
hash(const char *str) {
        ulong h;

        h = 5381;
        while (*str != '\0') {
                h += h << 5; /* h *= 33 */
                h ^= *str++;
        }
        return h;
}

-- 
Kris Maglione
You talk when you cease to be at peace with your thoughts.
	--Khalil Gibran
Received on Wed Aug 04 2010 - 05:03:50 CEST

This archive was generated by hypermail 2.2.0 : Wed Aug 04 2010 - 05:12:02 CEST