Re: [dev] The mysterious 31

From: David Tweed <david.tweed_AT_gmail.com>
Date: Wed, 4 Aug 2010 05:01:56 +0100

On Wed, Aug 4, 2010 at 2:53 AM, Jacob Todd <jaketodd422_AT_gmail.com> wrote:
> In K&R, chapter 6, section 6, there is a funtion called hash that hashes a
> string, which will be stored in a linked list. The function in question is on
> page 144, but here it is for those of you who don't have K&R handy.
>
>        /* hash:  form hash value for string s */
>        unsigned
>        hash(char *s)
>        {
>                unsigned hashval;
>
>                for(hashval = 0; *s != '\0'; s++)
>                        hashval = *s + 31 * hashval;
>                return hashval % HASHSIZE;
>        }
>
> So what is the purpose of the magic 31? The only thing that I think might be a
> reference to what it may be for is the paragraph prior, which states
>
>        The hashing function, ..., adds each character value in the string to a
>        scrambled combination of the previous ones and returns the remainder
>        modulo the array size.
>
> Does the magic 31 have to do with scrambling?
>
It's also worth remembering that K & R was written at a time many
decades ago when performance aspects of computer architecture were a
lot, lot simpler. Apparently they have

#define HASHSIZE 101

which given that there's no really efficient way of computing % for
arbitrary numbers is going to be quite slow (particularly if the
strings are short), which is why hashes for modern machines use table
sizes that avoid needing a mod. (There are other things that are slow
on modern architectures that modern hash functions avoid.) I'd use K&R
for the C syntax and some of the higher level ideas of programming,
but not try to understand good hashing technology from there.

-- 
cheers, dave tweed__________________________
computer vision reasearcher: david.tweed_AT_gmail.com
"while having code so boring anyone can maintain it, use Python." --
attempted insult seen on slashdot
Received on Wed Aug 04 2010 - 06:01:56 CEST

This archive was generated by hypermail 2.2.0 : Wed Aug 04 2010 - 06:12:02 CEST