Re: [dev] The mysterious 31

From: David Tweed <david.tweed_AT_gmail.com>
Date: Wed, 4 Aug 2010 05:19:03 +0100

On Wed, Aug 4, 2010 at 5:01 AM, David Tweed <david.tweed_AT_gmail.com> wrote:
> On Wed, Aug 4, 2010 at 2:53 AM, Jacob Todd <jaketodd422_AT_gmail.com> wrote:
> It's also worth remembering that K & R was written at a time many
> decades ago when performance aspects of computer architecture were a
> lot, lot simpler. Apparently they have
>
> #define HASHSIZE 101
>
> which given that there's no really efficient way of computing % for
> arbitrary numbers is going to be quite slow (particularly if the
> strings are short), which is why hashes for modern machines use table
> sizes that avoid needing a mod. (There are other things that are slow
> on modern architectures that modern hash functions avoid.) I'd use K&R
> for the C syntax and some of the higher level ideas of programming,
> but not try to understand good hashing technology from there.

A minor clarification: looking at the gcc assembly if you've hardcoded
the table size it figures out some magic constants so that it takes
around 8 bit-twiddling and subtraction instructions to do the mod
operation, which is slow but not that slow. If you pass the table size
as a variable, which you would in "serious" code you get integer
division based code, which is going to be "quite slow".

But the point about K & R being basically a syntax and high-level
strategy book these days stands.

-- 
cheers, dave tweed__________________________
computer vision reasearcher: david.tweed_AT_gmail.com
"while having code so boring anyone can maintain it, use Python." --
attempted insult seen on slashdot
Received on Wed Aug 04 2010 - 06:19:03 CEST

This archive was generated by hypermail 2.2.0 : Wed Aug 04 2010 - 06:24:03 CEST