Re: Bob Jenkins' hashing implementation in Ruby
From:
Mauricio Fern疣dez <batsman.geo@...>
Date:
2003-03-01 12:59:06 UTC
List:
ruby-core #893
On Sat, Mar 01, 2003 at 08:42:40PM +0900, ts wrote: > >>>>> "M" == Mauricio Fern疣dez <Mauricio> writes: > > M> --- ruby.jenkins.orig/gc.c 2003-02-28 17:16:13.000000000 +0100 > M> +++ ruby.jenkins/gc.c 2003-02-28 18:40:11.000000000 +0100 > [...] > M> --- ruby.jenkins.orig/st.c 2003-02-28 17:16:14.000000000 +0100 > M> +++ ruby.jenkins/st.c 2003-02-28 19:18:15.000000000 +0100 > > Apparently, you have not modified string.c (rb_str_hash()) it's normal ? I only changed the hash in st.c (and propagated side effects in num_bins to other files). I thought that strhash() was always being used (and was wondering about what would happen to a string with '\0' inside...) but now I see that I was wrong and in fact rb_any_hash is used which itself calls to rb_str_hash. The strange thing is that, even though the new hash function was not being used (and BTW, it turned out it's not Jenkins but "one-at-a-time"), I got some speed increase in simple benchmarks. I believe it's because I got rid of the primes and modulus calculations and replaced it with bitwise and... But then, why was that introduced in the first place if it is slower? Perhaps it was needed for some of the "worse" hashing functions, but why would we want to keep them now? I'm changing string.c and benchmarking again. I will try the "one-at-a-time" hash and Jenkins (it's possible here as we know the length of the string) to see what's better. Stay tuned :) -- _ _ | |__ __ _| |_ ___ _ __ ___ __ _ _ __ | '_ \ / _` | __/ __| '_ ` _ \ / _` | '_ \ | |_) | (_| | |_\__ \ | | | | | (_| | | | | |_.__/ \__,_|\__|___/_| |_| |_|\__,_|_| |_| Running Debian GNU/Linux Sid (unstable) batsman dot geo at yahoo dot com Linux: the operating system with a CLUE... Command Line User Environment. -- seen in a posting in comp.software.testing