From: Daniel James (daniel_at_[hidden])
Date: 2005-03-11 07:56:25
Tobias Schwinger wrote:
> I am missing a way to parametrize how hash values are combined.
> This is especially desirable because the skeleton could then be
> exploited to implement double hashing for compound data (a comon
> technique to avoid collisions by varying the hash function) and can be
> useful in other situations (e.g. a small hash table in memory vs. a
> large one on disk).
I think if you want to combine hash values in a different manner, it's
best just write your own version of hash_combine. It could save a little
code if we supplied a version of hash_range that works for different
combining functions, but I don't think it'll be that useful (and would
require that your hash_combine function had the same signature as ours,
you might want to do it differently).
> - What is your evaluation of the documentation?
> Good. Easily understandable.
> Nitpicking: a hash table is not required to store its elements
> in buckets in general - this is a property of the most common way to
> implement hash tables.
Good point. I've been spending too much time looking at the draft
standard for hash based containers (which require buckets).
> - What is your evaluation of the implementation?
> Here are my notes from reading the source:
> The implementation of hash_combine does not need conditional jumps as a
> probably slightly slower variant, which I've seen quite often does. I
> know this variant works empirically well in practice, so will this one,
> I figure.
> However, I believe that hashing speed is generally overrated and that
> there are numerous situations where it's worth to trade some of it for
> fewer collisions (as memory access generally gives more room for
> performance optimization than CPU instructions do).
I do need to spend more time looking into alternative hashing algorithms
for a future version.
> One idea that immediately came to my mind was some "syntactic sugar" to
> ease hashing of class members:
> std::size_t hash_value(point const& p)
> return (boost::hash_value = p.x, p.y, p.z);
> Note: It's just a spontanous idea I am not convinced of myself...
A hash function for boost::tuple would be a good way to do this,
although something along those lines might be more efficient.
> The for loop in hash_float_value is a nice one to be unrolled via
Yes, for anyone who hasn't looked at the code it loops a fixed number of
times based on the accuracy of the float type. Compilers could unroll
that loop themselves, but that's probably too much to expect.
> - Do you think the library should be accepted as a Boost library?
> Yes. Though there should be some research if it is possible to add to
> its genericity as mentioned in the first paragraph (just saw the latest
> discussions and seems we're already at it).
Thanks for your review.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk