Boost logo

Boost :

Subject: Re: [boost] [hash] regular behaviour of hash function for double values
From: Daniel James (dnljms_at_[hidden])
Date: 2012-02-01 03:59:56


On 1 February 2012 00:44, Topher Cooper <topher_at_[hidden]> wrote:
>
> values that differ by the table size end
> up in the same bin

Is that likely though? And any more likely than other causes of collisions?

> and values that differ by a small amount end up in
> nearby bins -- which can cause excess collisions for data with some
> corelational structure.

The STL hash containers are pretty much required to be implemented
using a chained hash table.

> Lets turn one of the arguments made on its head.  Why should someone who
> finds it convenient to use a reasonable quality hash on their data type pay
> the overhead of using a container that compensates for the possibility of a
> poor one?  Is it a more reasonable expectation that someone who wishes a
> different balance of quality vs performance then the default should rewrite
> the standard hash or rewrite the standard unordered_map?

There are very good alternative open source implementations out there.
You shouldn't need to rewrite anything.

> Sorry for the length -- I had a lot to say, I guess.

Why do you think I said I didn't want to get into this debate? I do
realise that there's a lot to say, and I don't have any enthusiasm for
saying it. I wrote the thing 7 years ago.

IMO a more appropriate forum would be one of the C++ newsgroups, as
this is more about how the standard should be implemented, rather than
my particular implementation which should really become obsolete over
the coming years. Boost.Hash's advantages over standard
implementations are its portability and its customisation abilities.
The former will hopefully become less relevant, the latter doesn't
seem to have been that popular. In other respects the standard
implementations should be better. I'm not going to compete with them.

If anyone does want to give it a shot, I'd recommend starting from
scratch (probably using existing hash algorithms) or using something
like libc++'s implementation as a starting point. Most of the effort
that's gone into Boost.Hash was to get it working with older
compilers, and the implementation has been wrapped around that.
There's also the problem with implicit casts which is something of a
flaw in the customisation mechanism, so that could do with a rethink.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk