Boost logo

Boost :

From: Peter Dimov (pdimov_at_[hidden])
Date: 2005-03-13 14:04:53


Topher Cooper wrote:
> On Mar 13, 2005, at 11:13 AM, Peter Dimov wrote:
>> implementation to which you port your program. Your original tests
>> are still valid.
>>
>> The other side of predictability - that the hash function is not
>> affected by types or the particular representation of a sequence of
>> values - means that you don't necessarily have to re-run the
>> performance benchmark of your unordered_maps when you refactor the
>> key to use int[3] instead of struct { short, short, short }, or a
>> wstring instead of a string.
>>
>
> Fundamentally, what makes for a good hash function is that things that
> are equal (however you are defining equality) always hash to the same
> value, while any small difference between two things should cause them
> to hash to completely different values. In an arbitrary program would
> I expect an int[3] to be equal to a struct { short, short, short }
> when the values happened to be equal? No. What if [...]

The intent of the proposal is to provide a reasonable default when
tr1::unordered_map is used (and to provide the tools to build such a
reasonable default for user-defined compound types). It is not an attempt to
build the ultimate hash function (a hopeless task).

Since the keys in an unordered_map are of the same type, it is usually not
possible for one of them to contain int[3] and for another to hold struct
{ short, short, short }.

A discriminated union type (boost::variant) that is able to hold either type
can easily hash_combine the index of the currently active type to avoid
these accidental collisions. Similarly, a boost::any-alike can hash_combine
the address of a structure that uniquely identifies the type, or
typeid().name(). This mirrors their implementation of operator==.

I agree that the current design may not be well suited for a general purpose
hash function that needs to be able to sort objects of arbitrary types into
buckets. I'm not sure how often this occurs in practice. Do you have
examples?


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk