Boost logo

Boost Users :

Subject: Re: [Boost-users] Hash collision for pairs of int
From: Daniel James (daniel_at_[hidden])
Date: 2013-11-27 15:10:04


On 27 November 2013 18:48, Igor R <boost.lists_at_[hidden]> wrote:
>> I'm experiencing a lot of collisions using the hash_combine function in functional/hash/hash.hpp for integer-pairs-as-keys when there are few unique ints in each of the pair's positions (put together the keys themselves are unique of course).
>>
>> I've uploaded a simple enough test case here: http://www.speedyshare.com/Wcem6/hash.cc.gz. If there's a more preferred way to upload, let me know and I'll reupload. With 400K pairs, this returns only 86K unique indices and has a max of 18 collisions in size_t.
>>
>> Am I doing something really wrong here?
>
>
> Perhaps your case is similar to this one:
> http://stackoverflow.com/questions/19966041/getting-too-many-collisions-with-hash-combine

I really need to reconsider the implementation of hash_combine. It was
based on the one in Peter Dimov's paper which hadn't really been
tested with general values. Part of the idea was the hash would behave
similarly on different machines, but I don't think that's really a
desirable goal. It probably should use a specialised implementation
for common sizes of std::size_t (i.e. 32, 64 and 128 bit).

It might even be worth changing to an interface that's more amenable
to more sophisticated hash functions, e.g. one with more state, but I
don't really want to introduce a breaking change at the moment. I
suppose a clever implementation could allow the two to coexist.


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net