Boost logo

Boost Users :

Subject: Re: [Boost-users] boost::hash and string conflicts
From: Erik Scorelle (escorelle.work_at_[hidden])
Date: 2011-04-07 19:38:38


Thanks for the quick responses!

The solution to our particular problem ended up being very simple: just a
few tweaks to the boost::unordered_map functors. I just wasn't quite
thinking about it clearly.

Also, that Dr. Dobbs article was a helpful read.

Cheers

On Thu, Apr 7, 2011 at 2:08 PM, Ovanes Markarian <om_boost_at_[hidden]>wrote:

>
>
> On Thu, Apr 7, 2011 at 7:56 PM, Scott McMurray <me22.ca+boost_at_[hidden]>wrote:
>
>> On Thu, Apr 7, 2011 at 10:08, Erik Scorelle <escorelle.work_at_[hidden]>
>> wrote:
>> > We have been using boost hash to hash filenames, but have found with
>> some of
>> > our user data that certain strings will produce the same hash code (
>> > "0012g6" and "0012fu" for example). Is there a recommended way to
>> predict
>> > or resolve these sorts of conflicts?
>> >
>>
>> Having some collisions is the expected behaviour of a
>> (non-cryptographic) hash function. With a birthday search you can
>> easily find thousands of examples.
>>
>> I'm not sure what you mean by "resolve". Normally, code using hashes
>> expects collisions, and uses a full equality operator to resolve them.
>>
>> If you cannot accept any collisions, you could use perfect hashing
>> (libraries like <http://cmph.sourceforge.net/> or
>> <http://www.gnu.org/software/gperf/gperf.html>) or use a cryptographic
>> hash (I have a usable, but WIP library at
>> <http://svn.boost.org/svn/boost/sandbox/hash/>).
>>
>> ~ Scott
>>
>
> Scott,
>
> AFAIK perfect hashing works only with the known set of keys, otherwise it
> is not possible.
> Just to make a note: crypto-hashes are much slower as normal ones and use
> much more memory. Erik, how many file names are you going to hash? There was
> an article on Dr.Dobbs which compared unordered_map (hash table)
> implementation with std::map. And it turned out that until some certain
> amount of items it makes no sence to use a hash table, because a normal map
> is much faster.
>
> Here is the link: http://drdobbs.com/cpp/198800559?pgno=1
>
>
> With Kind Regards,
> Ovanes
>
>
> _______________________________________________
> Boost-users mailing list
> Boost-users_at_[hidden]
> http://lists.boost.org/mailman/listinfo.cgi/boost-users
>



Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net