Boost logo

Boost :

Subject: Re: [boost] [hash] regular behaviour of hash function for double values
From: Howard Hinnant (howard.hinnant_at_[hidden])
Date: 2012-02-01 10:57:18


On Jan 31, 2012, at 3:17 PM, Andrey Semashev wrote:

> On Tuesday, January 31, 2012 20:06:23 Daniel James wrote:
>>>
>>> From libc++:
>> template <>
>> struct _LIBCPP_VISIBLE hash<int>
>>
>> : public unary_function<int, size_t>
>>
>> {
>> _LIBCPP_INLINE_VISIBILITY
>> size_t operator()(int __v) const _NOEXCEPT {return
>> static_cast<size_t>(__v);}
>> };
>>
>> Last time I tried it, libstdc++ was similar.
>
> Too bad. I really hope these libraries will get improved eventually.
> Otherwise, std::hash is useless aside from std::unordered* containers.

You've alluded to the crux of the issue:

Is std::hash<scalar> a general purpose hash function? Or is it the best default hash function for the std::unordered containers implemented in this same library?

The standard is silent on this question.

A good general purpose std::hash<scalar> will be more expensive, but will likely work well in any application.

A std::hash<scalar> optimized to work with the std::unordered containers will take advantage of knowing the details of those containers, and can be made faster than a general purpose hash function.

As a judgement call, I calculated that I would have *many* more clients who are casual users of the std::unordered containers, defaulting most options such as the hash function, than I would have clients interested in using std::hash<scalar> in their own containers, or in other places a hash function might be useful.

So I stand by the current implementation of std::hash<int> in libc++ as that which will give the vast majority of my clients the highest quality possible. libc++ currently contains both CityHash and murmur2 as well (the former explicitly contributed by Google and properly credited in the CREDITS.TXT file in the libc++ root). And these are chosen for some of the other scalars and in some other situations.

The current hash<int> isn't a product of ignorance or laziness on this subject. It is a carefully chosen solution based on the assumption that the main purpose of std::hash<int> is to serve as a good default for the std::unordered containers.

Implications of that assumption: If you're in need of a hash function for some need other than a good match for the std::unordered containers, std::hash may not be the solution you're looking for. The only std::solutions I'm aware of that might come close to fitting such a need are found in <random>. Non-portable solutions may be found in individual std::lib implementations such as libc++ by searching for things like "cityhash" or "murmur2".

Howard


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk