Boost logo

Boost :

From: Peter Dimov (pdimov_at_[hidden])
Date: 2005-04-02 06:03:52


First of all, nice work!

On headers:

* boost::hash defined in <boost/functional/hash.hpp>

Usually (when the number of top-level headers is small) we follow the rule
that boost::X is defined in <boost/X.hpp>. This makes it much easier for
users to remember what headers need to be included.

I think that hash.hpp needs to be a top-level header.

* hash/hash.hpp contains hash_range overloads for Range objects

I don't recall this being discussed during the review.

This introduces a dependency on the range headers. Given that std::pair is
separated in its own header, this is a bit odd, especially when the range
headers include <utility>.

But what's more important, these overloads are unnecessary. The proper
protocol to obtain the hash value of an object x is hash_value( x ), not
hash_range( x ). Consider the case std::vector<X>. If X does not support
hash_value, hashing the vector will fail, even in the case where hash_range
could have been used for X.

If a range object is not supported by hash_value, we just need to add a
hash_value overload for it.

Summary: these overloads need to be removed.

* hash.hpp does not contain overloads for built-in arrays

This is a case that I missed when I prepared the hash_value proposal. We
need to add them.

* boost::hash derived from std::unary_function

This introduces a completely unnecessary dependency on <functional>.
boost::hash only needs to define result_type and argument_type.

* Header separation

Reviewers have expressed concerns that including all standard container
headers would incur a significant hit. Is this the case in practice?

Currently, hash/hash.hpp results in 44247 lines, approx. 512K. The top-level
hash.hpp results in 51096 lines, approx. 645K. In my opinion, this does not
constitute an unacceptable increase.

In addition, defining all std-related overloads before hash_combine will
eliminate the need for the call_hash workaround.

* The pointer overload for hash_value should probably be changed to take a
void*

The reason for this is that on some platforms an int* and a void* to the
same object may have different representations. On the other hand, this will
disable the overload for function pointers. Maybe we need to leave the
signature as-is but cast object pointers to void cv * internally. Or maybe
the current implementation is good enough. Base* and Derived* to the same
object may generally hash to different values anyway.

On tests:

* Tests do not test the specification

The tests are legitimate, but overly general. They test only the broad
hash<>::operator() requirement, not a specific value. But our specification
states precisely what hash value is to be expected (pointers and floats
aside).

hash<X>( x ) needs to be tested against hash_value( x ), and hash_value( x )
needs to be tested against its expected value.

* custom_test

The portable definition of hash_value for custom objects is something along
these lines:

namespace test
{

struct custom
{
#ifndef BOOST_NO_ARGUMENT_DEPENDENT_LOOKUP

    friend inline size_t hash_value { return ... }

#else

    size_t hash() const { return ... }

#endif
};

} // namespace test

#ifdef BOOST_NO_ARGUMENT_DEPENDENT_LOOKUP

namespace boost
{

size_t hash_value( test::custom const & x )
{
    return x.hash();
}

} // namespace boost

#endif

This is a bit more verbose but demonstrates the recommended approach for the
two major cases, with or without argument dependent lookup.

Thanks for reading this far,

--
Peter Dimov
http://www.pdimov.com 

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk