Boost logo

Boost :

From: Dave Harris (brangdon_at_[hidden])
Date: 2005-03-15 11:20:14


In-Reply-To: <001201c527e4$7405efa0$6501a8c0_at_pdimov2>
pdimov_at_[hidden] (Peter Dimov) wrote (abridged):
> > In my experience, if you want two different things to hash the same,
> > you need to write their hash functions with that in mind. It is a
> > special case which justifies not using the default hash function.
>
> Maybe, but this is not an argument for making the hash values different.

True. Arguments for that have been given elsewhere. One argument is to
maximise variation (and for types like bool it is desperately needed).

Another is to allow treating of 32-bit values as bytes. For example,

    size_t hash_range( const int *first, const int *last ) {
        return hash_range(
            static_cast<const unsigned char *>(first),
            static_cast<const unsigned char *>(last) );
    }

may be a good implementation on some platforms. It does 4 times as many
mixing steps (assuming sizeof(int) == 4) as the proposed algorithm. It is,
I think, how the designer of the proposed hash_combine algorithm expected
it to be used.

However, it yields hash values for int * different to the values for char
*, so with the current design aims we can't use it.

> > But doesn't this mean that hash_value( pair<int,int> ) also needs a
> > seed argument, if you want it to match hash_range( int[2] ) ?
>
> No. The seed version of hash_range is used when you want to compute the
> hash function of a sequence, but you don't have access to the whole
> sequence at once. It's goal is not to provide a way to change the
> initial seed, but to make it possible to continue from an intermediate
> seed.

But suppose I have pair< pair<int,int>, pair<int,int> > ? Shouldn't that
yield the same hash value as int[4] ?

-- Dave Harris, Nottingham, UK


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk