Boost logo

Boost :

From: Andrey Semashev (andrey.semashev_at_[hidden])
Date: 2024-12-10 14:13:53


On 12/10/24 16:09, Andrey Semashev wrote:
>
> The documentation should provide a more detailed description of the
> HashAlgorithm::result_type type. Specifically, what interface and
> operations it must support. Ideally, the library should provide a
> concept, similar to HashAlgorithm. Currently, the documentation only
> says it can be an unsigned integer type or a "std::array-like" type,
> where the latter is rather nebulous. From the implementation of
> get_integral_result it looks like it must support data() member function
> returning a pointer and that its size() after default-construction is a
> constant expression and returns at least 8. This, for example, excludes
> std::vector, which would be useful for HashAlgorithmXOF::result_type.
>
> Also on the topic of get_integral_result for "std::array-like" hash
> values, a few comments:
>
> - It requires the static size of 8 regardless of the size of the
> requested integer, so even if uint32_t is requested it will still
> require at least std::array<unsigned char, 8>. This should not be necessary.
> - It ignores the data beyond the initial 8 bytes. It might be ok for the
> purpose of slicing the hash value, but it is inconsistent with
> get_integral_result for integers, which attempts to mix bits through
> multiplications with a constant. I think, the implementation should be
> consistent between arrays and integers. The approach of universally
> mixing bits for all hash value types seems safer as it produces better
> results for hash values with non-uniform distribution of high-quality
> bits across the hash value. It might be possible to reuse one of the
> hash algorithms implemented in the library for this bit mixing.
>
> So I think, get_integral_result and HashAlgorithm::result_type should be
> better defined by the library. It should be possible for the user to
> provide his custom type as HashAlgorithm::result_type, and for that (a)
> there must be a list of requirements for that type to satisfy, and (b)
> it should be clear how an integer will be derived from it by
> get_integral_result. Maybe the user should be able to define his own
> get_integral_result behavior for his digest type. The reason why I think
> user-defined hash value types are useful is type safety, as it may be
> desirable to associate the digest with the algorithm that produced it.

Also, since I already suggested a few renames in my review, I think
renaming result_type to digest_type would make sense. result_type is
traditionally used as the result type of a function object, which
HashAlgorithm isn't. And the result of a hash function is commonly
called a digest.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk