From: Vinnie Falco (vinnie.falco_at_[hidden])
Date: 2024-12-07 01:29:11
On Fri, Dec 6, 2024 at 4:24 PM Ivan Matek via Boost <boost_at_[hidden]>
wrote:
> Goal was to "encourage" users in 2 ways to call result() only once. First
> way is that move signals that value is "tainted", second is that clang-tidy
> can detect double moves sometimes.
>
I propose to change HashAlgorithm requirements as follows:
---
HashAlgorithm

result() is renamed to

    result_type finalize();

The current documentation for finalize [1] is moved elsewhere and replaced with the following text:

This function shall return the final hash value of the input message, where the input message is defined by the ordered sequence of bytes provided in all prior calls to update(). The behavior of subsequent calls to finalize() is undefined unless specified by the HashAlgorithm.
---

Rationale:

The name "finalize" is closer to established practice and is more suggestive of the typical state mutation:

https://github.com/openssl/openssl/blob/5fce85ec52a826d53665552b50e67f86c92dc394/include/openssl/sha.h#L76

The existing Hash2 documentation for result() is too specific and suggests operations which may not be relevant, such as for FNV-1a or other byte-oriented algorithms. It would be better to state only the mandatory requirements and leave the rest to the implementation.

The existing Hash2 documentation suggests the possibility of calling finalize() twice or more in a row, yet this can only be considered safe, secure, or otherwise in line with best practices on a case-by-case basis, depending on the HashAlgorithm.

The HashAlgorithm concept is first and foremost designed for use with the Hash named requirement (std::hash and related). Making repeated calls to a generic finalize() undefined does not hinder this use case.

There has been no research into whether HashAlgorithm should be held up as the concept for how ALL hash algorithms should be modeled. Therefore the named requirements for HashAlgorithm's finalize() function should only go as far as needed to support the Hash use case, and we should leave the rest of its behavior up to the implementer.

In particular I think there is danger here:

    template< typename HashAlgorithm >
    auto double_finalize( HashAlgorithm& h )
    {
        h.finalize();
        return h.finalize();
    }

This is dangerous because the usage in generic contexts results in an unpredictable quality of result.
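
To make the proposed wording concrete, here is a minimal sketch of an algorithm that meets only the mandatory requirement. The names fnv1a_32_sketch and hash_bytes are hypothetical and are not taken from Hash2; the constants are the standard 32-bit FNV-1a offset basis and prime. The generic helper calls finalize() exactly once, which is all the proposed text permits in generic code.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical minimal algorithm written against the proposed wording.
// This is an illustration only, not Hash2's own FNV-1a implementation.
class fnv1a_32_sketch
{
public:
    using result_type = std::uint32_t;

    // Appends n bytes to the input message, in order.
    void update( void const* data, std::size_t n )
    {
        auto const* p = static_cast< unsigned char const* >( data );
        for( std::size_t i = 0; i < n; ++i )
        {
            state_ ^= p[ i ];
            state_ *= 0x01000193u; // 32-bit FNV prime
        }
    }

    // Returns the hash of every byte passed to update() so far.
    // This sketch documents nothing about a second call, so under the
    // proposed wording generic callers must not rely on one.
    result_type finalize()
    {
        return state_;
    }

private:
    std::uint32_t state_ = 0x811C9DC5u; // 32-bit FNV offset basis
};

// Generic helper (hypothetical) that calls finalize() exactly once.
template< typename HashAlgorithm >
typename HashAlgorithm::result_type hash_bytes( void const* data, std::size_t n )
{
    HashAlgorithm h;
    h.update( data, n );
    return h.finalize();
}
```

Because the input message is defined as the ordered sequence of bytes across all update() calls, feeding "foo" and then "bar" should give the same value as feeding "foobar" in one call.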
We should use the stricter definition I provided above for now, and only loosen it in the future if there is evidence that doing so yields a net benefit. It is always easier to go from strict to loose; going from loose to strict after the fact is difficult and often impossible without breaking things.

It is better if double_finalize is undefined. Users who want random numbers or whatever can do it with a specific implementation of HashAlgorithm which offers additional guarantees for finalize().

[1] https://pdimov.github.io/hash2/doc/html/hash2.html#hashing_bytes_result
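
One way the "specific implementation with additional guarantees" idea might look in code is sketched below. Everything here is an assumption made for illustration: the trait finalize_is_repeatable, the class counter_hash_sketch, and the helper next_two_values are hypothetical names, and nothing like them is claimed to exist in Hash2. The point is only that repeated finalize() becomes an opt-in property of a particular algorithm, so generic code that needs it fails to compile for algorithms that make no such promise.

```cpp
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical opt-in trait: false unless an algorithm explicitly
// documents that finalize() may be called more than once.
template< typename HashAlgorithm >
struct finalize_is_repeatable : std::false_type {};

// Hypothetical algorithm that documents an extra guarantee: every call
// to finalize() returns the current state and then advances it, so a
// further call is well defined and yields a new value.
class counter_hash_sketch
{
public:
    using result_type = std::uint32_t;

    result_type finalize()
    {
        result_type r = state_;
        // Advance the state with an FNV-1a style step on a fixed byte.
        state_ = ( state_ ^ 0xFFu ) * 0x01000193u;
        return r;
    }

private:
    std::uint32_t state_ = 0x811C9DC5u;
};

// The algorithm opts in to the additional guarantee.
template<>
struct finalize_is_repeatable< counter_hash_sketch > : std::true_type {};

// Generic code that needs two finalize() calls states that need up front
// instead of relying on behavior the general requirements leave undefined.
template< typename HashAlgorithm >
auto next_two_values( HashAlgorithm& h )
{
    static_assert( finalize_is_repeatable< HashAlgorithm >::value,
        "this HashAlgorithm does not document repeated finalize()" );
    auto first = h.finalize();
    auto second = h.finalize();
    return std::make_pair( first, second );
}
```

With this shape, loosening the general requirement later stays possible, while code written today cannot silently depend on double finalization.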