From: Andrey Semashev (andrey.semashev_at_[hidden])
Date: 2024-12-09 17:39:30
On 12/9/24 19:43, Peter Dimov via Boost wrote:
>
> What's important here is that it's not possible to provide
> an extended result of better quality from the outside; the
> hash algorithm is in the best place to provide it because it
> has access to more bits of internal state than it lets out.
>
> This requirement effectively mandates that all _hash
> algorithms_ be _extendable-output hash functions_:
>
> https://en.wikipedia.org/wiki/Extendable-output_function
Only some hash functions are specified as extendable-output functions
(XOF). By "specified" I mean "in the hash algorithm's specification".
The link you posted says XOF is an extension and even lists a few
examples of functions that support it.
The fact that you can implement some hash functions such that the
implementation allows multiple finalization calls, or even interleaved
update and finalization steps, does not make that hash function an XOF.
That is just a property of your particular implementation. A useful
property, but still beyond the specification. A different implementation
may rightfully not support this property and still be compliant with the
spec.
In my opinion, HashAlgorithm must support the latter implementation that
is compliant with the hash function specification and does not support
the digest extension. If you want to expose the XOF capability then
please create a separate concept, say HashAlgorithmXOF, and add a way to
detect whether a given algorithm supports result extension.
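As a rough illustration of what such detection could look like, here is a minimal C++17 sketch. The trait name is_hash_algorithm_xof, the member name result_extend, and both toy algorithm types are hypothetical and not part of the proposed library; the only assumption is that an XOF-capable algorithm would expose some extra member that plain algorithms lack.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical detection trait: true if H has a member callable as
// result_extend(unsigned char*, std::size_t). The member name is an
// assumption made for this sketch only.
template<class H, class = void>
struct is_hash_algorithm_xof : std::false_type {};

template<class H>
struct is_hash_algorithm_xof<H, std::void_t<
    decltype(std::declval<H&>().result_extend(
        std::declval<unsigned char*>(), std::size_t()))>>
    : std::true_type {};

// Toy algorithm with a fixed-size result only.
struct plain_hash
{
    using result_type = std::uint64_t;
    void update(void const*, std::size_t) {}
    result_type result() { return 0; }
};

// Toy algorithm that additionally offers output of caller-chosen length.
struct xof_hash
{
    using result_type = std::uint64_t;
    void update(void const*, std::size_t) {}
    result_type result() { return 0; }
    void result_extend(unsigned char* out, std::size_t n)
    {
        for (std::size_t i = 0; i < n; ++i) out[i] = 0; // placeholder output
    }
};

static_assert(!is_hash_algorithm_xof<plain_hash>::value,
              "plain_hash is not detected as an XOF");
static_assert(is_hash_algorithm_xof<xof_hash>::value,
              "xof_hash is detected as an XOF");
```

Generic code could then branch on is_hash_algorithm_xof<H>::value and request extended output only from algorithms that opt in, leaving spec-compliant single-finalization implementations usable under the base concept.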
I'll add that XOF is supported by some implementations, but they are
also incompatible with the current HashAlgorithm concept. For example,
OpenSSL provides EVP_DigestFinalXOF, but it must also be called only
once. The difference from EVP_DigestFinal_ex is that EVP_DigestFinalXOF
accepts the size of the buffer to fill with the digest. If you are going
to define HashAlgorithmXOF, please take existing implementations of this
feature into account.
> Note that this is not the only innovation that the proposed
> hash algorithm concept involves. All hash algorithms are
> required to support seeding from uint64_t and from an
> arbitrary sequence of bytes, which makes them effectively
> _keyed hash functions_ (or _message authentication codes_).
>
> Also note that the requirement that one can interleave calls
> to `update` and `result` arbitrarily makes it possible to
> implement byte sequence seeding (for algorithms that don't
> already support it) in the following manner:
>
> Hash::Hash( unsigned char const* p, size_t n ): Hash()
> {
>     if( n != 0 )
>     {
>         update( p, n );
>         result();
>     }
> }
>
> Subsequent `update` calls now start from an initial internal
> state that has incorporated the contents of [p, p+n), and that
> has been "finalized" (scrambled thoroughly) such that the
> result is not equivalent to just prepending the seed to the
> message (as would have happened if the result() call has been
> omitted.)
The exact behavior of the hash algorithm's constructor is an
implementation detail. It doesn't need to be specified in terms of the
public update and result methods. And certainly, the fact that one hash
algorithm supports this sort of operation ordering doesn't mean that all
of them should.
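To make the point concrete, here is a minimal sketch of an algorithm whose seeding constructor works directly on private state and never calls update or result. The class toy_hash and the helper hash_bytes are hypothetical, and the FNV-1a-style mixing is a toy chosen for brevity, not a recommended or proposed algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy FNV-1a-style hash, for illustration only.
class toy_hash
{
    std::uint64_t state_ = 0xcbf29ce484222325ull; // FNV-1a 64-bit offset basis

    static std::uint64_t mix(std::uint64_t h, unsigned char b)
    {
        return (h ^ b) * 0x100000001b3ull; // FNV-1a 64-bit prime
    }

public:
    using result_type = std::uint64_t;

    toy_hash() = default;

    // Seeding is done on the private state; the public update() and
    // result() members are not involved.
    toy_hash(unsigned char const* p, std::size_t n)
    {
        assert(p != nullptr || n == 0);
        if (n == 0) return; // empty seed: same state as the default constructor

        for (std::size_t i = 0; i < n; ++i) state_ = mix(state_, p[i]);

        // Private scrambling step, so that seeding is not the same as
        // simply prepending the seed bytes to the message.
        state_ ^= state_ >> 32;
        state_ *= 0x9e3779b97f4a7c15ull;
    }

    void update(void const* data, std::size_t n)
    {
        auto p = static_cast<unsigned char const*>(data);
        for (std::size_t i = 0; i < n; ++i) state_ = mix(state_, p[i]);
    }

    // Finalization; callers are expected to call it once per message.
    result_type result() const { return state_; }
};

// Hypothetical helper: hash one message, starting from the given state.
inline toy_hash::result_type hash_bytes(toy_hash h, char const* s, std::size_t n)
{
    h.update(s, n);
    return h.result();
}
```

Nothing in this sketch requires that result may be followed by further update calls, which is the kind of implementation the base concept should still admit.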