From: Matt Borland (matt_at_[hidden])
Date: 2024-12-09 17:05:48


> Vinnie Falco asked me the following on Slack:
>

> > I would ask, what is the motivating use-case for calling result
> > twice? This is not explained in the docs and no examples are
> > given. In fact, the one example given says "not to do this"
>

> Calling result() twice (or more times) provides result extension:
> the ability to extract a variable number of bits from a hash
> algorithm, instead of a fixed-size value (e.g. 64 bits).
>

> This is in fact stated in the docs here
>

> https://pdimov.github.io/hash2/doc/html/hash2.html#hashing_bytes_result
>

> > Note that result is non-const, because it changes the internal
> > state. It’s allowed for result to be called more than once;
> > subsequent calls perform the state finalization again and as a
> > result produce a pseudorandom sequence of result_type values.
> > This can be used to effectively extend the output of the hash
> > function. For example, a 256 bit result can be obtained from a
> > hash algorithm whose result_type is 64 bit, by calling result four
> > times.
>

> and there is an example of doing that here
>

> https://pdimov.github.io/hash2/doc/html/hash2.html#example_result_extension
>

> All hash algorithms are required to support result extension,
> because (in my opinion) this is extremely useful functionality
> that is easy - even trivial - to provide, but is often withheld
> either by accident or in some cases, even deliberately.
>

> Hash algorithms typically have a "finalization" phase that
> pads the message, mixes the length, scrambles the internal
> state in a more thorough manner than in `update`, and then
> derives a hash value from that state. (The hash value is often
> shorter than the total amount of state.)
>

> If this "finalization" phase is performed more than once, one
> naturally gets the mandated `result()` behavior.
>
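
To make that concrete, here is a minimal sketch of the idea. It is my own toy illustration, not code from Hash2: `toy_hash`, `extended_result`, and `digest_256` are hypothetical names, and the mixing steps (FNV-1a style absorption, a splitmix64-style finalizer) were picked only because they are short. The point is just that a non-const `result()` which re-runs finalization gives you as many `result_type` words as you ask for.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Toy algorithm for illustration only. NOT a Hash2 algorithm and not
// suitable for real use; it only shows the shape of result extension.
class toy_hash
{
    std::uint64_t state_ = 0xcbf29ce484222325ull; // FNV-1a offset basis
    std::uint64_t length_ = 0;                    // bytes absorbed so far

public:
    using result_type = std::uint64_t;

    // Absorb bytes with a cheap FNV-1a style step.
    void update(void const* p, std::size_t n)
    {
        auto const* b = static_cast<unsigned char const*>(p);
        for (std::size_t i = 0; i < n; ++i)
        {
            state_ ^= b[i];
            state_ *= 0x100000001b3ull; // FNV-1a prime
        }
        length_ += n;
    }

    // Non-const: every call runs "finalization" again. It mixes in the
    // length, advances the internal state, and derives an output word
    // from the new state, so repeated calls yield a sequence of values.
    result_type result()
    {
        state_ += 0x9e3779b97f4a7c15ull + length_;

        std::uint64_t z = state_; // splitmix64-style output scrambling
        z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ull;
        z = (z ^ (z >> 27)) * 0x94d049bb133111ebull;
        return z ^ (z >> 31);
    }
};

// Hypothetical helper: pull N words out of any algorithm whose result()
// may be called more than once.
template<std::size_t N, class Hash>
std::array<typename Hash::result_type, N> extended_result(Hash& h)
{
    std::array<typename Hash::result_type, N> r{};
    for (auto& w : r) w = h.result();
    return r;
}

// 256 bits from a 64-bit result_type, by calling result() four times.
inline std::array<std::uint64_t, 4> digest_256(std::string const& msg)
{
    toy_hash h;
    h.update(msg.data(), msg.size());
    return extended_result<4>(h);
}
```

Note that the helper never sees the internal state; it only calls `result()` repeatedly, which is why the quality of the extra words is entirely up to the algorithm.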

> Falco continues:
>

> > I pointed out in the post I already made that the quality of
> > digest from calling result twice is dependent on the hash
> > algorithm, and there is no way the library can provide
> > assurances on the quality
>

> That's of course correct, but it also applies to the quality of
> calling `result()` only once; it's naturally dependent on the
> implementation of the hash algorithm.
>

> What's important here is that it's not possible to provide
> an extended result of better quality from the outside; the
> hash algorithm is in the best place to provide it because it
> has access to more bits of internal state than it lets out.
>

> This requirement effectively mandates that all hash
> algorithms be extendable-output hash functions:
>

> https://en.wikipedia.org/wiki/Extendable-output_function
>

For those unfamiliar with extendable-output functions (XOFs), FIPS 202 [1] and the reference implementation [2] provide good detail, since the wiki article seems a bit short.

Matt

[1] https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.202.pdf
[2] https://github.com/XKCP/XKCP/blob/master/usage-example.md




