Boost logo

Boost :

From: Alexander Grund (alexander.grund_at_[hidden])
Date: 2024-12-11 16:03:13


My review (a bit longer than intended)
> - What is your evaluation of the design?
The library is based on the paper which goes into great lengths
discussing the design.
To me this sounds well reasoned so the library design is solid.
> - What is your evaluation of the implementation?
The whole code base looks very clean and well structured.
The implementation of the provided hash functions follows their
specification making them easy to verify.
Reasonable use of trait-types and C++11/14 features and SFINAE-friendly
methods make it look solid to me.

The only point of criticism I have is the type used for "byte-ranges" in
the HashAlgorithm constructor and update method(s).
In the (seed-)ctor the bytes are of type `unsigned char const` while in
`update` the are of type `void const`.
If both methods are supposed to receive a byte sequence I'd expect them
to use the same type and am confused by the difference.
Given that the constexpr-enable concept uses `unsigned char const` I'd
choose that.
Use of a span-like type has been extensively discussed and I'll just
mention my personal key takeaway:
In the context of static analysis through compiler warnings I'd like to
see support for both `update(ptr, size)` & `update(byte-span)`
in the concept to give HashAlgorithm implementers the freedom to not
provide a signature that triggers compiler warnings.
Support for a span-like type at least constructible from a `unsigned
char const*` and size to be passed to the algorithms update function
seems reasonable to me.
However I do see the advantage of the more flexible approach currently
implemented. Especially due to the multitude of span classes available.
So to me this is optional and at most a QoI improvement.

> - What is your evaluation of the documentation?
Well written and structured.
However I find it odd, that it seems to emphasize the provided hashing
algorithms and still calls itself "An Extensible Hashing Framework"
For me it also dives too soon too deep into details. I'd like to have
the distinction more clear between:
a) hashing (raw) byte sequences by using a HashAlgorithm
b) hashing any C++ types/objects by using the framework

As a (potential) user I'd really like to be told first how to use this
framework with something existing (already implemented hash algorithm
class) than
having an in-depth explanation of the (class-)concepts and of a
multitude of provided algorithms, a simple list with links to details at
this point (the introduction) might suffice.

To me the "major use case" the library addresses is not "hashing an
untyped sequence of bytes".
If I understood it correctly the algorithms provided are "just" examples
of algorithms the framework can be used with.
There are already implementations of such algorithms and some are
seemingly much faster (compare the OpenSSL benchmarks posted on the list).
The novelty to me here is a framework to allow the separation of types
and hash algorithms, i.e. enable types to be hashed without knowing by
which algorithm.
So maybe that should be more prominent in the docs.

I also find the description of the `result()` function questionable:
> Note that result is non-const, because it changes the internal state.
[...] subsequent calls [...] produce a pseudorandom sequence

Having this in the description of the HashAlgorithm concept seems to
exclude any algorithm that doesn't has a meaningful "finalize" operation.
Or those that "finalize" e.g. by padding and hence subsequent "finalize"
calls are no-ops yielding the same result.

If it isn't an intentional requirement for all algorithms to be used by
this, I'd suggest to change the wording to e.g. "may change internal
state." and "Some might producea pseudorandom sequence" or similar.

> - What is your evaluation of the potential usefulness
> of the library? Do you already use it in industry?
Very useful, especially as it is an implementation of a paper for a
standards proposal which I think is very much in line of what Boost
is/was/should be.
I don't use it yet.

In the context of cryptographic security mentioned by others I'd like to see
a) an example how to wrap an existing, 3rd-party hash implementation
(which might only have a C interface) into the HashAlgorithm Concept
b) at least a statement on the security of the provided cryptographic
hash algorithms in the documentation.

> - Did you try to use the library? With which compiler(s)? Did
> you have any problems?
Tried examples using GCC 13, no issues.
> - How much effort did you put into your evaluation?
> A glance? A quick reading? In-depth study?
Something short of an in-depth study
> - Are you knowledgeable about the problem domain?
Somewhat, yes.
> Ensure to explicitly include with your review: ACCEPT, REJECT, or CONDITIONAL ACCEPT (with acceptance conditions).
CONDITIONAL ACCEPT
I'd like to suggest to reconsider the concept of the `result` function
(like the split into `result() const` and `finalize()` mentioned on the
list) or at least the wording,
and the types for "byte-ranges" mentioned above unless changing that
type severely affects the functionality (negatively).
Having the concept defined on inclusion might make it hard to change
later on when people rely on it so I'd rather like to have this "right"
from the start, if in doubt be more permissive.
This "condition" is intentionally very weak as I do like to see this
library include, and can be met by simply stating that the values
returned by multiple invocations of `result` might be the same.




Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk