Boost logo

Boost :

From: Claudio DeSouza (cdesouza_at_[hidden])
Date: 2024-12-04 01:09:21


Hello.

My review is the summary of a few discussions I've had with Peter and
Christian regarding the API design being employed for hash2. I'm aware
hash2 is inspired by one of Hinnat's proposals to standardisation that
didn't come to fruition. I would like first to commend the authors for
their work on this project. Having good libraries for this type of task is
important, especially when the big names like OpenSSL provide not-so-great
interfaces. Having said that, I have a few thoughts on what this library
may be missing.

*Spanification*
There's a significant change in mood with regards to safe library design
taking place in the industry. With clang, we now
have -Wunsafe-buffer-usage, and this warning has been continuously improved
recently. For reference https://clang.llvm.org/docs/SafeBuffers.html
When I was going through this library's code I was left with a frustrating
thought that boost should be at the forefront of this type of innovation:
safe, and intuitive library design. Each function that takes a pair of
void*/size_t should be a span<uint_8>. The model of how to approach these
areas is a solved issue at this point.
A robust span type with intuitive extensions would have matching
constructors for all sorts of contiguous ranges, including native arrays,
which would make using this library much simpler. It would also afford the
necessary compile-time validations that you get with static extents in a
span. At the current state, hash2 relies on a lot of pinky-promise
guarantees that we tend to live with in C++, but that in fact can be
avoided without much (or any) overhead.

*Digest type*
Due to limitations to how much of std::array is constexpr in different C++
versions, this library offers its own digest type. However, this just wraps
a stack array. I feel that rather than returning a digest that is stack
allocated, it would have been better for algorithms to take a fixed size
span where the digest can be written to. I say this because in many cases a
stack allocated array will be fine. However a mutable fixed span would
provide more flexibility about the storage, for instance, a vector. There's
also the matter of hardening (https://libcxx.llvm.org/Hardening.html) which
one loses by using this particular digest type, rather than storage
provided by a standard type. But I would also like to mention that sha3 has
a variable digest size, so again, another reason why an out param mutable
span would be a more flexible choice on the whole.

*Boost Identity*
I have been reluctant to write a review because it seems unfair to demand
of hash2 to provide robust spanified interfaces, when no other library
before was required to do so. However, at the same time, considering that
this type of code tends to be used in safety-sensitive contexts, I fear
this library may fail to achieve its potential, when a lot of its
interfaces are unsafe. if I had to bring this library to any project I'm
working on, I would have to wrap it into safe spanified interfaces, to hide
away all these pointer/length pairs around, before I could use it in my
code. However, I understand that to produce a spanified design, boost would
have first to introduce a mature span implementation + extensions that can
be leveraged for this type of code design.

*Security*
I think for things like HMAC, there is a question of allocators that
zero-out memory, of secure obfuscated memory content in case of crash
dumps, etc, that this library has not been attentive to, so I thought it
was worth mentioning.

*In conclusion*
I'm here to support the authors. I don't feel like voting either accept or
reject, because I think Boost as a whole has not provided the mission
critical answer to that. I want this library to succeed. I just wish it was
a bit more ambitious in its goal on the library-design space.

Claudio.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk