Boost logo

Boost :

From: Peter Dimov (pdimov_at_[hidden])
Date: 2024-12-13 16:54:56


Antony Polukhin wrote:
> CONDITIONAL ACCEPT. The behavior of the default constructor must be
> secure.

Hi Antony, thanks for the review.

I of course agree that HashDOS is a problem. The library documentation
points this out in several places, that's why SipHash is included and
almost no other non-crypto hashes are, and its use is recommended.

However, I don't agree that the default constructor of the hash algorithm
_concept_ is the correct place to attempt to make things secure (or in
practice, less insecure than before.)

First, this would place the responsibility on hash algorithm authors, which
doesn't seem correct, or reliable.

Second, hash algorithms aren't only used in hash tables, and injecting
nondeterminism in their default constructors seems user-hostile. The hash
algorithm adaptor, once it's supplied by the library, would be a better place
to do that, if we decide to do it.

Third, using a single process-wide seed is not good practice (as you yourself
observe.) It makes things more secure compared to unseeded use, but not
really secure. The correct approach is to use a random seed (preferably of
size 192 bits or more) that varies per connection, or per container.

One might in principle make the case that a default constructor shouldn't
be included, but I think that this, combined with the other requirements
that are already causing pushback, is way too much innovation for 2025.

> `tag_invoke` seems to be an overkill.

Consider the second example in

https://pdimov.github.io/hash2/doc/html/hash2.html#hashing_objects_user_defined_types

class Str
{
private:

    static constexpr std::size_t N = 32;

    std::uint8_t size_ = 0;
    char data_[ N ] = {};

// ...
};

and let N be something bigger, say 256.

Using your proposed hash_visit approach, there's no way to include
the range [data_, data_ + size_) efficiently in the message; you'd have
to use a per-character loop. This isn't good for performance.

To fix this, you'd need to extend the visitor interface so that it allows
the equivalent of hash_append_range, and then you'll arrive on
something isomorphic to the current approach, e.g.

visitor( hash_append_range_tag{}, data_, data_ + size_ );

In addition, this has the usual drawback of reserving the name
"hash_visit" in all namespaces, which is what the tag_invoke pattern
is meant to solve.

...

> With some math knowledge the attack becomes even more simple see
> https://orlp.net/blog/breaking-hash-functions/

All non-crypto hash functions except SipHash should be assumed broken.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk