Boost logo

Boost :

From: Peter Dimov (pdimov_at_[hidden])
Date: 2024-12-08 19:43:47


Ivan Matek wrote:
> Hi everyone,
>
> Implementation uses uint64_t in which it sums into hashes of particular
> elements(each computed with "fresh" hasher so ordering does not matter),
> then it finally hashes that sum and range size into original passed hasher.
>
> I see no problem with logic this since it cleverly exploits the fact that uint64_t
> overflow is defined and that sum is commutative.
>
> But I wonder if this preserves the cryptographic quality of original hash.
> For example if we use sha2 it seems to me that finding collisions becomes
> easier since sum is "only" 64 bits, and length is "only" 64 bits(or 32 on some
> systems), and may be even less if attacker knows unordered range was not
> super long, e.g. number of items was less than 10000.

Ideally I'd use 128 or 256 bit integer there, but using __uint128_t portably is
still a pain, and hashing unordered containers is a very niche use case. You'd
probably never encounter it in practice... well, except for boost::json::object,
I suppose.

Note that if you have things in the message preceding the unordered range,
this effectively serves as a seed for the per-element hash function, which
makes engineering collisions harder.

Still, 64 bits are 64 bits. 128 would have been a good compromise between
speed and security, but std::uint128_t is not here yet.

https://github.com/cplusplus/papers/issues/1793

(Please don't leave comments on the above issue, it's not for discussion,
but for tracking paper progress.)

Matt Borland is considering proposing boost::uint128_t to Core, but that
would mean depending on Core.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk