From: Peter Dimov (pdimov_at_[hidden])
Date: 2024-12-03 00:54:09
Ivan Matek wrote:
> 2) From what I see there is no way for algorithm to say it only works on
> aligned data, e.g. to avoid runtime check of data alignment because SIMD
> instructions?
> E.g
>
> void update_aligned<32>( void const* data, std::size_t n );
> Would this be too much complication for too little performance gain (not to
> mention UB risk)? Not an expert, just know performance was motivation for
> std::assume_aligned <https://open-std.org/JTC1/SC22/WG21/docs/papers/2018/p1007r1.pdf>
Since most hash functions process the input in blocks of some fixed size, the
structure of the `update` function is generally

    if( we have an incomplete block from last update )
        complete block from the input and process it

    process however many complete blocks we have

    store the remainder for the next call to update

Even if the input is aligned at the call to update, there's no guarantee it will
remain aligned after the first step, so nothing would be gained. The function
will still have to check for alignment at the beginning of step 2 in either case.
Not that we have any SIMD optimizations at this point, but even if we had,
it still wouldn't be worth complicating the interface for that.
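
To make the alignment argument concrete, here is a minimal sketch of an `update` with the three-step structure described above. Everything in it is an assumption made for illustration: `toy_block_hash`, its 64-byte block size, the FNV-style mixing in `process_block`, and the `unaligned_blocks()` counter are hypothetical and are not the library's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical toy hash, for illustrating the control flow of update() only.
class toy_block_hash
{
public:

    static constexpr std::size_t block_size = 64;

    void update( void const* data, std::size_t n )
    {
        if( n == 0 ) return;

        unsigned char const* p = static_cast<unsigned char const*>( data );

        // Step 1: complete the partial block left over from the last call.
        if( m_ > 0 )
        {
            assert( m_ < block_size );

            std::size_t k = block_size - m_;
            if( k > n ) k = n;

            std::memcpy( buffer_ + m_, p, k );
            m_ += k; p += k; n -= k;

            if( m_ < block_size ) return; // block still incomplete

            process_block( buffer_ );
            m_ = 0;
        }

        // Step 2: process complete blocks straight from the input.
        // p may have advanced by k bytes in step 1, so the alignment of
        // `data` says nothing about the alignment of p here.
        while( n >= block_size )
        {
            if( reinterpret_cast<std::uintptr_t>( p ) % 32 != 0 )
            {
                ++unaligned_blocks_;
            }

            process_block( p );
            p += block_size; n -= block_size;
        }

        // Step 3: store the remainder for the next call.
        if( n > 0 ) std::memcpy( buffer_, p, n );
        m_ = n;
    }

    std::uint64_t state() const { return state_; }
    std::size_t blocks() const { return blocks_; }
    std::size_t unaligned_blocks() const { return unaligned_blocks_; }

private:

    // Placeholder mixing step (FNV-1a style), not a real hash round.
    void process_block( unsigned char const* block )
    {
        for( std::size_t i = 0; i < block_size; ++i )
        {
            state_ = ( state_ ^ block[ i ] ) * 0x100000001B3ull;
        }

        ++blocks_;
    }

    unsigned char buffer_[ block_size ] = {};
    std::size_t m_ = 0; // number of bytes currently held in buffer_
    std::uint64_t state_ = 0xCBF29CE484222325ull;
    std::size_t blocks_ = 0;
    std::size_t unaligned_blocks_ = 0;
};
```

With a 32-byte-aligned buffer `input`, calling `update( input, 10 )` and then `update( input, 150 )` has step 1 of the second call consume 54 bytes, so step 2 starts at `input + 54` even though the pointer passed in was aligned. An aligned-only overload would therefore still need the check in step 2.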
> 4) Paper mentioned is quite old, but if you remember... do you have some
> feedback paper received at the time?
I didn't follow the committee at that time too closely, so that's more of a
question for Howard Hinnant (if he still reads here) or Vinnie Falco.
My informed guess is that when N3980 was submitted, there was a
competing Google proposal
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3333.html
and, since the committee doesn't like to pick between competing proposals,
its usual strategy is to tell authors to work it out between themselves.
I can see that there's a later Google proposal
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0029r0.html
that is supposedly a unification, but it doesn't seem to have moved forward
either, and I don't know why.
Either way, N3980 is what I liked best, so that's what I based the library on.