Boost :

Date view	Thread view	Subject view	Author view

From: Ivan Matek (libbooze_at_[hidden])
Date: 2024-12-05 21:18:16

Next message: Andrey Semashev: "Re: Formal review request for Hash2"
Previous message: Vinnie Falco: "Re: [hash2][review] Early review (due to holidays)"
In reply to: Peter Dimov: "Re: Formal review request for Hash2"
Next in thread: Andrey Semashev: "Re: Formal review request for Hash2"
Reply: Andrey Semashev: "Re: Formal review request for Hash2"

On Wed, Dec 4, 2024 at 9:07â€¯PM Peter Dimov <pdimov_at_[hidden]> wrote:

>
> It is in theory, but in practice you'd see much bigger gains if you make
> sure
> your class has no padding and mark it as trivially equality comparable.
>
> vector<MyStruct>, for example, will then use a single update call for the
> entire
> element array.
>
>
> Yes I understand that, and to be clear I think it is amazing optimization,
not to mention that if we get reflection in C++26 you will be able to do it
even more aggressively. But until reflection in some cases that would mean
populating a dummy fields with zeros or any other value to make struct
nopad(as uninitialized data being hashed is UB). Sure it is possible, but
requires extra work from user.

Speaking of performance I spent some time benchmarking a performance
"problem" with appending .size(). I say "problem" since I barely managed to
construct a benchmark where it shows, but it is not general case.
I will first provide long introduction and description of what I did, if
you are in a hurry you can jump to tl;dr recap bellow.

*What is 7 bytes among friends?*

Discussion earlier made it clear that in general case we must append size
after we hash the content on non fixed size range.
Reason for that is that as Peter explained
"ab" | "c" can not produce same hash as "a" | "bc". So we do something like
"ab" | 2 | "c" | 1 and "a" | 1 | "bc" | 2
Pipe here is separator between passing bytes to hash update.
Another reason is empty ranges, e.g. empty string without passed size would
not push any bytes to hash.
So appending size is good in generic case.
But this is not required in specific cases. Library currently very cleverly
uses that certain ranges are fixed in size, e.g. std::array ( when size() >
0, as mentioned before there is special handling of empty range).
You can see example here <https://godbolt.org/z/b6v9b6e19>(need to scroll
down to main), but long story short is that if range is fixed size size is
not encoded as it is not necessary.

But another common case, even explicitly mentioned in documentation
examples is when we are just hashing one value to obtain value for hash of
some unordered container.
Here issue with "ab", "c" hashing same as "a", "bc" does not exist since we
will only ever be hashing one entire string at the time and immediately
using that result, then dropping the hasher.
In other words we can hash just "ab" instead of "ab", 2. We need to be
careful to always hash something(e.g. for empty string we still must call
append with something), but there is no need to hash .size()
I have put together a small benchmark that was designed to to find the
worst case for appending size versus not doing it, and I managed to get
performance difference.
Now situation is a bit complex since clang has violent performance swings
depending on if he decides to inline something or not, but I managed to get
a small but probably real difference resulting from "saving" of 7 bytes.

To spam with large tables I will just recap results:
I have used only fnv1a(results are very similar for 32 and 64).
Despite my expectations(that static_string will benefit more than
std::string from not hashing size since it is simpler data structure) not
hashing size actually made hashing of std::string almost 2x faster.
Now this result is obviously just because clang decided to stop inlining
and is not fully realistic, but in general it is true that the bigger your
function is it is less likely that it will be inlined. I am not saying this
will happen often but it can happen.
For static_string speed benefits were small but noticeable.
e.g.
for strings of this lengths: min/max len: 5/14
12459ns
vs
13284ns

Results could obviously be made more dramatic by making strings shorter
min/max len: 3/8
6055ns
6937ns

or less dramatic by making strings longer
min/max len: 11/20
21622ns
22571ns

As expected even though measuring exact ns is impossible it is clear that
the lower in average string len we go as expected performance difference is
larger since we have a fixed difference of 7 bytes(on my machine the
difference in times also seems approximately fixed for small test set).
I did not benchmark scenario where hash is used in for example flat_hash_map
.
I did not paste std::string results since as mentioned they are mostly
influenced by clang inlining decisions, but you get the idea: there seems
to be small but real performance difference when hashing size. I checked
what some other popular C++ non std library does for std::string_view when
it is implementing replacement for std::hash and it also hashes size, so I
presume it is not unreasonable that boost does same, but I wonder if we
could do something faster.
There is also theoretically an issue that goes beyond hashing of single
value, e.g. if you have std::pair<int, std::string_view> or std::pair<int,
MyStruct> here you could also skip hashing string_view size /use optimized
hash of MyStruct(based on some type trait), but that case seems much less
frequent and harder to specify due to lack of reflection.

So to just stick to theoretical helper boost::hash2::hash_key:

If my results were not just some silly benchmark mistake would there be a
benefit of having boost::hash2::key_hash or some helper like that that does
not hash size, and immediately returns size_t value(so user has no way of
messing up by doing another call to hash_append)
Here is noncompiling godbolt <https://godbolt.org/z/f6ETzczz7> link. I do
not expect people to go over code and debug why clang decided to not inline
something or why some simple tricks(call with fixed n of 8 causes unrolling
on my setup) on my machine produce better performance, it is just here for
people who may suspect that I made some obvious mistake and want to double
check.

tl;dr
hashing size of non const size range is good general approach, but I wonder
if for best performance there needs to be special case hasher useful for
unordered containers. It will know for some types such as std::string,
std::string_view, std::pair<int, std::string_view>, std::vector<int>,... to
not hash size, while for some it will have to hash sizes, e.g. std::pair<
std::string_view, std::vector<int>>.

Next message: Andrey Semashev: "Re: Formal review request for Hash2"
Previous message: Vinnie Falco: "Re: [hash2][review] Early review (due to holidays)"
In reply to: Peter Dimov: "Re: Formal review request for Hash2"
Next in thread: Andrey Semashev: "Re: Formal review request for Hash2"
Reply: Andrey Semashev: "Re: Formal review request for Hash2"

Date view	Thread view	Subject view	Author view

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk