Re: [boost] [bloom] Benchmarks with Knuth multiplier-based hash production

9 Jun 2025


On 08/06/2025 at 17:13, Ivan Matek wrote:
...
On Sat, Jun 7, 2025 at 8:05 PM Joaquin M López Muñoz via Boost 
<boost@lists.boost.org> wrote:
Anyway, why don't you run it locally and play with the #pragmas?
Because when I quickly go to benchmark something 9 hours later I am 
just quickly benchmarking something :)
Also, ensuring reproducibility is a pain: I do not have an unused 
machine that I can SSH into to keep my browser or random 
background processes from interfering with the benchmark, especially 
considering that bloom uses the L3 cache a lot.
Hey, thanks so much for running the benchmarks! Yes, variance hurts
analysis. I'm planning to move my GHA-based benchmarks to dedicated
machines so that results are more stable.
...
Besides, I'm interested in results outside my local machine and GHA.
    You just have to compile this in release mode (note the repo branch):
https://github.com/joaquintides/bloom/blob/feature/alternative-hash-producti...
 Well, it was more complicated: I already have modular Boost on my 
machine, so I had to do some hacks to get CMakeLists.txt to work; the 
benchmark also did not have a CMakeLists.txt, and I used 
-march=native -mtune=native instead of what your scripts do...
But to quickly recap:
1. There seems to be no unrolling happening without me doing it with
    pragmas.
 2. I have increased constants to reduce chance of noise affecting
    results:
    -  static const int              num_trials=10;
    -  static const milliseconds     min_time_per_trial(10);
    +  static const int              num_trials=20;
    +  static const milliseconds     min_time_per_trial(50);
 3. I did this to make tables more aligned:
    -    "<table>\n"
    +    "<table style=\"font-family: monospace\">\n"
 4. In terms of benchmark setup, I would add 5% of "opposite"
    lookups (e.g. a few successes mixed into the failures), since I
    presume the current setup does not penalize branchy code as
    realistic scenarios would (although it is possible that real code
    might also have close to 100% of successes or failures). Just to
    be clear: I did not make this change.
 5. I would suggest considering switching the benchmark repo to use
    -march=native instead of -mavx2
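
The mixed-lookup idea in point 4 could be sketched roughly as follows. This is only an illustration and not part of the benchmark: the helper name make_mixed_lookups, its parameters and the use of std::uint64_t keys are all assumptions made for the example. It builds a query stream in which about a given fraction of the lookups come from the "opposite" population, so the branch predictor cannot rely on every lookup having the same outcome.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper (not in the benchmark): returns n keys where roughly
// `opposite_fraction` of them are drawn from `opposite` and the rest from
// `primary`. Both input vectors must be non-empty.
std::vector<std::uint64_t> make_mixed_lookups(
    const std::vector<std::uint64_t>& primary,   // e.g. keys that were inserted
    const std::vector<std::uint64_t>& opposite,  // e.g. keys that were never inserted
    std::size_t n, double opposite_fraction, std::uint64_t seed)
{
  std::mt19937_64 rng(seed);                     // fixed seed keeps runs comparable
  std::bernoulli_distribution pick_opposite(opposite_fraction);

  std::vector<std::uint64_t> out;
  out.reserve(n);
  std::size_t i = 0, j = 0;
  for (std::size_t k = 0; k < n; ++k) {
    if (pick_opposite(rng)) out.push_back(opposite[j++ % opposite.size()]);
    else                    out.push_back(primary[i++ % primary.size()]);
  }
  return out;
}
```

With opposite_fraction set to 0.05, a "successful lookup" run would see about 5% misses at unpredictable positions, and vice versa for the "unsuccessful lookup" run.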
So, unrolling does not happen, this is out of the way, thanks for 
investigating.
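
For readers who want to repeat the experiment, forcing unrolling with pragmas might look something like the sketch below. The thread does not say which pragma was actually used, so the choice here is an assumption, and all_bits_set is a hypothetical stand-in for a k-probe membership check, not the library's code. The multiplier is just a simple way to make each probe hit a different bit.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical K-probe check over a bit array of num_words 64-bit words.
// The pragmas only request unrolling; the compiler may still decline.
template <std::size_t K>
bool all_bits_set(const std::uint64_t* words, std::size_t num_words,
                  std::uint64_t hash)
{
  bool result = true;
#if defined(__clang__)
#pragma unroll
#elif defined(__GNUC__)
#pragma GCC unroll 8
#endif
  for (std::size_t i = 0; i < K; ++i) {
    hash *= 0x9E3779B97F4A7C15ull;  // illustrative re-mix between probes
    const std::size_t bit =
        static_cast<std::size_t>(hash >> 32) % (num_words * 64);
    // Accumulate without an early exit so the loop body stays branch-free.
    result &= ((words[bit / 64] >> (bit % 64)) & 1u) != 0;
  }
  return result;
}
```

Comparing the generated assembly with and without the pragma lines (for example with -S) is the most direct way to confirm whether unrolling actually took place.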
I'll use -march=native as you suggest. As for the difference between the
original hash production scheme and the one proposed by Kostas (cells marked
with *), the numbers are not very conclusive, but it looks like Kostas's
approach incurs a slight degradation in execution time. I hope we can see this more
clearly with the upcoming GHA benchmarks on dedicated machines.

Joaquin M Lopez Munoz