Re: [boost] [bloom] Benchmarks with Knuth multiplier-based hash production

8 Jun 2025

On Sat, Jun 7, 2025 at 8:05 PM Joaquin M López Muñoz via Boost <boost@lists.boost.org> wrote:
> ...
> Anyway, why don't you run it locally and play with the #pragmas?
Because whenever I go to quickly benchmark something, 9 hours later I am
still "just quickly benchmarking something" :)
Also, ensuring reproducibility is a pain: I do not have an unused machine
that I can SSH into, so I cannot keep my browser or random background
processes from interfering with the benchmark, especially considering that
bloom uses the L3 cache a lot.
> ...
> Besides, I'm interested in results outside my local machine and GHA.
> You just have to compile this in release mode (note the repo branch):
> https://github.com/joaquintides/bloom/blob/feature/alternative-hash-producti...
Well, it was more complicated than that: I already have modular Boost on my
machine, so I had to do some hacks to get CMakeLists.txt to work. The
benchmark also did not have a CMakeLists.txt, and I used -march=native
-mtune=native instead of the flags your scripts use...

But to quickly recap:

   1. No unrolling seems to happen unless I force it with pragmas.
   2. I have increased these constants to reduce the chance of noise
   affecting the results:
   -  static const int              num_trials=10;
   -  static const milliseconds     min_time_per_trial(10);
   +  static const int              num_trials=20;
   +  static const milliseconds     min_time_per_trial(50);
   3. I did this to make the tables better aligned:
   -    "<table>\n"
   +    "<table style=\"font-family: monospace\">\n"
   4. In terms of benchmark setup, I would add 5% of "opposite" lookups
   (e.g. successes among the failures), since I presume the current setup
   does not penalize branchy code the way realistic scenarios would
   (although it is possible that real code also has close to 100% successes
   or failures). Just to be clear: I did not make this change.
   5. I would suggest considering switching the benchmark repo to use
   -march=native instead of -mavx2.
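
To make point 4 concrete, here is a minimal sketch of how a mixed lookup
sequence could be built. It is not code from the benchmark: the function name
make_mixed_lookups, its parameters and the use of std::uint64_t keys are all
assumptions made for illustration. The idea is only that a fixed percentage of
probes is drawn from the "opposite" set and then shuffled in, so the branch
predictor cannot rely on every lookup having the same outcome.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper, not part of the benchmark in the repo.
// Returns n probes: n*opposite_pct/100 of them drawn from `minority`
// (e.g. keys that ARE in the filter during an unsuccessful-lookup test),
// the rest drawn from `majority`, in shuffled order.
std::vector<std::uint64_t> make_mixed_lookups(
  const std::vector<std::uint64_t>& majority,
  const std::vector<std::uint64_t>& minority,
  std::size_t n, unsigned opposite_pct, std::uint64_t seed)
{
  assert(!majority.empty() && !minority.empty() && opposite_pct <= 100);

  std::mt19937_64 rng(seed);  // fixed seed keeps runs comparable
  std::vector<std::uint64_t> out;
  out.reserve(n);

  const std::size_t n_opposite = n * opposite_pct / 100;
  for (std::size_t i = 0; i < n; ++i) {
    const auto& src = (i < n_opposite) ? minority : majority;
    out.push_back(src[rng() % src.size()]);
  }

  // Interleave the opposite-outcome probes unpredictably.
  std::shuffle(out.begin(), out.end(), rng);
  return out;
}
```

With opposite_pct=5 and n=1000 this yields exactly 50 probes from the minority
set; the timed loop would then iterate over the returned vector instead of a
pure success or pure failure set.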

My tests were of the form:
taskset --cpu-list 0 {binary}  {number} >> {description}.html

The CPU was an i7-13700H. Core speed was not locked and ranged between 3.2
and 3.8 GHz. It is possible that AVX code was affecting the CPU speed, but I
did not check; it could just be accumulated heat.

flags:
FLAGS = -O3 -DNDEBUG -fcolor-diagnostics -march=native -mtune=native

I have attached 2 runs so you can see the measurement noise on my machine.
I have also attached one unrolled run, just to show that unrolling can make
a difference, but as I said this does not matter much since by default clang
does not unroll.
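
For readers who have not forced unrolling before, this is roughly what it
looks like. The exact pragmas and loops used in the benchmark branch are not
shown in this message, so the following is only a sketch: the function
count_matches_unrolled is a hypothetical stand-in for a per-element probe
loop, and the unroll factor of 4 is an arbitrary choice. The pragmas
themselves are the documented Clang and GCC loop hints.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a probe loop; not the library's code.
// Counts the words that have all bits of `mask` set.
std::uint64_t count_matches_unrolled(
  const std::uint64_t* words, std::size_t n, std::uint64_t mask)
{
  std::uint64_t hits = 0;
  // Ask the compiler to unroll the loop below 4 times. Without such a
  // hint the compiler decides on its own whether to unroll.
#if defined(__clang__)
#pragma clang loop unroll_count(4)
#elif defined(__GNUC__)
#pragma GCC unroll 4
#endif
  for (std::size_t i = 0; i < n; ++i) {
    hits += (words[i] & mask) == mask;
  }
  return hits;
}
```

The hint only changes code generation, not results, so comparing a build with
and without it isolates the effect of unrolling.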

Ivan Matek