From: Joaquin M López Muñoz (joaquinlopezmunoz_at_[hidden])
Date: 2025-06-07 18:05:38


On 06/06/2025 at 22:06, Ivan Matek wrote:
>
>
> On Fri, Jun 6, 2025 at 9:02 PM Joaquin M López Muñoz via Boost
> <boost_at_[hidden]> wrote:
>
> Feedback welcome,
>
> Just a general question about benchmarks, not this one in particular.
> If you are doing something in a loop (e.g. this one
> <https://github.com/joaquintides/bloom/blob/1f0f953196c8e16e1c5fd8f62e18f4883dbcd44e/benchmark/comparison_table.cpp#L161>),
> did you check that the compiler did not unroll it in one case and not in another?
> You might say this is "natural" in the sense that you did not tell the compiler
> to unroll or not unroll the loop, but I would partially disagree, since
> in real code it is unlikely that people will always iterate over an
> array of contiguous data; in real code they might get some data, parse
> it, then do one lookup, get more data, parse it, do one lookup...
> I am not saying this is likely to happen, but it may be worth checking,
> because a long time ago I was playing around with benchmarking some hash
> map code, and "weird" results that did not make sense were caused by the
> compiler unrolling the loop in one case and not in another. After
> I used #pragma
> <https://releases.llvm.org/4.0.0/tools/clang/docs/AttributeReference.html#pragma-unroll-pragma-nounroll>
> to disable loop unrolling, the results made sense.

I haven't examined the codegen so I can't tell for sure. My intuition is
that loop unrolling won't take place because the body of the loop is
not trivial and unrolling would add a lot of pressure to the instruction
cache and take up too many registers for effective pipelining.
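
For anyone who wants to test this without reading the assembly, one option is
to pin the unrolling decision with a pragma and compare timings against the
default build. The following is only a minimal sketch, assuming Clang
(#pragma nounroll) or GCC 8 or later (#pragma GCC unroll 1); lookup,
count_hits_default and count_hits_no_unroll are hypothetical names made up
for illustration and are not taken from comparison_table.cpp:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a filter lookup (NOT Boost.Bloom code):
// an arbitrary integer mixer whose lowest bit is used as the "hit" result.
inline bool lookup(std::uint64_t x)
{
  x ^= x >> 33;
  x *= 0xff51afd7ed558ccdULL;
  x ^= x >> 33;
  return (x & 1) != 0;
}

// Baseline: the optimizer is free to unroll (or not) as it sees fit.
std::size_t count_hits_default(const std::vector<std::uint64_t>& keys)
{
  std::size_t res = 0;
  for (std::size_t i = 0; i < keys.size(); ++i) res += lookup(keys[i]);
  return res;
}

// Same loop, with unrolling explicitly disabled where a pragma is available.
// On other compilers no pragma is emitted and this equals the baseline.
std::size_t count_hits_no_unroll(const std::vector<std::uint64_t>& keys)
{
  std::size_t res = 0;
#if defined(__clang__)
#pragma nounroll
#elif defined(__GNUC__)
#pragma GCC unroll 1
#endif
  for (std::size_t i = 0; i < keys.size(); ++i) res += lookup(keys[i]);
  return res;
}
```

Both functions return the same count for the same input, so any timing
difference between them comes from code generation only. If the two variants
time the same, unrolling is probably not a factor for that loop.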

Anyway, why don't you run it locally and play with the #pragmas?
Besides, I'm interested in results outside my local machine and GHA.
You just have to compile this in release mode (note the repo branch):

https://github.com/joaquintides/bloom/blob/feature/alternative-hash-production/benchmark/comparison_table.cpp

and execute with

./a.out 1000000

(for 1M elements; you can try 10M elements or any other value as well).
Thank you!

Joaquin M Lopez Munoz

