From: Joaquin M López Muñoz (joaquinlopezmunoz_at_[hidden])
Date: 2025-05-21 07:20:06
On 20/05/2025 at 21:07, Ivan Matek wrote:
>
> [...]
>
> One more question:
> I have some handcrafted tests (where the bloom filter is so small it
> fits in L1/L2 cache, and the hit rate of lookups is 0%, aside from
> false positives), and the SIMD one is a bit slower than the non-SIMD
> one for certain values of K.
> constexpr size_t num_inserted = 10'000;
> constexpr double fpr = 1e-5;
> constexpr size_t K = 5;
> using vanilla_filter = boost::bloom::filter<uint64_t, 1, boost::bloom::multiblock<uint64_t, K>, 1>;
> using simd_filter = boost::bloom::filter<uint64_t, 1, boost::bloom::fast_multiblock64<K>, 1>;
> I presume that is expected since it is hard to make sure SIMD is
> always faster, but just wanted to double check with you that this is
> not an unexpected result.
> So to recap my question: If bloom filter fits in L1 or L2 cache is it
> best practice to check if SIMD or normal version is faster instead of
> assuming SIMD always wins?
Benchmarks at
https://github.com/joaquintides/boost_bloom_benchmarks
show that the advantage of fast_multiblock64<K> with respect to
multiblock<uint64_t, K> is small for some compilers (Clang, VS)
and low values of K, and occasionally multiblock wins (though
these measurements come with a fair degree of noise). So, yes, I'd
profile to make sure. In the case of fast_multiblock32<K> vs.
multiblock<uint64_t, K>, the advantage of the former is much
clearer (note that multiblock<uint32_t, K> is not included in the benchmarks
because it does not get us anything with respect to
multiblock<uint64_t, K>).
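
For the "profile to make sure" step, a minimal timing harness along the following lines may be enough as a starting point. This is only a sketch, not the code used in the benchmark repository: `lookup_result`, `measure_lookups`, `make_keys` and `exact_set_filter` are hypothetical names introduced here, and the only assumption about the filter is that it exposes `insert` and `may_contain`. The stand-in `exact_set_filter` exists so the sketch compiles without Boost; in a real comparison it would be replaced by `vanilla_filter` and `simd_filter` from the question.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>
#include <unordered_set>
#include <vector>

// Hypothetical result type: average lookup cost and number of positive answers.
struct lookup_result {
    double ns_per_lookup;
    std::size_t positives;
};

// Times `rounds` passes of may_contain() over `probes`.
// Assumption: Filter provides `bool may_contain(std::uint64_t) const`.
template <class Filter>
lookup_result measure_lookups(const Filter& f,
                              const std::vector<std::uint64_t>& probes,
                              int rounds)
{
    assert(rounds > 0 && !probes.empty());
    std::size_t positives = 0;
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < rounds; ++r)
        for (std::uint64_t x : probes)
            positives += f.may_contain(x) ? 1u : 0u; // result is used, so the loop is not optimized away
    const auto stop = std::chrono::steady_clock::now();
    const double ns =
        std::chrono::duration<double, std::nano>(stop - start).count();
    return {ns / (static_cast<double>(rounds) * probes.size()), positives};
}

// Random keys with a fixed lowest bit: even keys for insertion, odd keys for
// probing, so probes never hit an inserted key (0% true hit rate).
inline std::vector<std::uint64_t> make_keys(std::size_t n, std::uint64_t seed, bool odd)
{
    std::mt19937_64 rng(seed);
    std::vector<std::uint64_t> v(n);
    for (auto& x : v)
        x = odd ? (rng() | std::uint64_t(1)) : (rng() & ~std::uint64_t(1));
    return v;
}

// Hypothetical stand-in with the same insert/may_contain shape as a filter.
// It is exact (no false positives) and is here only to keep the sketch self-contained.
struct exact_set_filter {
    std::unordered_set<std::uint64_t> s;
    void insert(std::uint64_t x) { s.insert(x); }
    bool may_contain(std::uint64_t x) const { return s.count(x) != 0; }
};
```

To use it, fill each candidate filter with `make_keys(num_inserted, seed, false)`, probe with `make_keys(n, other_seed, true)`, and compare `ns_per_lookup` over several runs. With a real Bloom filter, `positives` divided by the number of lookups also gives a rough check of the observed false positive rate. Given the noise mentioned above, small differences between two runs should not be read as significant.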
Joaquin M Lopez Munoz
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk