Subject: Re: [boost] [dynamic_bitset] Using of GCC built-in functions may increase performance of some methods
From: Gavin Lambert (boost_at_[hidden])
Date: 2018-08-23 02:08:40
On 23/08/2018 09:16, Andrey Semashev wrote:
> I think such an optimization would be useful. Note that MSVC also has
> intrinsics for popcount[1], although I don't think those are supported
> when the target CPU doesn't implement the corresponding instructions.
> You would have to check at compile time whether the target CPU
> supports it (e.g. by checking if __AVX__ is defined).
Compile-time detection is better when you can do it, because it lets
the call be completely inlined. If compile-time detection fails, you
can still fall back to runtime detection, e.g. by defining something like:
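// header file
#include <stdint.h>
extern int (*popcnt64)(uint64_t);

// source file
#include <intrin.h>     // __cpuid (MSVC form; GCC/Clang have __get_cpuid in <cpuid.h>)
#include <nmmintrin.h>  // _mm_popcnt_u64

static bool is_popcnt_supported()
{
    int info[4] = { 0 };
    __cpuid(info, 1);
    return (info[2] & 0x00800000) != 0;  // CPUID leaf 1, ECX bit 23 = POPCNT
}

static int popcnt64_intrinsic(uint64_t v)
{
    // x64 only; on GCC/Clang use __builtin_popcountll(v) instead
    return (int)_mm_popcnt_u64(v);
}

static int popcnt64_emulation(uint64_t v)
{
    // calculate it with bit twiddling (parallel bit count)
    v = v - ((v >> 1) & 0x5555555555555555ULL);
    v = (v & 0x3333333333333333ULL) + ((v >> 2) & 0x3333333333333333ULL);
    v = (v + (v >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    return (int)((v * 0x0101010101010101ULL) >> 56);
}

// first call: pick an implementation, then forward to it
static int popcnt64_auto(uint64_t v)
{
    popcnt64 = is_popcnt_supported()
        ? &popcnt64_intrinsic
        : &popcnt64_emulation;
    return popcnt64(v);
}

int (*popcnt64)(uint64_t) = &popcnt64_auto;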
Repeat for other argument sizes as needed. You could probably do
something fancier with C++11 guaranteed static initialisation, but this
will work on all compilers.
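
For what it's worth, the C++11 variant mentioned above could look roughly like the following. This is only a sketch under a few assumptions: it targets GCC/Clang on x86 (using __get_cpuid from <cpuid.h> and the target("popcnt") function attribute) and simply falls back to the emulation everywhere else. The popcnt_sketch namespace and the select_popcnt64 helper are names made up for illustration.

```cpp
#include <cstdint>

// Assumption: GCC or Clang on x86; other toolchains take the emulation path.
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#  define POPCNT_SKETCH_X86 1
#  include <cpuid.h>   // __get_cpuid
#endif

namespace popcnt_sketch {   // hypothetical namespace, for illustration only

typedef int (*popcnt64_fn)(std::uint64_t);

// Portable fallback: parallel (SWAR) bit count.
inline int popcnt64_emulation(std::uint64_t v)
{
    v = v - ((v >> 1) & 0x5555555555555555ULL);
    v = (v & 0x3333333333333333ULL) + ((v >> 2) & 0x3333333333333333ULL);
    v = (v + (v >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    return static_cast<int>((v * 0x0101010101010101ULL) >> 56);
}

#ifdef POPCNT_SKETCH_X86
// Compiled with POPCNT enabled for this one function only, so the rest
// of the translation unit still runs on CPUs without the instruction.
__attribute__((target("popcnt")))
inline int popcnt64_intrinsic(std::uint64_t v)
{
    return __builtin_popcountll(v);
}

// CPUID leaf 1, ECX bit 23 reports POPCNT support.
inline bool is_popcnt_supported()
{
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;
    return (ecx & (1u << 23)) != 0;
}
#endif

// Runs the runtime check and returns the implementation to use.
inline popcnt64_fn select_popcnt64()
{
#ifdef POPCNT_SKETCH_X86
    if (is_popcnt_supported())
        return &popcnt64_intrinsic;
#endif
    return &popcnt64_emulation;
}

inline int popcnt64(std::uint64_t v)
{
    // C++11 guarantees this initialiser runs exactly once, even if
    // several threads make the first call concurrently.
    static const popcnt64_fn impl = select_popcnt64();
    return impl(v);
}

} // namespace popcnt_sketch
```

With this, popcnt_sketch::popcnt64(0xF0F0) returns 8 whichever implementation gets selected. The trade-off is that every call goes through the static's initialisation guard rather than a self-replacing pointer, in exchange for the selection being thread-safe.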