Subject: Re: [boost] [dynamic_bitset] Using of GCC built-in functions may increase performance of some methods
From: Gavin Lambert (boost_at_[hidden])
Date: 2018-08-23 02:08:40

On 23/08/2018 09:16, Andrey Semashev wrote:
> I think such an optimization would be useful. Note that MSVC also has
> intrinsics for popcount[1], although I don't think those are supported
> when the target CPU doesn't implement the corresponding instructions.
> You would have to check at compile time whether the target CPU
> supports it (e.g. by checking if __AVX__ is defined).

Compile-time detection is better when you can do it, because it lets the
call be completely inlined. If compile-time detection fails, you can
still do runtime detection, e.g. by defining something like:

// header file
extern int (*popcnt64)(uint64_t);

// source file
static bool is_popcnt_supported()
{
    int info[4] = { 0 };
    __cpuid(info, 1);                     // MSVC form, from <intrin.h>
    return (info[2] & 0x00800000) != 0;   // CPUID leaf 1, ECX bit 23: POPCNT
}

static int popcnt64_intrinsic(uint64_t v)
{
    return /* _mm_popcnt_u64(v) or __builtin_popcountll(v) */;
}

static int popcnt64_emulation(uint64_t v)
{
    // code that calculates it with bit twiddling
}

// First call lands here, picks an implementation, and repoints popcnt64
// so that later calls go straight to the chosen function.
static int popcnt64_auto(uint64_t v)
{
    popcnt64 = is_popcnt_supported()
        ? &popcnt64_intrinsic
        : &popcnt64_emulation;
    return popcnt64(v);
}

int (*popcnt64)(uint64_t) = &popcnt64_auto;
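
For illustration, a filled-in version of the sketch above might look like
the following. It rests on a few assumptions that are not in the post
itself: the fallback is the usual SWAR bit-twiddling count, GCC/Clang
builds use __builtin_cpu_supports and a target("popcnt") attribute,
MSVC x64 builds use __popcnt64, and any other target just uses the
emulation. Treat it as one possible shape, not a tested Boost patch.

```cpp
#include <cassert>
#include <cstdint>

#if defined(_MSC_VER) && defined(_M_X64)
#include <intrin.h>
#define POPCNT_SKETCH_MSVC 1
#elif defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define POPCNT_SKETCH_GNU_X86 1
#endif

// Portable fallback: classic SWAR population count.
static int popcnt64_emulation(uint64_t v)
{
    v = v - ((v >> 1) & 0x5555555555555555ULL);
    v = (v & 0x3333333333333333ULL) + ((v >> 2) & 0x3333333333333333ULL);
    v = (v + (v >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    return (int)((v * 0x0101010101010101ULL) >> 56);
}

#if defined(POPCNT_SKETCH_MSVC)
static bool is_popcnt_supported()
{
    int info[4] = { 0 };
    __cpuid(info, 1);
    return (info[2] & 0x00800000) != 0;   // CPUID leaf 1, ECX bit 23
}
static int popcnt64_intrinsic(uint64_t v)
{
    return (int)__popcnt64(v);
}
#elif defined(POPCNT_SKETCH_GNU_X86)
static bool is_popcnt_supported()
{
    __builtin_cpu_init();   // harmless if already initialised
    return __builtin_cpu_supports("popcnt") != 0;
}
// The target attribute lets this one function use the POPCNT
// instruction even if the rest of the file is built without -mpopcnt.
__attribute__((target("popcnt")))
static int popcnt64_intrinsic(uint64_t v)
{
    return __builtin_popcountll(v);
}
#else
// Unknown compiler or architecture: never claim hardware support.
static bool is_popcnt_supported() { return false; }
static int popcnt64_intrinsic(uint64_t v) { return popcnt64_emulation(v); }
#endif

static int popcnt64_auto(uint64_t v);

// Starts out pointing at the dispatcher.
int (*popcnt64)(uint64_t) = &popcnt64_auto;

static int popcnt64_auto(uint64_t v)
{
    popcnt64 = is_popcnt_supported()
        ? &popcnt64_intrinsic
        : &popcnt64_emulation;
    assert(popcnt64 != &popcnt64_auto);   // dispatcher must replace itself
    return popcnt64(v);
}
```

Callers just write popcnt64(x). The first call pays for the CPUID
check; every later call is a single indirect call to whichever
implementation was chosen.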

Repeat for other argument sizes as needed. You could probably do
something fancier with C++11 guaranteed static initialisation, but this
will work on all compilers.
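
One way the C++11 variant could look is sketched below. It relies on
the C++11 rule that a function-local static is initialised exactly once,
even if several threads make the first call concurrently. The names
(popcnt64_static, select_popcnt64_impl and so on) are invented for this
example, and the detection code assumes GCC/Clang on x86 or MSVC on x64,
with plain emulation everywhere else.

```cpp
#include <cassert>
#include <cstdint>

#if defined(_MSC_VER) && defined(_M_X64)
#include <intrin.h>
#endif

typedef int (*popcnt64_fn)(uint64_t);

// Portable fallback (SWAR bit counting).
static int popcnt64_soft(uint64_t v)
{
    v = v - ((v >> 1) & 0x5555555555555555ULL);
    v = (v & 0x3333333333333333ULL) + ((v >> 2) & 0x3333333333333333ULL);
    v = (v + (v >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    return (int)((v * 0x0101010101010101ULL) >> 56);
}

#if defined(_MSC_VER) && defined(_M_X64)
static int popcnt64_hw(uint64_t v) { return (int)__popcnt64(v); }
static bool cpu_has_popcnt()
{
    int info[4] = { 0 };
    __cpuid(info, 1);
    return (info[2] & 0x00800000) != 0;   // CPUID leaf 1, ECX bit 23
}
#elif defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
__attribute__((target("popcnt")))
static int popcnt64_hw(uint64_t v) { return __builtin_popcountll(v); }
static bool cpu_has_popcnt()
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("popcnt") != 0;
}
#else
static int popcnt64_hw(uint64_t v) { return popcnt64_soft(v); }
static bool cpu_has_popcnt() { return false; }
#endif

// Runs once; decides which implementation this process will use.
static popcnt64_fn select_popcnt64_impl()
{
    return cpu_has_popcnt() ? &popcnt64_hw : &popcnt64_soft;
}

static int popcnt64_static(uint64_t v)
{
    // C++11: initialised on first call, once, in a thread-safe way.
    static const popcnt64_fn impl = select_popcnt64_impl();
    assert(impl != nullptr);
    return impl(v);
}
```

Compared with the self-replacing function pointer, this trades the
mutable global for the compiler's own "already initialised?" check on
each call. Which of the two is cheaper in practice would need measuring.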
