Subject: Re: [boost] [dynamic_bitset] Using of GCC built-in functions may increase performance of some methods
From: Andrey Semashev (andrey.semashev_at_[hidden])
Date: 2018-08-23 09:45:20


On 08/23/18 05:08, Gavin Lambert via Boost wrote:
> On 23/08/2018 09:16, Andrey Semashev wrote:
>> I think such an optimization would be useful. Note that MSVC also has
>> intrinsics for popcount[1], although I don't think those are supported
>> when the target CPU doesn't implement the corresponding instructions.
>> You would have to check at compile time whether the target CPU
>> supports it (e.g. by checking if __AVX__ is defined).
>
> Compile-time detection is better when you can do it, because it lets
> the call be completely inlined. If compile-time detection fails, you
> can still do runtime detection, e.g. by defining something like:
>
> // header file
> extern int (*popcnt64)(uint64_t);
>
> // source file
> static bool is_popcnt_supported()
> {
>     int info[4] = { 0 };
>     __cpuid(info, 1);
>     return (info[2] & 0x00800000) != 0;
> }
>
> static int popcnt64_intrinsic(uint64_t v)
> {
>     return /* _mm_popcnt_u64(v) or __builtin_popcountll(v) */;
> }
>
> static int popcnt64_emulation(uint64_t v)
> {
>     // code that calculates it with bit twiddling
> }
>
> static int popcnt64_auto(uint64_t v)
> {
>     popcnt64 = is_popcnt_supported()
>         ? &popcnt64_intrinsic
>         : &popcnt64_emulation;
>     return popcnt64(v);
> }
>
> int (*popcnt64)(uint64_t) = &popcnt64_auto;
>
> Repeat for other argument sizes as needed.  You could probably do
> something fancier with C++11 guaranteed static initialisation, but this
> will work on all compilers.

This code requires a separately compiled translation unit, because the
non-inline popcnt64 pointer must be defined exactly once. The same
dispatch can be done in header-only style, though.
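For illustration, here is a minimal header-only sketch of that idea. It is one possible approach, not necessarily what dynamic_bitset would adopt. It assumes C++11 (thread-safe initialization of function-local statics) and an x86 target for the hardware path. The namespace popcnt_sketch and the POPCNT_SKETCH_* macros are hypothetical names made up for this example. On GCC/Clang the target("popcnt") attribute is used so the POPCNT instruction can be emitted without compiling the whole program with -mpopcnt.

```cpp
#include <cstdint>

// Hypothetical helper macros for this sketch only.
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#  include <intrin.h>
#  define POPCNT_SKETCH_MSVC_X86 1
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#  include <cpuid.h>
#  define POPCNT_SKETCH_GCC_X86 1
#endif

namespace popcnt_sketch {

typedef int (*popcnt64_fn)(std::uint64_t);

// Portable fallback: the classic parallel (SWAR) bit count.
inline int popcnt64_emulation(std::uint64_t v)
{
    v = v - ((v >> 1) & 0x5555555555555555ull);
    v = (v & 0x3333333333333333ull) + ((v >> 2) & 0x3333333333333333ull);
    v = (v + (v >> 4)) & 0x0F0F0F0F0F0F0F0Full;
    return static_cast<int>((v * 0x0101010101010101ull) >> 56);
}

#if defined(POPCNT_SKETCH_MSVC_X86)
inline int popcnt64_intrinsic(std::uint64_t v)
{
#  if defined(_M_X64)
    return static_cast<int>(__popcnt64(v));
#  else
    // 32-bit target: count the two halves separately.
    return static_cast<int>(__popcnt(static_cast<unsigned int>(v)) +
                            __popcnt(static_cast<unsigned int>(v >> 32)));
#  endif
}
#elif defined(POPCNT_SKETCH_GCC_X86)
// Enable POPCNT for this one function only.
__attribute__((target("popcnt")))
inline int popcnt64_intrinsic(std::uint64_t v)
{
    return __builtin_popcountll(v);
}
#endif

// CPUID leaf 1, ECX bit 23 reports POPCNT support.
inline bool is_popcnt_supported()
{
#if defined(POPCNT_SKETCH_MSVC_X86)
    int info[4] = { 0 };
    __cpuid(info, 1);
    return (info[2] & (1 << 23)) != 0;
#elif defined(POPCNT_SKETCH_GCC_X86)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;
    return (ecx & (1u << 23)) != 0;
#else
    return false;
#endif
}

inline popcnt64_fn select_popcnt64()
{
#if defined(POPCNT_SKETCH_MSVC_X86) || defined(POPCNT_SKETCH_GCC_X86)
    if (is_popcnt_supported())
        return &popcnt64_intrinsic;
#endif
    return &popcnt64_emulation;
}

// The function-local static is initialized once, on first call, and is
// shared by all translation units because the function is inline.
inline int popcnt64(std::uint64_t v)
{
    static const popcnt64_fn impl = select_popcnt64();
    return impl(v);
}

} // namespace popcnt_sketch
```

Usage is just popcnt_sketch::popcnt64(x). Compared with the self-replacing function pointer above, this version pays for the static initialization guard check on every call in addition to the indirect call, so whether it is a net win over the plain bit-twiddling version would have to be measured.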

