Boost logo

Boost :

Subject: [boost] Add an adapter to cover fast bit functions
From: Evgeny Shulgin (izaronplatz_at_[hidden])
Date: 2018-08-26 23:32:10

Hi all, some days ago I opened a thread about DynamicBitset[1] (where I
noted that a small piece of code may double the speed of a function), which
received some interesting answers (you guys rock!)

I looked deeper and realized that Boost doesn't have functionality to do
common bit operations *really* fast.
Let me explain - there are several places in Boost where we needed to solve
typical tasks connected with bits in integers - like "get log2 of an
integer rounded down" or "get the number of trailing zeros" or "count
number of 1-bits", and Boost contributors have been inventing there a wheel
everywhere, with varying degrees of optimality: [2] - [6].

Moreover, if a programmer writes a cross-platform app and has to do smth
with this stuff, it's likely that for bit operations he will write a
"naive" implementations - e.g. getting the log2(x) for O(log N), when it
could be done for O(log log N), but with a more complicated code.

Also, as well as "smart" algos, there are other methods to use - precalced
tables, taking a small additional memory when needed [6], and intrinsics
(that are CPU-specific).
Combining of these methods (intrinsics when conveniently, or tables/algos)
gives a notable shift in performance - here's my benchmark [7] from my tiny
library [8]. Performance here varies from *9.2%* to *36.9%* of speed of
related naive functions.

I can tell you a usercase - as a hobby I'm writing a chess variants engine
(where bitsets used to represent chess board, and its size is *not* fixed
to 64), and bit operations' speed there is a matter of life and death.

So, what do you think of an adapter header in Boost, covering all these
operation of all integer types (and maybe some other one-liners like
`is_bit_set(x, pos)` ), which keeps programmers out of platform specifics?
It may look as smth like this [8] (I wrote only 3 functions, but it's
enough to show the potential).

I guess it can be placed somewhere to *Boost.Algorithm*, and supported with
CI build tests of different platforms.

P.S. The (incomplete) list of bit functions, that will give a *notable*
better performance compared with naive algos and even some "smart" (because
of precalc/intrinsics):
- count of bits
- parity of bit count
- first 1-bit
- last 1-bit (aka integer_log2(x))
- leading zeros
- trailing zeros
- swap(x) - returns x with the order of the bytes reversed; for example,
0xaabb becomes 0xbbaa. (bytes here are 8 bits exactly)
- rotate_left(x, shift)
- rotate_right(x, shift)
Other functions may look like `is_set(x, pos) -> return x & (1 << pos)` and
aren't so interesting.

(really almost the whole file)

Boost list run by bdawes at, gregod at, cpdaniel at, john at