Subject: [boost] dynamic_bitset count
From: Andreas Klein (klein_at_[hidden])
Date: 2009-03-16 08:34:05
Dear boost developers,
I checked the implementation of boost::dynamic_bitset::count
and find that it uses a table look up.
This is not optimal.
First on 64-bit machines divide-and-conquer algorithms are faster than
table look up and save even memory.
Second if the bitset contains several words, there are for both 32 and
64 bit machines population count algorithms that are faster than
the simple loop over all words used by boost.
Together with Y. Edel I developed a new version,which saves 60%
against the simple algorithm. We put the code online
(http://cage.ugent.be/~klein/popc.html). A disadvantage of our
implementation is that we relay on aggressive optimization (-O3) of
If you are searching a compromise between speed and code
complexity. The first iteration of the Harley-Seal-method (25% faster
than the simple loop) would be a candidate.
You find this and other population count algorithms on our homepage, too.
If there is interest I can add one of the advance algorithms in
Boost list run by bdawes at acm.org, david.abrahams at rcn.com, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk