Boost logo

Boost :

From: Thomas Wenisch (twenisch_at_[hidden])
Date: 2002-06-17 14:09:49


Has any thought been given to adding a library to support sparse bitsets
as well as dense bitsets supported in dyn_bitset and in the standard
bitset?

Dense bitsets use 1 bit of storage for each set element, and thus are very
space efficient if the bitset has a fairly even distribution of zeros and
ones.

However, there are some applications where the distribution of zeros and
ones (or present in the set vs. not present in the set) are heavily
skewed. For example, only 10 out of 10000 members of a set may be
present. In these cases, a sparse bit set can be more efficient in terms
of space and time for common operations (set union, set intersection,
etc). Sparse bitsets store the indexes of those set elements that are
present/absent from the set. The underlying representation is
generally something like vector<unsigned short>.

A dense bitset is O(log2(N)) in both storage and time for all
operations (where N is the size of the set). A sparse bitset is
O(M) where M is the number of elements present (or not present) in the
set. For large skewed bitsets, M can be less than log2(N).

I have found it useful in the past to have both sparse and dense
implementations of a bit vector with identical interfaces. Thus, the
appropriate bit_vector implementation can be chosen based on what the
programmer knows about the data distribution.

For example, in the source code to the SUIF compiler from Stanford, both
dense and sparse "natural sets" are included with identical interfaces,
and both are heavily used in the data-flow analysis of various
optimization passes.

Do others feel that sparse bit sets would be useful and appropriate with
an interface matching dyn_bitset?

I haven't reviewed the dyn_bitset implementation so I can't give an
opinion on it, but I did want to toss this idea out to see what people
thought.

Regards,
-Tom Wenisch
Computer Architecture Lab
Carnegie Mellon University


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk