Boost Users :

Date view	Thread view	Subject view	Author view

From: DrXenos (spamjunk_at_[hidden])
Date: 2002-08-16 21:53:27

Next message: Eric Niebler: "Re: regex performance"
Previous message: Peter Dimov: "Re: [Boost-Users] detail/lightweight_mutex.hpp"

I was wondering if anyone would be interested in a Set Library,
similar to Pascal's set type for ordinal types. I want to make sure
someone would fine it useful (and not trivial, done-to-death, etc.)
before I take the time to "Boostify" it and get a release from my
employer.

A while ago I needed a set type that was compact, fast, and avoided
heap allocation. The number of elements was constant for a particular
set type, and the element type could be any integral type (primarily
enums). Similar types from STL are set, vector<bool>, and bitset, but
none quite suited the application. My set type (Set) stores elements
at the bit level, like vector<bool> and bitset, but it allows the
creation of compile-time set constants. Also, with a good optimizing
compiler, operations on constant sets will be done at compile-time,
though as of yet, there is no direct support for guaranteeing this in
the code. The class provides the usual set operations.

For efficiency, set operations are done in parallel groups of the
machine's word size. So, set types whose max. number of elements is
less than or equal to the machine's word size, will have most of its
operations done in constant time (usually one machine instruction per
operation). Sets whose max. number of elements is greater than the
machine's word size will be done in linear time, usually in E / Wb
cycles, where E is the max. number of elements and Wb is the machine's
word size in bits. Though, with a good compile (through inlining and
loop unrolling), this linear time would only be E / Wb instructions.
On an average 32-bit processor, that's just 8 instructions to perform
the union or intersection of a set of 256 elements. Originally, I
avoided loops and used metatemplates to inline the operations, but
this resulted in very long compile-times without any gains in
performance! And sometimes it resulted in worse performance. That
will teach me not to try and second guess today's optimizing compilers!

As I stated above, the class template may be instantiated with any
integral type, but it was primarily meant for enumerations with will
give you the added bonuses of type-safely and symbolic names for the
elements. The class assumes that the range given is not disjoint
(i.e., 0-63 and not 0 - 16, 19, 63). You may use disjoint ranges, but
your sets will contain wasted space. The only requirement is that the
range of values is greater than or equal to zero, and that the element
type has a representation for -1 (used as a (hidden) sentinel in
argument lists of elements). You could even conceivably use a class
that was "castable" to int.

For example:

// element type
enum Color {red, orange, yellow, green, blue, violet, NumColors,
NilColor = -1};

// type for set of colors
typedef Set<Color, NumColors> ColorSet;

// set constant
const ColorSet rgb = ColorSet::SetOf<red, green, blue>::value(); //
this is done at compile-time

void foo()
{
    // construction
    ColorSet s = ColorSet::SetOf<orange, yellow, violet>::value(); //
this is done at compile-time
    ColorSet s1 = ColorSet(a); // this *may* be at compile-time
    ColorSet s2 = ColorSet(orange, yellow, violet); // this is NOT
done at compile-time

// (but allows non-constant elements)

    // operations, some of which *may* be done at compile-time by a
smart compiler
    ColorSet a = s; // assignment
    ColorSet b = s + rgb; // union
    ColorSet c = s * rgb; // intersection
    ColorSet d = s - rgb; // difference
    ColorSet e = s / rgb; // symmetric difference
    bool b1 = s == rgb; // equality
    bool b1 = s != rgb; // unequality
    bool b3 = s >= rgb; // superset
    bool b4 = s > rgb; // proper superset
    bool b5 = s < rgb; // subset
    bool b6 = s > rgb; // proper subset

if (b.contains(red)) // test for set membership
std::cout << "b contains the element red" << std::endl;

    if (red == b) // same as above. can anyone suggest a better
operator?
                      // maybe use red -> b and use NULL for false?
                      // or maybe remove and just use b.contains(red)?
        std::cout << "b contains the element red" << std::endl;
}

What do you think?

Thanks,

Rich
(spamjunk_at_[hidden])

Next message: Eric Niebler: "Re: regex performance"
Previous message: Peter Dimov: "Re: [Boost-Users] detail/lightweight_mutex.hpp"

Date view	Thread view	Subject view	Author view

Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net