Boost logo

Boost :

From: David Bergman (davidb_at_[hidden])
Date: 2002-08-19 19:16:50


The "algebraic" side of me definitely likes the operators "+" and "*". I
vote for those, but consider it to be quite important not to have both
sets, i.e., that you decide to go with either the "algebraic" or
"boolean" side.
 
/David
 
-----Original Message-----
From: boost-bounces_at_[hidden]
[mailto:boost-bounces_at_[hidden]] On Behalf Of spamjunk
Sent: Monday, August 19, 2002 7:09 PM
To: boost_at_[hidden]
Subject: Re: [boost] request for comment on possible submission to
library
 
Yes, I have inplemented all the shortcut assignment operators. I also
allow added elements as you suggested (with +=) or you can even do: s
<< e1 << e2...
[I may have gone a little overboard with the operators].
 
I like your suggestion for membership testing using <= and >=. What
does everyone else think?
 
I have been using + for union and * for intersection. You think | and &
would be better? in addition too?
----- Original Message -----
From: Victor A. Wagner, Jr. <mailto:vawjr_at_[hidden]>
To: boost_at_[hidden]
Sent: Sunday, August 18, 2002 1:03 PM
Subject: Re: [boost] request for comment on possible submission to
library
 
I, for one, would like to see it.
The topic kind of came up recently with respect to the "relational
operators" and dynamic_bitset.

I hope you will also add the operators +=, -=, *= and /=
since the C/C++ operators | (for union) and & (for intersection)
I would suggest that perhaps:
b += red;
should be legal, in addition to the more clumsy
b += ColorSet(red);

I think if (red == b) will cause more confusion than it's worth. If
we allow b += red;
then (b >= red) would mean the same thing as (b.contains(red)) and would
give the notational convenience we're searching for. If you are careful
with the helper functions, we would also be able to write if (red <=
b) which the cognoscenti would pronounce "if red in b".
We should both resist and _strongly_ discourage any suggestion that
"#define in <=" be used :)

in order to not get things confused with std::set, I recommend the word
powerset as the name of the class.

some further comments embedded in your example
At Saturday 2002/08/17 09:17, you wrote:

 I was wondering if anyone would be interested in a Set Library, similar
to Pascal's set type for ordinal types. I want to make sure someone
would fine it useful (and not trivial, done-to-death, etc.) before I
take the time to "Boostify" it and get a release from my employer.

A while ago I needed a set type that was compact, fast, and avoided heap
allocation. The number of elements was constant for a particular set
type, and the element type could be any integral type (primarily enums).
Similar types from STL are set, vector<bool>, and bitset, but none quite
suited the application. My set type (Set) stores elements at the bit
level, like vector<bool> and bitset, but it allows the creation of
compile-time set constants. Also, with a good optimizing
compiler, operations on constant sets will be done at compile-time,
though as of yet, there is no direct support for guaranteeing this in
the code. The class provides the usual set operations.

For efficiency, set operations are done in parallel groups of the
machine's word size. So, set types whose max. number of elements is less
than or equal to the machine's word size, will have most of its
operations done in constant time (usually one machine instruction per
operation). Sets whose max. number of elements is greater than the
machine's word size will be done in linear time, usually in E / Wb
cycles, where E is the max. number of elements and Wb is the machine's
word size in bits. Though, with a good compile (through inlining and
loop unrolling), this linear time would only be E / Wb instructions. On
an average 32-bit processor, that's just 8 instructions to perform the
union or intersection of a set of 256 elements. Originally, I avoided
loops and used metatemplates to inline the operations, but this resulted
in very long compile-times without any gains in performance! And
sometimes it resulted in worse performance. That will teach me not to
try and second guess today's optimizing compilers!

As I stated above, the class template may be instantiated with any
integral type, but it was primarily meant for enumerations with will
give you the added bonuses of type-safely and symbolic names for the
elements. The class assumes that the range given is not disjoint (i.e.,
0-63 and not 0 - 16, 19, 63). You may use disjoint ranges, but your sets
will contain wasted space. The only requirement is that the range of
values is greater than or equal to zero, and that the element type has a
representation for -1 (used as a (hidden) sentinel in argument lists of
elements). You could even conceivably use a class that was "castable" to
int.

For example:

// element type
enum Color {red, orange, yellow, green, blue, violet, NumColors,
NilColor = -1};

it certainly is a shame that we "need" to add NumColors to the enum.
BTW, are you suggesting here that NilColor is part of the enum?

// type for set of colors
typedef Set<Color, NumColors> ColorSet;

typedef Set<Color> ColorSet;
this would certainly be nicer, but.... C/C++ doesn't "know" how many
elements are in an enum or clearly won't let the programmer discover it.

// set constant
const ColorSet rgb = ColorSet::SetOf<red, green, blue>::value(); // this
is done at compile-time

void foo()
{
    // construction
    ColorSet s = ColorSet::SetOf<orange, yellow, violet>::value(); //
this is done at compile-time
    ColorSet s1 = ColorSet(a); // this *may* be at compile-time
    ColorSet s2 = ColorSet(orange, yellow, violet); // this is NOT done
at compile-time

I trust that ColorSet(orange) uses an explicit constructor.

                                                    // (but allows
non-constant elements)

    // operations, some of which *may* be done at compile-time by a
smart compiler
    ColorSet a = s; // assignment
    ColorSet b = s + rgb; // union

        ColorSet bx(s);
        bx += rgb;
        assert (b == bx);

    ColorSet c = s * rgb; // intersection
    ColorSet d = s - rgb; // difference
    ColorSet e = s / rgb; // symmetric difference
    bool b1 = s == rgb; // equality
    bool b1 = s != rgb; // unequality
    bool b3 = s >= rgb; // superset
    bool b4 = s > rgb; // proper superset
    bool b5 = s < rgb; // subset
    bool b6 = s > rgb; // proper subset

    if (b.contains(red)) // test for set membership
        std::cout << "b contains the element red" << std::endl;

    if (red == b) // same as above. can anyone suggest a better
operator?
                  // maybe use red -> b and use NULL for false?
                  // or maybe remove and just use b.contains(red)?

baaaaaad idea.... see start of message

        std::cout << "b contains the element red" << std::endl;
}

What do you think?

Thanks,

Rich
(spamjunk_at_[hidden])
 
Victor A. Wagner Jr. http://rudbek.com <http://rudbek.com/>
PGP RSA fingerprint = 4D20 EBF6 0101 B069 3817 8DBF C846 E47A
PGP D-H fingerprint = 98BC 65E3 1A19 43EC 3908 65B9 F755 E6F4 63BB 9D93
The five most dangerous words in the English language:
              "There oughta be a law"



Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk