Boost logo

Boost :

From: joel de guzman (djowel_at_[hidden])
Date: 2001-12-29 12:41:21


----- Original Message -----
From: "Nathan Myers" :

> joel de guzman wrote:
> > I posted something similar a few months ago:
> > http://groups.yahoo.com/group/boost/message/13490
> >
> > It is a full-fledged sparse bit set implementation using closed ranges.
> > It is now being used by the Spirit parser to allow 16/32 bit character
> > set support. For instance, in XML, there are lots of rules that work on
> > character ranges. The implementation has matured significantly since
> > then. It supports all the set operators inverse, union, intersection,
> > difference and xor. It has specializations for 8 bit ranges. Unlike the
> > extent_set being discussed, overlapping ranges can be added and
> > removed. The implementation coalesces and splits ranges as necessary.
> > Speed is 0(log N) where N is the number of ranges and O(1) for
> > 8 bit ranges. Internally the implementation uses vectors.
> >
> > Half-open ranges are unnatural for Spirit's use since grammars
> > such as XML are often defined using Reg-ex style character ranges.
> > Example:
> >
> > chset_t Char(L"\x9\xA\xD\x20-\xD7FF\xE000-\xFFFD\x10000-\x10FFFF");
> >
> > which is also equivalent to:
> >
> > chset_t Char =
> > ch_p(0x9) | 0xA | 0xD
> > | range_p(0x20,0xD7FF)
> > | range_p(0xE000,0xFFFD)
> > | range_p(0x10000,0x10FFFF)
> > ;
>
> This looks interesting. Can I see the code? The link in the referenced
> posting didn't work.
>
> A bit set implementation (sparse or otherwise) wouldn't be useful in my
> application. It will be easy to add general set operations to extent_set
> if anybody finds them valuable; in my application the greater efficiency
> of the more-restricted interface matters, and the preconditions matter not
> at all.
>
> Bit set and extent set representations are both useful, for different
> things. What happened with your submission?

You can see the code latest Spirit code at: http://spirit.sf.net/
You can peek at the CVS to get the chset implementation only.
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/spirit/spirit/boost/spirit/ch
set.hpp
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/spirit/spirit/boost/spirit/im
pl/chset.ipp

The actual code, although thoroughly tested thus far, is still coupled
to the parser code. I've yet to submit a stand-alone version, but spirit
folks are pushing me :-) to decouple many of the underlying infrastructure.
I agree. Too much too do, so little time, sigh....

--Joel


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk