Boost logo

Boost :

From: Nathan Myers (ncm-yahoospam_at_[hidden])
Date: 2001-12-28 18:48:02


joel de guzman wrote:
> I posted something similar a few months ago:
> http://groups.yahoo.com/group/boost/message/13490
>
> It is a full-fledged sparse bit set implementation using closed ranges.
> It is now being used by the Spirit parser to allow 16/32 bit character
> set support. For instance, in XML, there are lots of rules that work on
> character ranges. The implementation has matured significantly since
> then. It supports all the set operators inverse, union, intersection,
> difference and xor. It has specializations for 8 bit ranges. Unlike the
> extent_set being discussed, overlapping ranges can be added and
> removed. The implementation coalesces and splits ranges as necessary.
> Speed is 0(log N) where N is the number of ranges and O(1) for
> 8 bit ranges. Internally the implementation uses vectors.
>
> Half-open ranges are unnatural for Spirit's use since grammars
> such as XML are often defined using Reg-ex style character ranges.
> Example:
>
> chset_t Char(L"\x9\xA\xD\x20-\xD7FF\xE000-\xFFFD\x10000-\x10FFFF");
>
> which is also equivalent to:
>
> chset_t Char =
> ch_p(0x9) | 0xA | 0xD
> | range_p(0x20,0xD7FF)
> | range_p(0xE000,0xFFFD)
> | range_p(0x10000,0x10FFFF)
> ;

This looks interesting. Can I see the code? The link in the referenced
posting didn't work.

A bit set implementation (sparse or otherwise) wouldn't be useful in my
application. It will be easy to add general set operations to extent_set
if anybody finds them valuable; in my application the greater efficiency
of the more-restricted interface matters, and the preconditions matter not
at all.

Bit set and extent set representations are both useful, for different
things. What happened with your submission?

Nathan Myers
ncm at cantrip dot org


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk