From: John Maddock (John_Maddock_at_[hidden])
Date: 2002-02-10 08:12:53


>I'd like to work on classes for representing ranges. Therefore, I'd like to
>start a discussion on what a range should do, what the specs are, etc. Is
>there enough interest in boost to do this?

Yes absolutely.

I'm going to go off on a slight tangent though, because in several places we
require ranges of strings (regex, tokeniser, and Darin Adler's string
algorithms), and there is as yet no standard way to represent them.

regex uses something like a pair of iterators.
tokeniser just copies the range into a std::string (but can easily be
modified to use something else).
DA's string algorithms use a kind of substring (but one that's inherently
tied to std::basic_string).

It seems to me that we've all been trying to avoid the issue as much as
possible, so recently I've begun work on a substring class - one that
represents any iterator range - but that "looks like" a string.

It's about two-thirds implemented right now, but the main features are:

interface:
~~~~~~

template<class ForwardIterator, class traits =
typename detail::substring_traits_selector<ForwardIterator>::type>
class basic_substring;

This should speak for itself. The substring_traits_selector class is just a
way to select the correct std::char_traits template (you need to strip
cv-qualifiers from the iterator's value_type, which makes the implementation
non-trivial). The ForwardIterator type is restricted to const-iterators;
this seeming restriction is deliberate and, IMO, does not affect the
usefulness of the class. It also prevents const-correctness being broken
(otherwise a const substring would still have non-const iterators).
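
To make the traits selection concrete, here is a minimal sketch of what such
a selector could look like. It is not the actual implementation: it assumes
a C++11 <type_traits> header (std::remove_cv), which was not available when
this was posted, and the member names raw_type and char_type are made up for
illustration.

```cpp
#include <iterator>
#include <string>
#include <type_traits>

namespace detail {

// Hypothetical sketch: choose std::char_traits for the iterator's value
// type after removing any const/volatile qualifiers from it.
template <class ForwardIterator>
struct substring_traits_selector {
    // Value type as reported by the iterator (may be cv-qualified on
    // some older library implementations).
    typedef typename std::iterator_traits<ForwardIterator>::value_type raw_type;
    // Character type with cv-qualifiers stripped.
    typedef typename std::remove_cv<raw_type>::type char_type;
    // The traits class a basic_substring would use by default.
    typedef std::char_traits<char_type> type;
};

} // namespace detail
```

With this sketch, both const char* and std::wstring::const_iterator map to
the char_traits specialisation for their underlying character type.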

members:
~~~~~~~

Roughly you can do anything with a substring that you can with a const
std::string, except call c_str() or data(). I've also not implemented the
find* member functions yet (I'm ambivalent about whether they should be
present). You can also compare a substring to any "string like object",
add it to any "string like object" to create a std::basic_string<>, and
construct from any "string like object" that has the right iterator type.
By "string like object" I mean any type that has a specialisation of
string_traits defined. string_traits is a really tiny traits class that
allows a string like object to be converted to an iterator pair;
specialisations are provided for std::basic_string, substring, charT*,
char, and wchar_t. Other specialisations for vendor specific string types
allow substring (and any substring or iterator based algorithm) to
inter-operate with that string type. In other words, if you're stuck with
Microsoft's CString or Borland's AnsiString, you would still be able
to use these with any string algorithms we provide (including regex
eventually).
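
As a rough illustration of the string_traits idea, the sketch below shows a
tiny traits class that turns a "string like object" into an iterator pair,
plus one generic function written against it. The member names (iterator,
range) and the length_of helper are assumptions made for this example only;
the real interface may well differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <iterator>
#include <string>
#include <utility>

// Hypothetical sketch: the primary template is left undefined, so only
// types with a specialisation count as "string like".
template <class T> struct string_traits;

// Specialisation for std::basic_string.
template <class charT, class Tr, class A>
struct string_traits<std::basic_string<charT, Tr, A> > {
    typedef typename std::basic_string<charT, Tr, A>::const_iterator iterator;
    static std::pair<iterator, iterator>
    range(const std::basic_string<charT, Tr, A>& s) {
        return std::pair<iterator, iterator>(s.begin(), s.end());
    }
};

// Specialisation for null-terminated narrow strings.
template <>
struct string_traits<const char*> {
    typedef const char* iterator;
    static std::pair<iterator, iterator> range(const char* s) {
        return std::pair<iterator, iterator>(s, s + std::strlen(s));
    }
};

// A generic algorithm written once against the iterator pair.
template <class S>
std::size_t length_of(const S& s) {
    typedef typename string_traits<S>::iterator iterator;
    std::pair<iterator, iterator> r = string_traits<S>::range(s);
    return static_cast<std::size_t>(std::distance(r.first, r.second));
}

// Small usage check for the two specialisations above.
inline void string_traits_example() {
    const char* p = "abc";
    assert(length_of(std::string("hello")) == 5);
    assert(length_of(p) == 3);
}
```

Supporting a vendor string type would then only need one more
specialisation of string_traits; length_of and any similar algorithm would
not have to change.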

State:
~~~~

Perhaps surprisingly a substring has three states:

A valid non-empty range.
A valid but empty range.
An invalid range (begin() and end() return singular iterators).

The last is required because in string algorithms (particularly regexes)
returning the end of sequence iterator to denote "no match found" is not
acceptable - the end of sequence iterator may denote a valid albeit zero
length match.
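
The difference between "empty" and "invalid" can be sketched as follows.
This is only an illustration under stated assumptions: simple_substring,
find_run and match_star are hypothetical names invented here, and the
explicit bool flag is just one possible way to represent the invalid state.

```cpp
#include <cassert>
#include <string>

// Hypothetical minimal range with an explicit "invalid" state.
template <class Iterator>
class simple_substring {
public:
    // Invalid range: the stored iterators are value-initialised and must
    // not be dereferenced or compared.
    simple_substring() : first_(), last_(), valid_(false) {}
    // Valid range, possibly of zero length.
    simple_substring(Iterator f, Iterator l)
        : first_(f), last_(l), valid_(true) {}

    bool valid() const { return valid_; }
    // Short-circuits on validity so singular iterators are never compared.
    bool empty() const { return !valid_ || first_ == last_; }
    Iterator begin() const { return first_; }
    Iterator end() const { return last_; }

private:
    Iterator first_, last_;
    bool valid_;
};

typedef std::string::const_iterator iter;

// Finds the first run of one or more `c`; returns an invalid range when
// `c` does not occur at all ("no match found").
inline simple_substring<iter> find_run(const std::string& s, char c) {
    iter i = s.begin();
    while (i != s.end() && *i != c) ++i;
    if (i == s.end()) return simple_substring<iter>();
    iter j = i;
    while (j != s.end() && *j == c) ++j;
    return simple_substring<iter>(i, j);
}

// Matches zero or more `c` starting at `pos`, like a regex "c*" anchored
// there. This always succeeds, possibly with a zero length match.
inline simple_substring<iter> match_star(const std::string& s, iter pos, char c) {
    iter j = pos;
    while (j != s.end() && *j == c) ++j;
    return simple_substring<iter>(pos, j);
}
```

Here match_star(text, text.end(), 'z') is a valid match whose begin() and
end() both equal text.end(), whereas find_run(text, 'z') on a text without
'z' is invalid. An end-of-sequence iterator alone could not tell these two
results apart.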

Anyway this is maybe a little sketchy for now, but I hope to be able to
post a more detailed description soon. The reason for posting now is that
I'm not sure what, if any, overlap there is between a substring class and a
range class. My gut feeling for now is that these are like vector and
string - similar in concept, and sharing some of the same interface, but
otherwise unrelated. On the other hand, there is a case to be made that a
substring "is a" range object. As usual any thoughts are welcome.

- John Maddock
http://ourworld.compuserve.com/homepages/john_maddock/

