From: Pavol Droba (droba_at_[hidden])
Date: 2004-06-06 06:26:34

On Sat, Jun 05, 2004 at 08:46:54PM -0400, Gennadiy Rozental wrote:
> "Pavol Droba" <droba_at_[hidden]> wrote in message
> > Hi,
> >
> > On Sat, Jun 05, 2004 at 08:16:17AM -0400, Gennadiy Rozental wrote:
> > > Hi,
> > >
> > > For long time in my daytime projects I was using my class token_iterator
> > > designed based on old iterator_adaptor design (adopted for old compilers).
> > > Now when need arose for such functionality in Boost.Test. I looked in
> > > direction of boost::token_iterator. After some struggle I end up adopting my
> > > version to new design. Here some of comments and issues that made me opt so.
> > >
> >
> > [snip]
> >
> > This might seem a little bit out of topic, but some of the points you have
> > addressed are already solved by find/split_iterator provided in the string algo
> > library. Maybe you can have a look there. Unfortunately, the documentation is not
> > 100% ready yet.
> >
> Looks like I was not able to express myself clearly in original post. But
> actually what you suggesting is especially what I was arguing *against*.

Maybe I misunderstood you. Or maybe you misunderstood the concepts in the string_algo library.

> 1. Your library is dedicated to string algorithms. Why is it use range if
> iterators so frequently then? As an input and an output. I did found that
> having range of it iterators as a model of substring is very convenient.
> That what my basic_cstring is used for (actually, I may say that in my
> development this is single most widely used class). But it is full-fledged
> string as good as std::basic_string. And it does not based in iterators, but
> on character type.
> Iterator range does may be usable, BUT outside of string algo library.

It was clearly stated during the review that "string" is not equal to std::basic_string.
The string algorithm library is not a std::basic_string extension; rather, it
is a set of string-related algorithms. However, with the help of collection/range
traits it provides almost the same level of convenience as if it were
specialized for a single string class.

Narrowing the implementation would be a giant leap back.

As for the iterator range: it provides a reference into the original string.
It is the most efficient way of doing so, because you don't pay anything if you
just want to read the content there. For other operations you can simply use
boost::copy_iterator_range to extract the match into your favorite container.
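
To illustrate the point, here is a minimal sketch in plain standard C++. It is not the library's code: range_view, copy_out and find_first_view are hypothetical names made up for this example. The view is just two iterators into the original string, so reading through it costs nothing extra, and a copy is made only when you ask for one.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an iterator range: two iterators into storage
// owned by someone else. Nothing is copied when a view is created.
template <typename Iterator>
struct range_view {
    Iterator first;
    Iterator last;
    Iterator begin() const { return first; }
    Iterator end() const { return last; }
};

// Copies the referenced characters into a container of the caller's choice.
template <typename Container, typename Iterator>
Container copy_out(const range_view<Iterator>& r) {
    return Container(r.begin(), r.end());
}

// Returns a view of the first occurrence of `needle` in `text`,
// or an empty view positioned at the end when there is no match.
// `text` must outlive the returned view.
inline range_view<std::string::const_iterator>
find_first_view(const std::string& text, const std::string& needle) {
    std::string::size_type pos = text.find(needle);
    if (pos == std::string::npos) return {text.end(), text.end()};
    return {text.begin() + pos, text.begin() + pos + needle.size()};
}
```

The caller decides whether to pay for a copy, and into which container: copy_out<std::string>(m) or copy_out<std::vector<char> >(m) both work on the same view.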

> 2. Finder concept maybe useful in implementing different string search
> algorithm. BUT as implementation detail or generic case solution. I do not
> see any reason for interface that assumes explicit specification of finders
> provided by the library. There should be a generic algorithm/iterator that
> expect User defined finder, but there should be explicit
> algorithms/iterators dedicated for each type of search. I do that for
> algorithms, why not for iterators?

They are there, actually, for most of the algorithms.
find/split_iterator is quite new. Originally it was encapsulated by the
find_all and split algorithms (they are still there, see split.hpp). During the review it was
pointed out that a pure iterator-based interface is better for split operations.
Therefore I have refactored the implementation so it can be used directly.

> > find_iterator is also based on generic iterator concept, however, the
> > iterator is the only template parameter, so you can easily specialize for string.
> >

> Actually when I work with strings I do not want to know about iterators at
> all.

You don't have to, except for the one typedef.

> > Simple usage looks like this:
> In fact from your sources it seems that I have to write like this
> boost::split_iterator<std::string::iterator> it( str.begin(), str.end(),
> boost::token_finder(boost::is_any_of(";,"));
> IMO it's way to verbose, while still missing some of the functionality
> provided by the tokenizer library.

Actually, you have missed the important constructor. You can write:

boost::split_iterator<std::string::iterator> it( str, boost::token_finder(boost::is_any_of(";,")) );

str will be expanded automatically. You can also use char*, wchar_t* or vector<char>.

> My choice:
> boost::token_iterator it( str, boost::dropped_delimiters = ";," );

It is very easy to implement a forwarder to this kind of syntax. As I said, find_iterator
is relatively new, so I have not extended the interface to its full extent yet.
I will keep this in mind. Thanks for the idea.
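
A rough sketch of what such a forwarder could look like, in plain standard C++ so that it stands alone. Both split_generic and split_on are hypothetical names invented for this example and are not part of the library. The generic core takes an iterator pair and a predicate; the forwarder fills both in from a range and a delimiter string.

```cpp
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Generic core (hypothetical): splits [first, last) wherever is_delim is true.
// Adjacent delimiters produce empty tokens.
template <typename It, typename Pred>
std::vector<std::string> split_generic(It first, It last, Pred is_delim) {
    std::vector<std::string> out;
    It tok = first;
    for (It cur = first; cur != last; ++cur) {
        if (is_delim(*cur)) {
            out.push_back(std::string(tok, cur));
            tok = cur;
            ++tok;  // next token starts just past the delimiter
        }
    }
    out.push_back(std::string(tok, last));  // trailing token
    return out;
}

// Forwarder (hypothetical): hides the iterator pair and the predicate.
// Works for any range of char with begin()/end(), e.g. std::string
// or std::vector<char>.
template <typename Range>
std::vector<std::string> split_on(const Range& r, const std::string& delims) {
    return split_generic(std::begin(r), std::end(r),
                         [&delims](char c) {
                             return delims.find(c) != std::string::npos;
                         });
}
```

With this in place, split_on(str, ";,") is all the caller writes, while the generic form stays available for user-defined predicates.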

> This looks shorter and clearer IMO.
> Second concern about your solution, that makes it in a sense even worse than
> the one provided by boost::tokenizer, is that you actually does not specify
> type of the Finder in the iterator specification. The price is that
> eventually you have to, this way or another, pay with runtime overhead. This
> would be unacceptable to me.

This is a design point. The first implementation had a templated specification of the finder.
However, using such a class was very inconvenient due to the complicated specification.
Therefore it was changed to the current state. The runtime overhead implied
by the current implementation is rather small. It actually only adds one indirection in
the increment operation, so it is negligible.
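
The trade-off can be sketched in plain standard C++. This is an illustration only: token_cursor, any_of_finder and collect are hypothetical names, and the use of std::function to hide the finder type is an assumption of this example, not a description of how the library does it. The iterator's type does not name the finder, and the price is one indirect call per increment.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

typedef std::string::const_iterator str_it;
typedef std::pair<str_it, str_it> match;

// Type-erased finder: given [begin, end), returns the next delimiter as an
// iterator pair, or (end, end) when there is none.
typedef std::function<match(str_it, str_it)> erased_finder;

// Hypothetical token iterator. Its type does not mention the finder type,
// so declaring one stays short. `s` must outlive the cursor.
class token_cursor {
public:
    token_cursor(const std::string& s, erased_finder f)
        : find_(std::move(f)), next_(s.begin()), end_(s.end()),
          tok_begin_(s.begin()), tok_end_(s.begin()),
          exhausted_(false), done_(false) {
        advance();  // position on the first token
    }

    bool done() const { return done_; }
    std::string token() const { return std::string(tok_begin_, tok_end_); }

    void advance() {
        if (exhausted_) { done_ = true; return; }
        match delim = find_(next_, end_);  // the one indirect call per step
        tok_begin_ = next_;
        tok_end_ = delim.first;
        if (delim.first == end_) exhausted_ = true;  // this is the last token
        else next_ = delim.second;
    }

private:
    erased_finder find_;
    str_it next_, end_, tok_begin_, tok_end_;
    bool exhausted_, done_;
};

// Finder that treats any character in `delims` as a one-character delimiter.
inline erased_finder any_of_finder(std::string delims) {
    return [delims](str_it b, str_it e) -> match {
        for (; b != e; ++b)
            if (delims.find(*b) != std::string::npos)
                return match(b, b + 1);
        return match(e, e);
    };
}

// Walks the cursor to the end and gathers the tokens.
inline std::vector<std::string> collect(token_cursor c) {
    std::vector<std::string> out;
    for (; !c.done(); c.advance()) out.push_back(c.token());
    return out;
}
```

A templated finder parameter would let the compiler inline the search, but every declaration of the iterator would then have to spell out the full finder type.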

> There are couple of things that your solution provide in addition to what my
> and boost::tokenizer solution provide. Particularly it's an ability to
> specify several addition delimitation policies. I thought about this. But
> it isn't on top of my priorities (mostly because it is comparatively rarely
> needed). In any case this generic solution should not interfere with most
> convenient one used in majority of the cases.

The rationale for find_iterator is quite simple. There is a well-defined Finder concept
in the lib and there is also a bunch of finders already implemented there. find_iterator
is a natural candidate to use this facility.

I'll be grateful if you can show me where you think the string_algo lib is
lacking in usability. Preferably, also provide code snippets showing how it is and how it
should look. Both sides can benefit from improvements.


