From: jbandela_at_[hidden]
Date: 2000-08-25 09:45:42
Thanks for your post. Having thought about it some more, I agree
with you about using iterators. You can find an implementation that
uses iterators in TokenIterator\TokenIteratorUpdate.zip.
The new version is a forward iterator unless the underlying iterator
is an input iterator, in which case it is also an input iterator.
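
As a rough illustration of that category rule (this is only a sketch, not
the code in the zip file; the name token_iterator_category is hypothetical),
the choice could be expressed as a small trait:

```cpp
#include <istream>
#include <iterator>
#include <string>
#include <type_traits>

// Hypothetical trait, not taken from the posted implementation.
// It reports the category a token iterator would expose for a given
// underlying iterator type: input stays input, anything else is
// treated as forward. (Assumption: output iterators are never passed in.)
template <class Iter>
struct token_iterator_category {
    typedef typename std::iterator_traits<Iter>::iterator_category base;
    typedef typename std::conditional<
        std::is_same<base, std::input_iterator_tag>::value,
        std::input_iterator_tag,
        std::forward_iterator_tag>::type type;
};
```

With this, a token iterator over std::istream_iterator<char> would report
std::input_iterator_tag, while one over std::string::const_iterator would
report std::forward_iterator_tag.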
In addition to the advantages of iterators that you mentioned, I found
another very important one: using iterators allows chaining of various
parsing iterators. For example, if you were writing a calculator, you
could have the top-level iterator implemented in terms of a lower-level
iterator that returns strings and punctuation, thus simplifying the code.
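
To make the chaining idea concrete, here is a minimal sketch under my own
assumptions (lexeme_iterator and evaluate are hypothetical names, not what
is in the zip file). The low-level iterator splits any character range into
words/numbers and single punctuation characters; the top level is written
purely in terms of it and only handles + and -:

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical low-level iterator: yields runs of alphanumerics or single
// punctuation characters from [first, last), skipping whitespace.
template <class Iter>
class lexeme_iterator {
public:
    lexeme_iterator(Iter first, Iter last)
        : cur_(first), end_(last), done_(false) { advance(); }
    // End-of-sequence iterator.
    explicit lexeme_iterator(Iter last)
        : cur_(last), end_(last), done_(true) {}

    const std::string& operator*() const { return tok_; }
    lexeme_iterator& operator++() { advance(); return *this; }
    bool operator==(const lexeme_iterator& o) const {
        return done_ == o.done_ && (done_ || cur_ == o.cur_);
    }
    bool operator!=(const lexeme_iterator& o) const { return !(*this == o); }

private:
    static bool is_space(char c) {
        return std::isspace(static_cast<unsigned char>(c)) != 0;
    }
    static bool is_alnum(char c) {
        return std::isalnum(static_cast<unsigned char>(c)) != 0;
    }
    // Reads the next lexeme into tok_, or marks the iterator as finished.
    void advance() {
        tok_.clear();
        while (cur_ != end_ && is_space(*cur_)) ++cur_;
        if (cur_ == end_) { done_ = true; return; }
        if (is_alnum(*cur_)) {
            while (cur_ != end_ && is_alnum(*cur_)) { tok_ += *cur_; ++cur_; }
        } else {
            tok_ += *cur_;
            ++cur_;
        }
    }

    Iter cur_, end_;
    std::string tok_;
    bool done_;
};

// Hypothetical top level: evaluates "number (+|-) number ..." left to right.
// It never looks at characters, only at lexemes from the iterator above.
// Non-numeric words make std::stol throw; a real calculator would do more.
template <class Iter>
long evaluate(Iter first, Iter last) {
    lexeme_iterator<Iter> it(first, last), end(last);
    long result = 0;
    int sign = 1;
    for (; it != end; ++it) {
        const std::string& t = *it;
        if (t == "+") sign = 1;
        else if (t == "-") sign = -1;
        else { result += sign * std::stol(t); sign = 1; }
    }
    return result;
}

// Usage: the same code works for a std::string or a raw char buffer.
inline void calculator_example() {
    std::string s = "12 + 30 - 2";
    assert(evaluate(s.begin(), s.end()) == 40);
}
```

Because the input is just a pair of iterators, nothing here requires copying
a char buffer into a std::string first.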
Thanks again for your input. I would be most interested in what you
have to say about the updated version.
--- In boost_at_[hidden], "Aleksey Gurtovoy" <alexy_at_m...> wrote:
>
> ----- Original Message -----
> From: <jbandela_at_u...>
> To: <boost_at_[hidden]>
> Sent: Wednesday, August 23, 2000 9:32 AM
> Subject: [boost] Re: Interest in a token iterator???
>
>
> > I had thought about doing it. However, it seems the main benefit
> > would be when using istream_iterators.
>
> The main benefit is that with such an interface you could use a pair of
> iterators of *an arbitrary type* to specify the input for your token
> iterator; you would not be forced to use 'std::string' or any other
> container-like object just to specify an input sequence to be tokenized.
> IMO, a pair of iterators is a much more natural way to do it, and it's
> more generic.
>
> > Other than that, a char buffer
> > is easily converted to a string.
>
> Sometimes the overhead of that conversion (copying) may be unacceptable,
> or the data may be just too big to place in memory (and you don't want to
> process it in chunks).
>
> > The problem with istream_iterators
> > is that they are input iterators. If you have two input iterators
> > referencing the same sequence, modifying one modifies the other.
> > Thus, if the token iterator were implemented in terms of an input
> > iterator, having independent copies of the iterator that reference
> > the same character sequence would be impossible. This would mean that
> > you could not pass token iterators into algorithms such as copy or
> > find without having your original token iterators modified.
>
> Strictly speaking, that's not true ;) Your statement is correct only for
> 'token_iterator< istream_iterator<...> >', or whatever other
> 'token_iterator<>' parameterized by an iterator type which satisfies only
> input iterator requirements. But in that case you are getting what you've
> asked for.
>
> I think what you need is a proper definition of the concept, which states
> that the iterator category of the token iterator depends on the iterator
> category of iterators used to iterate through original input sequence -
> e.g. 'token_iterator<std::string::const_iterator>::iterator_category' can
> be 'std::forward_iterator_tag'.
>
> > This
> > could also seriously affect algorithms that depend on lookahead
> > features (e.g. = vs == in C/C++).
>
> If you accept my definition above, it will not (or it will only if
> you want it to :).
>
> > Finally, there is the ownership
> > issue. If the sequence is modified or deleted, the token iterators
> > could become corrupt.
>
> I don't think that's a problem. After all, if you modify or delete a
> vector, all its iterators become invalid too :)
>
> > Based on all this, I decided to use strings
> > that the token iterator owned. In addition, since the string is never
> > modified, the string is reference counted and shared between all
> > copies of a token iterator. This makes copying them pretty cheap.
> >
>
> Sorry, I don't like this. First, you create a copy of the StringType
> parameter on the heap, which may be quite expensive. Second, using
> reference counting may be a show stopper for possible users of the class
> who have to deal with multi-threading (it's not thread-safe, is it? ;).
> Third, IMHO, the problems you were trying to solve might not really be
> problems, so we can get rid of all these complications if you agree with
> my points.
>
> Does it make any sense to you?
>
> --Aleksey
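
For reference, the input-iterator point discussed in the quoted message can
be seen in a small sketch (the helper names are hypothetical, and the
istream_iterator result is what I would expect from mainstream standard
library implementations that read a value on construction). Two copies of a
std::istream_iterator share one stream, so advancing one changes what the
other sees next; copies of a std::string::const_iterator stay independent:

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>

// Returns the word a copy sees after both iterators are advanced once.
// Both iterators pull from the same stream, so the copy skips a word.
inline std::string word_seen_by_copy(const std::string& text) {
    std::istringstream in(text);
    std::istream_iterator<std::string> it(in);     // holds the first word
    std::istream_iterator<std::string> copy = it;  // copies that cached word
    ++it;    // consumes the second word from the shared stream
    ++copy;  // consumes the third word, not the second
    return *copy;
}

// Same experiment with forward-capable iterators: the copy is independent.
inline char char_seen_by_copy(const std::string& text) {
    std::string::const_iterator it = text.begin();
    std::string::const_iterator copy = it;
    ++it;
    ++copy;
    return *copy;
}

// Usage example.
inline void aliasing_example() {
    assert(word_seen_by_copy("a b c") == "c");
    assert(char_seen_by_copy("abc") == 'b');
}
```

This is why a token iterator built on an input iterator can itself only be
an input iterator, while one built on string iterators can offer
independent copies.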
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk