Boost logo

Boost :

Subject: Re: [boost] [range] Is it conceivable to have it extended?
From: Vladimir Batov (Vladimir.Batov_at_[hidden])
Date: 2014-07-13 21:01:02


On 07/14/2014 10:00 AM, Andrey Semashev wrote:
> On Monday 14 July 2014 07:10:23 Vladimir Batov wrote:
>> boost::iterator_range is a great concept/class but it does not seem to
>> play efficiently with certain input sequences and seem to be too strong
>> a requirement for certain algorithms. I have an old-fashioned C string
>> in mind (I am sure there are other examples) and traversing algorithms.
>>
>> When I tried deploying boost::iterator_range for string-to-int
>> conversion purposes, I quickly realized that I was traversing the string
>> twice -- first to find the end and second to do the work.
> Not sure why would you need to find the end - it can be found in the parsing
> process (that is assuming the integer may not be the whole string; otherwise
> the end iterator is already provided by the range).

"Otherwise" means I have to have two implementations for ranges (where I
check it != end) and for sequences (like C strings) where I do not have
"end" and, therefore, check "for (char* p = beg; *p; ++p)".
Alternatively, to have one implementation, we usually convert a C string
to a range with boost::as_literal() and then call the range-based
function. boost::as_literal() does the first traversal to fing the end.

>> What I think is missing is the "parent" concept of a "sequence". Say,
>>
>> template<iterator_type, sentry_type>
>> struct sequence
>> {
>> iterator_type begin();
>> sentry_type sentry();
>> };
>>
>> Then "range" would implement and refine the "sequence" concept by
>>
>> template<iterator_type>
>> struct range : sequence<iterator_type, iterator_type>
>> {
>> iterator_type begin();
>> iterator_type end();
>> sentry_type sentry() { return end; }
>> }
>>
>> I.e., as "range" implements/is the "sequence", the boost::iterator_range
>> will have one additional "useless" sentry() method.
>>
>> That way all the old code (using and/or needing end()) would still work
>> but traversing algorithms might gradually adapt to only require
>> "sequences" instead of "ranges":
>>
>> template<typename Iterator, typename Sentry, typename Function>
>> Function
>> for_each(Iterator beg, Sentry sentry, Function fun)
>> {
>> for (; beg != sentry; ++beg) fun(*beg);
>> return fun;
>> }
>>
>> That means that now all "open-ended" (with no known "end") "ranges"
>> (they are not even "ranges" in the strict meaning of the word but rather
>> "sequences") can be handled efficiently.
> I don't see much difference with the current ranges, where end iterators are
> singular. Can you provide a strong motivating example for the sentry() method?

The difference is that "sentry" does not have to be an iterator... and
it is not in the case of "sequences". Below is the "extended" range
(which is a "sequence") specialization for a C string:

     template<typename T>
     struct range<T*, typename enable_if<cnv::is_char<T>, void>::type>
     {
         typedef typename remove_const<T>::type char_type;
         typedef T* iterator;

         struct sentry_type
         {
             friend bool operator!=(iterator it, sentry_type const&) {
return !!*it; }
         };

         range (T* b, T* e =0) : begin_(b), end_(e) {}

         iterator const begin () const { return begin_; }
         iterator const end () const { return end_ ? end_ : (end_ =
begin_ + strlen(base_)); }
         sentry_type sentry () const { return sentry_type(); }

         private:

         iterator begin_;
         mutable iterator end_;
     };

That allows traversals without finding the end (i.e. extra traversal to
find "end").

Namely, it still works with std::for_each(). However, is now can work
more efficiently with the "for_each" I showed above, i.e.

     char const* str = ...;
     range<char const*> r(str);

     for_each(e.begin(), r.sentry(), fun);

without finding the "end" first.

V.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk