Boost logo

Boost Users :

Subject: Re: [Boost-users] [Range] Strange!
From: Neil Groves (neil_at_[hidden])
Date: 2014-08-29 13:08:26


> >
> > I see where you are coming from, I might even agree, but the impact upon
> > standard algorithms would be horrific. I don't find O(N) for
> > std::distance of a counting iterator upon integers to be acceptable.
>
> Are people passing counting_iterators to std::distance? Yes, it's a
> problem, but taking too long to get the right answer doesn't seem as bad
> as the random garbage the current behavior is giving.
>

Yes.

>
> > Clearly I can make the Boost.Range boost::distance O(1) by altering the
> > implementation to not forward to the standard implementation. This would
> > be required for almost all of the standard algorithms, hence I'd have to
> > rewrite them all.
>
> Yes. IMO, Boost.Algorithm should have versions of ALL the standard
> library algorithm, rewritten to work with the new-style iterator
> categories. And yes, it's a lot of work, but not as much as you might
> think. I should know, I'm just about finished my own complete
> reimplementation of the algorithms for my range work.
>

This is certainly something that can be done. I find it disappointing that
it will be a reimplementation not simply because it is additional work, but
because there doesn't appear to be progress in making the Standard better
in this respect. It would clearly be a better end result if we could make
the existing standard algorithms work in this manner. I also respect that
we might want to do something in the library in the interim to get results
before 2018!

>
> > Even with this effort the result would be sub-optimal
> > since there are many internal details within standard algorithm
> > implementations to optimise for various standard containers e.g. __niter.
>
> Not a big problem in practice, IMO.
>

It is for my use-cases. The __niter iterator extraction creates very
substantial performance gains. I appreciate that not everything is
performance-critical, but a lot of the work I do is. If it adds overhead I
often simply can't use it. The benefits when a string iterator is extracted
to a pointer yields 2x performance gains in several algorithms in my tests,
because it often selects contiguous pointer overloads for sub-algorithms.

>
> > I agree that it is a bug. The defect is simple, but the best solution
> > isn't. Your suggestion is valid, but it falls short of being an
> > acceptable solution. I was hoping to find a cunning way to keep the
> > counting_iterator random access while solving the lifetime issue. I've
> > not spent enough effort to convince myself that this is not possible yet.
>
> You prefer undefined behavior to a perf bug? Undefined behavior is
> exploitable. It's a security issue.
>

I am clearly not stating that we should do nothing. I wanted to consider
alternatives to immediately conceding that the demotion to input iterator
category was the best we could achieve. I understand the ramifications of
the defect. Having thought about the various trade-offs I am starting to
believe it would be worth considering returning by const value and having
the iterator category map to std::random_access_iterator_category. The
return by const value rather than value avoids an issue where a lifetime
issue if an algorithm uses auto& for the dereferenced result within the
calling function. While this would be non-compliant, it seems that the
cases where this would actually cause a problem are rather theoretical. I
have noticed that there is an int_iterator hidden in boost/iterator/detail
that works in exactly this proposed fashion. It would be interesting to see
how well this has worked.

The demotion to input iterator does not merely create a performance bug.
Standard algorithms (and user algorithms that use the standard iterator
categories) that require more than an input iterator will no longer compile
with the resultant counting_iterator. Therefore code will not only break,
there won't be an obvious adjustment to make to repair the client code.

Again, I'm not arguing that your proposal is not necessarily the optimal
one. I just feel that the ramifications of the change are sufficiently
abhorrent that we should ensure we have considered as many alternatives as
we can muster.

>
> > Boost.Range uses the Boost.Iterator traversal idioms extensively. The
> > Boost.Range Concepts are defined to avoid the reference type conflation
> > with traversal for example. Sadly, this doesn't help when the result is
> > fed into a Boost.Range algorithm that forwards to a standard algorithm.
> > The state of play is that Boost.Range prefers the Boost.Iterator idioms
> > but does not reimplement the standard parts and hence falls foul of the
> > standard issues.
>
> Great! Then only the algorithms need to be reimplemented. ;-) And
> actually, only those that require anything more than input iterators.
> That should cut the work load in half, or better.
>

I think an optimal implementation requires the reimplementation of each
algorithm that could attain performance benefit from traversal tag
dispatch? I don't think the difference matters much. I also believe there
are other significant potential advantages to defining a core set of
iteration algorithm primitives, and then implementing the remaining
algorithms in terms of these. It would allow easier use of segmented
iterators etc. without having to rewrite each individual algorithm. I am
therefore leaning toward writing the algorithms anyhow. I'll avoid
hijacking this thread further on this subject, and contact you directly to
optimise how we proceed avoiding duplication as much as possible.

Regards,
Neil



Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net