Boost logo

Boost :

Subject: Re: [boost] lifetime of ranges vs. iterators
From: David Abrahams (dave_at_[hidden])
Date: 2008-09-03 13:09:04

on Wed Sep 03 2008, Arno Schödl <> wrote:

>> > rng | filtered( funcA )
>> > | filtered( funcB )
>> > | filtered( funcC )
>> > | filtered( funcD )
>> > | filtered( funcE )
>> 1. _Do_ they write that?
> Why not? They did for transform_range, and the notation is getting easier...

Yes, I never contested that they *could* write it.

I'm still asking you the same question: what real life computations
demand a solution to this problem?

>> 2. If they do that, I believe they are still paying an enormous
>> unnecessary runtime penalty even with your optimization because of
>> the endpoint checks in each layer of the composite iterator or
>> range. The only way IIUC to avoid such a penalty is to compress the
>> filter adapters into one, essentially by using expression
>> templates.
> See answer to 4.
>> 3. I am not yet convinced that you are suffering a significant
>> performance penalty due to the wasted space in real applications.
>> Have you measured it? Unless you're actually *storing* lots of
>> these stacked iterators or passing them around by value in your
>> inner loops, it seems unlikely to make much of a difference.
> Giovanni has seen the quadratic stack expansions.

Sure, but that does not necessarily amount to a significant performance

> If you are worried about performance of pointers, why not worry about
> the performance of wrappers around pointers?

I actually am a bit worried about it. However, as I am trying to tell
you, I don't believe that what we're talking about here addresses the
whole underlying problem. You're talking about eliminating redundancy
across pairs of iterators, but AFAICT not the redundancy through the
vertical "stack."

>> 4. I can understand wanting to improve these components so that they
>> can be efficient under more conditions, and avoiding the redundant
>> storage would undoubtedly be a real improvement. However, I think
>> there's a deeper inefficiency in stacking all those filter iterators
>> together, and it's neither a range library's job, nor I think is it
>> possible in this case, to protect users from "blissful unawareness"
>> of performance implications.
> The design we came up with addresses the general problem of stacking
> range_adaptors that must store their end, like
> difference_range/filter_range/union_range/[other algorithms that I cannot think
> of now], in any combination.

OK, let me try to spell this out.

Let's suppose for a moment that you build a
filter_iterator<strided_iterator<int*,5>, F>. A strided_iterator is one
that visits every Nth element. Unless you know that there are an even
multiple of N elements, it needs to store an endpoint because otherwise
the iterator might stride right past the last stored location.

  Thus sizeof(strided_iterator<int*,5>) == 2*sizeof(int*)

and its increment operator looks like:

  if (pos <= end - N)
      pos += N;
      pos = end;

Now as you know, a filter_iterator also needs to store an endpoint. So
assuming F is empty and EBO is used to eliminate its storage, then

  sizeof(filter_iterator<strided_iterator<int*,5>, F>)
  == 2*sizeof(strided_iterator<int*,5>)
  == 2*2*sizeof(int*)

and the filter iterator's increment looks like:

  do {
  while (pos2 != end2 && !f(*pos2));

Now if we expand the incrementation of pos2, it's

  do {
      if (pos <= end - N)
          pos += N;
          pos = end;
  while (pos2 != end2 && !f(*pos2));

Whereas it should only be:

  do {
      if (pos <= end - N)
          pos += N;
          pos = end;
  while (!f(*pos2));

No approach that only addresses the space wasted across pairs of
iterators can help with these issues, so I think a more fundamental look
at the problem is called for. And again, I think the problem is very
much like what happens with segmented iterators so its worth looking
there for inspiration. It may be that, as with segmented iterators, a
more general form for the algorithms is called for.

> If you have thought up expression templates that cover all of these in
> a unified fashion with better performance than stacked iterators, I'd
> be very happy and try to implement that instead.

Again, I'm not ready for answers yet. I'm only just now discovering the

>> I'm not yet convinced we're solving a real problem here,
> I have enough of that stuff in my code that I am convinced we do.

That fact alone does not make it a real problem in my book.

>> and if it _is_ real, I am not quite convinced the general approach of
>> squeezing out redundant storage would address all of the problem.
> It does not, you already mentioned the superfluous end checks. As I
> said, if you have the whole solution, let me know.

I'm not there yet, but I know enough about the problem now that I
wouldn't "buy" a partial solution.

Dave Abrahams
BoostPro Computing

Boost list run by bdawes at, gregod at, cpdaniel at, john at