|
Boost : |
Subject: Re: [boost] lifetime of ranges vs. iterators
From: David Abrahams (dave_at_[hidden])
Date: 2008-09-03 13:09:04
on Wed Sep 03 2008, Arno Schödl <aschoedl-AT-think-cell.com> wrote:
>> > rng | filtered( funcA )
>> > | filtered( funcB )
>> > | filtered( funcC )
>> > | filtered( funcD )
>> > | filtered( funcE )
>
>> 1. _Do_ they write that?
>
> Why not? They did for transform_range, and the notation is getting easier...
Yes, I never contested that they *could* write it.
I'm still asking you the same question: what real life computations
demand a solution to this problem?
>> 2. If they do that, I believe they are still paying an enormous
>> unnecessary runtime penalty even with your optimization because of
>> the endpoint checks in each layer of the composite iterator or
>> range. The only way IIUC to avoid such a penalty is to compress the
>> filter adapters into one, essentially by using expression
>> templates.
>
> See answer to 4.
>
>> 3. I am not yet convinced that you are suffering a significant
>> performance penalty due to the wasted space in real applications.
>> Have you measured it? Unless you're actually *storing* lots of
>> these stacked iterators or passing them around by value in your
>> inner loops, it seems unlikely to make much of a difference.
>
> Giovanni has seen the quadratic stack expansions.
Sure, but that does not necessarily amount to a significant performance
penalty.
> If you are worried about performance of pointers, why not worry about
> the performance of wrappers around pointers?
I actually am a bit worried about it. However, as I am trying to tell
you, I don't believe that what we're talking about here addresses the
whole underlying problem. You're talking about eliminating redundancy
across pairs of iterators, but AFAICT not the redundancy through the
vertical "stack."
>> 4. I can understand wanting to improve these components so that they
>> can be efficient under more conditions, and avoiding the redundant
>> storage would undoubtedly be a real improvement. However, I think
>> there's a deeper inefficiency in stacking all those filter iterators
>> together, and it's neither a range library's job, nor I think is it
>> possible in this case, to protect users from "blissful unawareness"
>> of performance implications.
>
> The design we came up with addresses the general problem of stacking
> range_adaptors that must store their end, like
> difference_range/filter_range/union_range/[other algorithms that I cannot think
> of now], in any combination.
OK, let me try to spell this out.
Let's suppose for a moment that you build a
filter_iterator<strided_iterator<int*,5>, F>. A strided_iterator is one
that visits every Nth element. Unless you know that there are an even
multiple of N elements, it needs to store an endpoint because otherwise
the iterator might stride right past the last stored location.
Thus sizeof(strided_iterator<int*,5>) == 2*sizeof(int*)
and its increment operator looks like:
if (pos <= end - N)
pos += N;
else
pos = end;
Now as you know, a filter_iterator also needs to store an endpoint. So
assuming F is empty and EBO is used to eliminate its storage, then
sizeof(filter_iterator<strided_iterator<int*,5>, F>)
== 2*sizeof(strided_iterator<int*,5>)
== 2*2*sizeof(int*)
and the filter iterator's increment looks like:
do {
++pos2;
}
while (pos2 != end2 && !f(*pos2));
Now if we expand the incrementation of pos2, it's
do {
if (pos <= end - N)
pos += N;
else
pos = end;
}
while (pos2 != end2 && !f(*pos2));
Whereas it should only be:
do {
if (pos <= end - N)
pos += N;
else
pos = end;
break;
}
while (!f(*pos2));
No approach that only addresses the space wasted across pairs of
iterators can help with these issues, so I think a more fundamental look
at the problem is called for. And again, I think the problem is very
much like what happens with segmented iterators so its worth looking
there for inspiration. It may be that, as with segmented iterators, a
more general form for the algorithms is called for.
> If you have thought up expression templates that cover all of these in
> a unified fashion with better performance than stacked iterators, I'd
> be very happy and try to implement that instead.
Again, I'm not ready for answers yet. I'm only just now discovering the
questions.
>> I'm not yet convinced we're solving a real problem here,
>
> I have enough of that stuff in my code that I am convinced we do.
That fact alone does not make it a real problem in my book.
>> and if it _is_ real, I am not quite convinced the general approach of
>> squeezing out redundant storage would address all of the problem.
>
> It does not, you already mentioned the superfluous end checks. As I
> said, if you have the whole solution, let me know.
I'm not there yet, but I know enough about the problem now that I
wouldn't "buy" a partial solution.
-- Dave Abrahams BoostPro Computing http://www.boostpro.com
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk