Subject: Re: [boost] lifetime of ranges vs. iterators
From: David Abrahams (dave_at_[hidden])
Date: 2008-09-03 22:08:54


on Wed Sep 03 2008, "Giovanni Piero Deretta" <gpderetta-AT-gmail.com> wrote:

>>> But iterators like filter_iterator that need to know iteration
>>> boundaries can store a range instead of an iterator, so they can
>>> benefit from the compression.
>>
>> Oh, I totally missed that; sorry! Let's see...
>>
>> strided_iterator<int*,5> stores range<int*>
>>
>> sizeof(range<int*>) == 2*sizeof(int*)
>>
>> filter_iterator<strided_iterator<int*,5> > stores
>> range<strided_iterator<int*,5> >
>>
>> range<strided_iterator<int*,5> > is represented as
>>
>> unique_data<strided_iterator<int*,5> >::type start;
>> unique_data<strided_iterator<int*,5> >::type finish;
>> common_data<strided_iterator<int*,5> >::type common;
>>
>> unique_data<strided_iterator<int*,5> >::type == int*
>> common_data<strided_iterator<int*,5> >::type == int*
>>
>> so its size is 3*sizeof(int*)
>>
>> ?? Did I get that right?
>>
>>
>> If so, it seems like there's still redundant storage. finish and common
>> would be representing essentially the same information. If I were to
>> write a filter_strided_iterator adapter and apply it to int*, it would
>> have the size of 2 int*s.
>
> What information is contained in common that is not also contained in
> finish (I've never implemented a strided iterator; I'm trying to
> understand)?

My point is that there is essentially the same information in
common and finish, but that you can't eliminate it because the
abstractions don't allow it. A strided_iterator implementation and test
are attached.
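A minimal sketch of the shape under discussion, assuming the strided
iterator simply stores a begin/end range (illustrative only, not the
attached code):

  #include <cstddef>
  #include <iterator>

  // Illustrative begin/end pair: sizeof == 2 * sizeof(Iter).
  template <class Iter>
  struct range
  {
      Iter begin, end;
  };

  // A strided iterator keeps the whole range so that a stride past the
  // last element still stops cleanly at end.
  template <class Iter, std::size_t Stride>
  struct strided_iterator
  {
      range<Iter> r;   // for Iter == int*, 2 * sizeof(int*)

      typename std::iterator_traits<Iter>::reference
      operator*() const { return *r.begin; }

      strided_iterator& operator++()
      {
          for (std::size_t n = 0; n != Stride && r.begin != r.end; ++n)
              ++r.begin;
          return *this;
      }

      friend bool operator==(strided_iterator const& a,
                             strided_iterator const& b)
      { return a.r.begin == b.r.begin; }
  };

  // Under the unique/common decomposition, range<strided_iterator<int*,5> >
  // would hold two unique 'begin' pointers plus one shared 'end' -- the
  // 3 * sizeof(int*) counted above -- even though 'finish' and 'common'
  // carry essentially the same information.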

> And why can you do without it in filter_strided_iterator?

Well, let me try to write that one, and we'll see. OK, done and
attached. Please keep in mind:

1. This was an unbelievably hasty job; surely I made a mistake somewhere.
2. No effort is made to optimize away storage for the predicate.
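For comparison, a rough sketch of what such a fused adapter over int*
could look like (again illustrative, not the attachment; per point 2,
the predicate is stored by value with no attempt to optimize it away):

  #include <cstddef>

  // Fused filter + stride over a plain pointer range: only (cur, end)
  // -- 2 * sizeof(int*) -- plus the predicate, because the stride and
  // the filter share the same notion of "end".
  template <class T, std::size_t Stride, class Pred>
  struct filter_strided_iterator
  {
      T*   cur;
      T*   end;
      Pred pred;

      filter_strided_iterator(T* first, T* last, Pred p)
        : cur(first), end(last), pred(p)
      { satisfy(); }   // start at the first element passing the predicate

      T& operator*() const { return *cur; }

      filter_strided_iterator& operator++()
      {
          stride();
          satisfy();
          return *this;
      }

      friend bool operator==(filter_strided_iterator const& a,
                             filter_strided_iterator const& b)
      { return a.cur == b.cur; }

   private:
      void stride()    // advance by Stride, but never past end
      {
          for (std::size_t n = 0; n != Stride && cur != end; ++n)
              ++cur;
      }
      void satisfy()   // keep striding until pred accepts or we hit end
      {
          while (cur != end && !pred(*cur))
              stride();
      }
  };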

>>>>> Currently the iterators I use are often over 400 bytes, which is a bit
>>>>> too big :), thus the need to squeeze their size down as much as
>>>>> possible. A hundred bytes could be enough [1].
>>>>
>>>> Again, have you measured?
>>>>
>>>
>>> If you mean the size, yes, I've measured it. If you mean my assertion
>>> that a 100-byte SBO may be optimal, then I won't let facts interfere
>>> with my opinions :)
>>
>> I meant measured that a type-erased 100 byte iterator improves
>> performance over a bald 400-byte iterator.
>
> Eh? I never claimed it did! Barring heavy optimization of indirect
> calls, an any_iterator will always be much slower than a plain
> iterator. Any_iterator is not about performance, but is useful as a
> compilation firewall and for ease of use.
>
> You want to make the buffer as small as possible, so the smaller your
> iterators are, the more likely they are to fit in the buffer.

Sure, I agree.
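Just to make that trade-off concrete, a simplified sketch of the
small-buffer decision (not any_iterator's real layout; BufferSize and
the member names are made up, and alignment and destruction are
ignored):

  #include <cstddef>
  #include <new>

  // The smaller the wrapped iterator, the more likely it lands in the
  // inline buffer instead of costing a heap allocation.
  template <std::size_t BufferSize>
  struct erased_storage
  {
      union
      {
          char  buffer[BufferSize];   // in-place home for small iterators
          void* heap;                 // fallback for big ones
      };
      bool on_heap;

      template <class Iter>
      void store(Iter const& it)
      {
          if (sizeof(Iter) <= BufferSize)
          {
              new (buffer) Iter(it);  // fits: no allocation
              on_heap = false;
          }
          else
          {
              heap = new Iter(it);    // e.g. the 400-byte case above
              on_heap = true;
          }
      }
  };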

>>> Now, honestly, I'm trying to rationalize the fear that I had for a
>>> while that complex iterator adaptor sequences really can grow large.
>>> For example, the current filter_iterator doubles its size at every
>>> stacking. If you couple it with a relatively heavy predicate, the
>>> stack usage might start to really matter. In heavily threaded
>>> applications, stack usage is a somewhat limited resource.
>>
>> And I'm trying to say that the point of GP is to raise coding to the
>> highest possible level of abstraction *without loss of efficiency*. If
>> we can't achieve "relative efficiency" as described in
>> http://www.stepanovpapers.com/BYTE_com.htm, I don't think we're doing
>> the job right. So far, unless I'm mistaken, we're still missing the
>> mark by a fairly wide margin.
>
> Hum, I've entered the discussion with the explicit aim of reducing
> the size of iterator adapters and ranges while trying not to make
> them slower than they are right now. Making them faster was not
> the aim.

Space efficiency counts as much as speed efficiency. I don't think
we're really achieving either.
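For concreteness, the doubling mentioned above comes from the adaptor
carrying the wrapped iterator plus a copy of its end; roughly (a crude
model, not the actual filter_iterator source):

  // Rough model of the current filter_iterator's state.
  template <class Pred, class Iter>
  struct filter_it
  {
      Iter cur, end;   // 2 * sizeof(Iter)
      Pred pred;       // plus the predicate
  };

  // Ignoring the predicate, stacking over int* costs roughly:
  //   filter_it<P, int*>                 ~ 2 pointers
  //   filter_it<P, filter_it<P, int*> >  ~ 4 pointers
  //   three levels deep                  ~ 8 pointers, and so on.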

> (for the record, I think too that the exception approach is
> a bit of a dead end).
>
> I think your point is: do not bother with size expansion, because by
> the time the size is big enough to be a problem, performance will
> already have reached the floor.

More like, "the overall performance cost associated with the wasted
space may well be swamped by other factors."

Although I'm not *really* saying "don't bother." I'm saying

  on the one hand, it doesn't look like we have any evidence that a
  real-world app will perform better because of these efforts

  on the other hand, if you *are* going to optimize space, don't aim so
  low. If the optimization is going to matter, it's going to matter
  that you did it right.

> My counterpoint is that every layer adds more or less a constant
> overhead, while the size expansion grows quadratically. Also, in
> situations where performance is not a problem, you might still care
> about the size.

You may have a point. It's just that the characteristic of this growth
is not one that would normally cause the program to blow up in the field
because the customer's problem size is larger than anything you tested:
even though it's O(N^2), there's nothing the customer can do to increase
N. Furthermore, one doesn't tend to create vast numbers of iterator
instances. So it doesn't seem so urgent to optimize it without some
data showing it will make an important difference.

-- 
Dave Abrahams
BoostPro Computing
http://www.boostpro.com


