Boost logo

Boost :

From: David Abrahams (david.abrahams_at_[hidden])
Date: 2001-11-20 16:41:49


----- Original Message -----
From: "Mike Gradman" <windryder1_at_[hidden]>

> > Without having looked at your code, I'll surmise
> > that iterator stability is
> > acheived by embedding a reference to the vector and
> > an offset in the
> > iterator. I am not convinced that iterator stability
> > is usually worth the
> > cost, and I'd rather not pay for it if I wasn't
> > going to use it.
>
> This is a valid concern. The way we achieve iterator
> stability is we hold a long that is the element number
> in the vector + a version index that we use to update
> the position of the iterator if necessary. In fact,
> with our scheme you pay almost no penalty unless you
> insert elements into the container after the first
> lookup operation so we think we get about the minimum
> overhead possible using a direct vector to hold the
> data.

Sorry, I'm having trouble envisioning this. What does the iterator
dereference operation do? How does it find the element without a reference
to the vector?

>
> > My plan for
> > implementing stability was to provide
> > pointer/reference stability via an
> > indirect_container adaptor which uses
> > boost::indirect_iterator (In fact,
> > this is the work that spawned the Boost Iterator
> > Adaptors Library). This
> > approach has several advantages, not least that
> > reallocating the vector
> > becomes very cheap because it just has to copy POD
> > data.
> >
> > -Dave
>
> Well, the big concern we would have with this is
> speed and efficiency. It is true that you get
> exception safety but the indirect approach has two
> disadvantages:
> 1. More overhead have to keep an extra pointer around
> versus just holding the data itself.

Yes.

> 2. More seriously, the big problem would be that with
> the indirect approach the elements are no longer held
> contiguously in memory and you have to go through a
> pointer to compare them. The big advantage that our
> benchmarks show for a vector is that because the
> elements are held directly and contiguously in memory,
> sorts are very fast on the resulting data giving the
> final container a big advantage over multiset.

True; locality is everything ;-). Using a pool allocator for the elements
would help a little, but you have a point. I guess it really depends what
you're holding, though. If the elements are, say, std::vector or std::string
you might do better with my scheme (depending on how many insertions there
are) because the real data is at the end of a pointer anyway.

> If you
> give this up and hold the data via a container of
> pointers it seems likely to me that you would take a
> large performance hit as a result and perhaps not be
> any faster than a multiset. (Two reasons, memory
> contiguity and a functor hit to compare the (data *)
> items).

Err, multiset is slow for all kinds of reasons. For one thing, it's BIG.

> Of course, the proof is in the pudding, Corwin
> and I can do some simple benchmarks to see.

Sounds good!

> Anyway, so we think that our container offers some
> nicer performance advantages over the previous
> submissions while retaining iterator stability.
> Anyway, Corwin and I will do some more benchmarks to
> show this.

I'm interested.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk