Boost logo

Boost :

Subject: Re: [boost] [MultiIndex] Random Access mental model?
From: David Abrahams (dave_at_[hidden])
Date: 2008-10-31 19:47:59

on Fri Oct 31 2008, Joaquin M Lopez Munoz <> wrote:

> David Abrahams <dave <at>> writes:
>> Hi,
> Hi Dave, allow me to address this post of yours first and defer
> the answer to the others you've send on Boost.MultiIndex, as
> those will need more elaboration.

Thanks. This is the most important one.

>> I'm trying to get a grip on what it means when I have a random access
>> index in a multi_index_container. I have no mental model for what data
>> structure you're actually creating, which makes it difficult to evaluate
>> multi-index vs. hand-rolling my own container.
>> If I was doing this myself, I'd use a combination of these two:
>> hash_map<key, value>
>> deque<hash_map<key, value>::pointer>
>> because I need to maintain an LRU cache of what has been put in the map
>> (i.e. I need O(1) push_back and pop_front). MultiIndex only lets me use
>> sequenced<> which it implies essentially has the properties of
>> list<hash_map<key, value>::iterator>... although I can imagine you could
>> save a lot by using the same nodes for the hash_map and the list. Do
>> you do that?
> Yep, this is done. Only one node per element is allocated. This is
> true regardless of the indices used, including random
> access indices. If you want to visualize this, each element node
> has as many node headers as indices are, each one stacked on top
> of the next.

I think random access shouldn't need to add a node header unless you
want to do something like look up a key in the hash index and delete
that element. Maybe a feature-model interface would be a more powerful
way to approach this space, so we don't have to pay for what we don't

> Some indices (namely, hashed and random access) use additional data
> structures external to the nodes themselves (this is obvious in the
> case of hashed indices, as they are entirely similar to
> unordered_set).

Yes, and I expect random access indices to do the same.

>> Regardless, I could still save roughly one pointer per node with my
>> deque. I'd think about trying to modify MultiIndex to let it do random
>> access with deques, but I can't make head or tail of the implementation
>> code used for random access.
> The data structure of a random access index is basically the
> same as depicted for a little container I call stable_vector:

That's what I figured... although the back-pointers are only needed if
you need to delete and you can't get ahold of an iterator into the
vector -- the same case I mentioned above.

> So, each element is stored in its own node (as is been already
> said) and there is an additionally internal vector of pointers
> that keeps the elements together while providing O(1)
> access. The up-pointers described in the aforementioned article
> are necessary to provide stable iterators.
> The overhead of a random access index is one pointer per
> node (the up-pointer) plus the size of the internal pointer
> vector. Some figures are calculated at:
> The deque-based structure you propose would save, as you
> say, roughly one pointer per node with respect to sequenced
> indices and also with respect to random access indices,
> which have the up-pointer your proposed data structure lacks,
> while on the other hand it requires an extra pointer as the deque
> stores pointers while multi_index_containers enjoy the header
> stacking property referred to above.
> So, roughly speaking, things are even.

Sorry, I don't see it. Since I need O(1) pop-front I am using
sequenced<> which IIUC is a doubly-linked list. Therefore your nodes
have this overhead:

  next-in-list, prev-in-list, and next-in-hash-bucket = 3 pointers

and mine have only next-in-hash-bucket. Add one pointer of overhead in
the deque and I am storing 2 pointers.

> On the other hand, this up-pointer is essential to ensure
> iterator stability and iterator projection, which are features
> that currently the lib take for granted for all the different
> index types available.

In my application, the ability to erase nodes through hash iterators
doesn't matter, but the per-node storage cost does.

>> Could someone clarify what's going on in there?
> Hope the answer helped clarify the issue. Please come back otherwise.

Except for the fact that we are coming up with different numbers for
the data structure overhead, yeah.


Dave Abrahams
BoostPro Computing

Boost list run by bdawes at, gregod at, cpdaniel at, john at