Boost logo

Boost :

From: Robert Ramey (ramey_at_[hidden])
Date: 2004-09-08 11:06:54


 
Joaqu?n M? L?pez Mu?oz wrote:
        

>Ummm, I guess I'm not explaining myself, let me try with an example:
>consider a multi_index_container like this:

>typedef multi_index_container<
> int,
> indexed_by<
> sequenced<>,
> sequenced<>
> >
>> bilist;

>Then, when serializing such a bilist, the order maintained by indices #0
and
>#1 must be recorded separately. Roughly speaking, this involves:

>1. serializing the elements in the order induced by index #0.
>2. traversing index #1 and serializing a record of rearrangements with
>respect to index #0.

>Step 2 can be accomplished in a simple manner by serializing pointers to
>the (previously serialized) elements, but in the order induced by index #1.
>As Boost.Serialization keeps track of previously stored objects, this will
>work OK. To further exemplify this, consider a particular bilist like this:

>index #0: [begin(),end) = 1 4 3 5 2 6
>index #1: [begin(),end) = 1 3 5 6 2 4

>then all the info we need to reconstruct the container can be saved as:

>1 4 3 5 2 6
>&1 &3 &5 &6 &2 &4

>where & means we save a pointer to a previously serialized element.
>I hope everything's clear up to this point.

This looks clear to me. It seems well within the normal application of the
serialization library and would seem to work as expected.

Question: have you built and tested this? I think this would be a useful
starting point for what follows. I will refer to this as the plain vanilla
implementation. (PVI)

Assuming that the above works as expected, I understand that your concern is
that the sequences of pointers could be reduced to a smaller sequence. The
question is what would be the best way to exploit this possibility.

One way would be just keep a list when serializing index #0 and referring to
it when serializing index #1. This would give you everything you need
without changing anything about the serialization library.

Of course, this does replicate some of the information within the
serialization library.

I would be extremely reluctant to make the library more elaborate to
accommodate what I see as a very special case.

If it were me, I would think the PVI was plenty good enough as the space for
a replicated pointer is very small relative to the data as whole. The PVI
also has a pleasing symmetry in that no index is "privileged" by being first
and will work automatically for an arbitrary number of indexes.

But that's my opinion and its your library.

We have one alternative - keeping your own list of saved/loaded object
addresses which can be used to streamline the process.

There might be another way that would give you what you want.

Rather than thinking:
a) serialize data
b) serialize index 0
c) serialize index 1
d)...
e).serialize index n

which is the way that the serialization library encourages one to think. One
might consider:

a) serialize data
b) serialize all indexes in parallel:
        i) read indexes in order
        ii) "merge indexes" into a virtual list
        iii) for each unique index entry
                1) serialize the list of indexes to which it applies
                2) serialize the index entry itself

I think this would be "sort of" equivalent to what you want to do in that
replicated pointers would not be stored. This would even eliminate the need
to track the addresses of the serialization of base data allowing one so
designate all instances of indexed_set as not using tracking. So even the
address-object id lookup inside the serialization library would be skipped.
This would be maximally efficient.

Personally, I'm skeptical that it would be worth the effort - but of course
its up to you. One thing I would like to see is that the "Plain Vanilla
Implementation" is working before something like the above is considered.

Robert Ramey


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk