From: Joaquín Mª López Muñoz (joaquin_at_[hidden])
Date: 2004-12-01 13:58:49


Robert Ramey wrote:

[snip]

> I did peruse the source code some. Here are some random observations.
>
> First of all on the multi-index package:
>
>

[snip]

You'll make me blush :) thank you.

>
> c) I believe that this is the reason the very ambitious undertaking "sailed
> through" the boost review process - (unlike most others). Aspiring library
> authors should study this as an example.

I think I have to give here due credit to Pavel, who acted as my
private reviewer for almost a year. I objected to many of his
criticisms, but he definitely never let me go sloppy. I think this
mentorship role should be stressed more here at Boost.

>
>
> Re the serialization aspect.
>
> I have to say I was a little disappointed that the implementation of
> serialization wasn't more transparent. I really couldn't follow the line
> from the serialization interface to the implementation in the time I was
> willing to spend. So I don't feel I can verify the implementation other
> than by testing. This always makes me feel slightly uncomfortable. This
> isn't really a criticism and I'm not suggesting any changes. Only in
> reviewing the code did I become aware for the first time how much is
> required, so maybe it can be no other way.
>
> I'm a little disappointed at how much effort was required to implement
> serialization for this container. My hopes were that implementation of
> serialization for any class would be easier. Of course this is not a
> typical case so it's not a huge thing. I'm curious if any of the complexity
> was a result of some requirement of the serialization package itself.
>

Yes and no. The Boost.Serialization interface forces me to do things in
weird ways, but I'm not sure this can be improved (I have a suggestion,
though, please read on). Let me elaborate: Loading an element into a
(any) container involves the following ops:

load_construct_data(element);
ar>>element;
container.insert(element);

So, the element cannot be restored *in-place*, i.e., directly inside
the container, as it is the container itself that controls object creation
thru its allocator. This is a restriction with containers, rather than
any serialization package.
But now comes the problem. From a data structure
point of view, you can consider a multi_index_container as a bunch
of elements plus N different rearrangements of these, one per index.
These rearrangements are archived more or less as sequences of
pointers to the elements. As a first approach:

save elements
for each index{
  for(iterator it=index::begin...index::end){
    ar<<&(*it); // save a pointer to the element
  }
}

But this scheme does not work, because at loading time object
tracking is tied to the element as first constructed, and not to its
copy inside the container:

load_construct_data(element);
ar>>element; // Loaded pointers will be pointing here
container.insert(element);
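
To make the mismatch concrete, here is a minimal sketch in plain C++ with no
serialization library involved. The element type and the function name are
made up for the illustration; the temporary stands in for the object that
`ar>>element` fills, and std::list stands in for any container that allocates
its own copy.

```cpp
#include <cassert>
#include <list>

// Hypothetical element type, used only for this illustration.
struct element {
    int value;
};

// Mimics the load sequence: fill a temporary, then copy it into the container.
// Returns true when the address that tracking would remember (the temporary)
// differs from the address where the element finally lives.
inline bool tracked_address_differs(std::list<element>& container) {
    element tmp{42};                     // stands in for `ar >> element`
    const element* tracked = &tmp;       // object tracking is tied to this address
    container.push_back(tmp);            // the container allocates and copies
    const element* stored = &container.back();
    return tracked != stored;            // pointers loaded later would miss `stored`
}
```

Any pointer resolved against the tracked address would refer to the temporary,
which is gone once the load step returns.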

Got it? This accounts for some of the complexity in the
implementation of multi_index serialization. Basically, what
I'm doing is to serialize both the element and its position in
the container (the latter being done in index_node_base.hpp).
The position thing is merely a marker, i.e. it does nothing
but force Boost.Serialization to track subsequent pointers
to the right address. In fact, its serialize() memfun does nothing.
In pseudocode:

// saving
save elements
for(each element){
  save position(element) // does not emit info
}
save indices as pointers to the positions

// loading
load elements
for(each element){
  load position(element) // instructs Boost.Serialization about
  // where subsequent pointers have to be tracked to
}
load indices
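
The loading half of this scheme can be mimicked with a toy tracking table.
This is only a sketch under my own assumptions: toy_tracker, load_result and
load_with_markers are hypothetical names, the table is a plain vector of
addresses, and nothing here is Boost.Serialization code. The point it shows is
that the marker pass runs after insertion, so the registered addresses are the
ones inside the container.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Toy stand-in for an object-tracking table: object id -> address at load time.
struct toy_tracker {
    std::vector<const void*> address_of;
    void track(const void* p) { address_of.push_back(p); }
    const void* resolve(std::size_t id) const { return address_of.at(id); }
};

// Elements plus one index, the index being a sequence of pointers to elements.
struct load_result {
    std::list<int> elements;
    std::vector<const int*> index;
};

// `values` plays the role of the archived elements, `order` the archived index
// (a sequence of position ids).
inline void load_with_markers(const std::vector<int>& values,
                              const std::vector<std::size_t>& order,
                              load_result& out) {
    toy_tracker tracker;
    for (int v : values) {
        int tmp = v;                    // element restored into a temporary...
        out.elements.push_back(tmp);    // ...then copied into the container
    }
    // Marker pass: reads no data, only registers the in-container addresses.
    for (const int& stored : out.elements) tracker.track(&stored);
    // Index pass: archived ids now resolve to the container's own elements.
    for (std::size_t id : order)
        out.index.push_back(static_cast<const int*>(tracker.resolve(id)));
}
```

For instance, loading values {10, 20, 30} with order {2, 0, 1} leaves the first
index entry pointing at the last element of the list.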

I hope I made myself clear. The problem is not particular
to multi_index_containers, it'll also pop up in any situation involving
pointers to elements in a container. I can work around the problem
because I have direct access to the representation of multi_index_container
(the position thing) but when serializing pointers to STL container
elements there's no way around it AFAICS. I think Boost.Serialization can
be extended to offer better support for this thru one of these
mechanisms (or both):

1. Allow the user to "retrack" an object, i.e. to instruct
Boost.Serialization at loading time that pointers to an object
have to be displaced to a user-defined address.
2. Define a special entity (a la make_nvp) that serves to
serialize external objects, i.e.

ar<<make_external(obj);
ar<<&obj;
...
ar>>make_external(obj); //obj is preexistent
ar>>obj_ptr; // will be pointing to obj.

In the pseudocode above, obj is not really serialized nor does
Boost.Serialization attempt to construct it at loading time,
yet it is possible to serialize pointers to it.
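
To show the intended semantics of this proposal, here is a toy model. Keep in
mind that make_external is only a suggestion in this post, so everything below
is hypothetical: toy_saver, toy_loader and round_trip_pointer are invented
names, and the "archive" is just a vector of object ids.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Saving side: announcing an external object assigns it an id, emits no data.
class toy_saver {
    std::map<const void*, std::size_t> id_of_;
public:
    std::vector<std::size_t> stream;  // toy archive: only pointer ids are stored
    void announce_external(const void* obj) { id_of_.emplace(obj, id_of_.size()); }
    void save_pointer(const void* p) { stream.push_back(id_of_.at(p)); }
};

// Loading side: announcing a preexisting object binds the next id to it.
class toy_loader {
    std::vector<void*> address_of_;
public:
    void announce_external(void* obj) { address_of_.push_back(obj); }
    void* load_pointer(std::size_t id) const { return address_of_.at(id); }
};

// Saves a pointer to saved_obj and loads it back against preexisting_obj.
inline int* round_trip_pointer(int& saved_obj, int& preexisting_obj) {
    toy_saver out;
    out.announce_external(&saved_obj);       // ar << make_external(obj)
    out.save_pointer(&saved_obj);            // ar << &obj
    toy_loader in;
    in.announce_external(&preexisting_obj);  // ar >> make_external(obj)
    return static_cast<int*>(in.load_pointer(out.stream.front()));  // ar >> obj_ptr
}
```

The loaded pointer ends up at the preexisting object, and the object's contents
never travel through the archive.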

Have I made myself clear? I'm aware the explanation is fuzzy,
but I hope you got my point. Otherwise, please let me know
so that I can try to express myself more clearly.

As for the rest of the complexity in the implementation of
multi_index serialization, it has to do with some algorithms
to code indices as compactly as possible, basically by
archiving "diff" subsequences with respect to the base sequence. This
stuff is in index_matcher, index_loader and index_saver.
This complexity is in no way related to Boost.Serialization.
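
Just to give a flavor of what coding an index relative to the base sequence can
mean, here is one possible scheme. It is an assumption made up for this
illustration and not the algorithm used in index_saver or index_loader: the
index order is given as base-sequence positions and stored as maximal runs of
consecutive positions, so an index that agrees with the base sequence collapses
to a single run.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// A run is (first base position, length). Hypothetical encoding.
typedef std::pair<std::size_t, std::size_t> run;

// Splits an index order into maximal runs of consecutive base positions.
inline std::vector<run> encode_runs(const std::vector<std::size_t>& order) {
    std::vector<run> runs;
    for (std::size_t pos : order) {
        if (!runs.empty() && runs.back().first + runs.back().second == pos)
            ++runs.back().second;         // pos continues the current run
        else
            runs.push_back(run(pos, 1));  // order departs from the base sequence
    }
    return runs;
}

// Expands the runs back into the full index order.
inline std::vector<std::size_t> decode_runs(const std::vector<run>& runs) {
    std::vector<std::size_t> order;
    for (const run& r : runs)
        for (std::size_t k = 0; k < r.second; ++k)
            order.push_back(r.first + k);
    return order;
}
```

With this toy encoding, the order 3,4,0,1,2 is stored as two runs instead of
five entries.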

Sorry for the long post,

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo

