From: rameysb (ramey_at_[hidden])
Date: 2002-02-27 16:29:13


> pointer serialization...

> In general I think this is a pretty good idea. But, there are some
> devils in the details here. I have seen more than one serialization
> library run into trouble here. So I have three questions to think
> about in judging your design (I haven't looked at the code, which I
> will try to do when I find some free time):
>
> 1) Suppose I have a polymorphic object of class Derived, a child of
> Base; suppose I try to write this object out twice, once through a
> pointer to Base and once through a pointer to Derived. Am I
> guaranteed with your implementation to write out this object only
> once, as an object of type Derived? Even if the pointers have
> different numeric representations internally?
>
> 2) Suppose I delete an object which has been serialized, then create
> a new object in the same place, then attempt to write out the new
> one. Under what circumstances will I get a new object in the
> archive? Essentially, this is asking over what timescale pointers to
> objects are saved by the archive.
>
> 3) Suppose I have an object of class Composite which has a data
> member of class Element. Suppose I have given out a pointer or
> reference to that Element, which has been serialized. I now
> serialize Composite, which we should assume has a simple
> serialization implementation of simply serializing its elements,
> including Element. Will I write out another copy of the Element?
> Will the answer be the same if I reverse the order of these events?

Here's my answer:

The devil is in the details - properly serializing pointers is harder
than it looks - no argument from me on that point.

1) You are guaranteed to write out the object only once, regardless
of whether it is serialized through a pointer to a base class or to a
derived class. When a pointer is serialized, it is the most derived
class that is actually serialized, and duplicates are tracked on the
most derived class. This turns out to be essential to serializing
collections of polymorphic pointers - one of the most useful features
of this system.
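
One way to obtain an identity that is the same through a Base pointer
and a Derived pointer can be sketched as follows. This is only an
illustration in standard C++, not the library's code; Base, Other,
Derived, ObjectKey and object_key are hypothetical names, and the
classes are assumed to be polymorphic.

```cpp
#include <typeindex>
#include <typeinfo>
#include <utility>

// Hypothetical hierarchy, for illustration only.
struct Base  { virtual ~Base() = default;  int b = 0; };
struct Other { virtual ~Other() = default; int o = 0; };
// With multiple inheritance, an Other* to a Derived usually differs
// numerically from the Derived* to the same object.
struct Derived : Base, Other { int d = 0; };

// Identity of an object: (most derived class, address of the complete object).
using ObjectKey = std::pair<std::type_index, const void*>;

// p must be non-null and T must be a polymorphic class.
template <class T>
ObjectKey object_key(const T* p) {
    // typeid(*p) names the most derived class; dynamic_cast to
    // const void* yields the address of the most derived object.
    return ObjectKey(std::type_index(typeid(*p)),
                     dynamic_cast<const void*>(p));
}
```

With a Derived d, object_key applied to a Base*, an Other* or a
Derived* pointing at d gives the same key, so a duplicate check keyed
this way does not depend on which pointer type was used.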

2) The key used to detect duplicates is the pair <class id of the
most derived class, address of the object>. So the scenario you
describe could produce surprises: a new object of the same class
created at the same address would be taken for the one already
written. In practice I wouldn't think it would be a problem. The
envisioned usage of the library is to "take a snapshot" of the state
of a group of objects (and their relatives) and sometime later
restore all the objects to the previous state. This presumes that the
state of the objects doesn't change while an archive is being written
or read. This implementation of the library makes this assumption. I
haven't considered what would happen if this assumption didn't hold.
In an attempt to enforce this assumption, all the class member save
functions are marked const. This should prevent the data from being
changed while being serialized.
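
The surprise in the delete-and-recreate scenario can be sketched with
a toy tracker keyed on (class, address). This is an assumption-laden
illustration, not the library's implementation; Tracker, first_time,
Point and second_object_is_written are hypothetical names, and
placement new is used only to force the second object onto the same
address.

```cpp
#include <new>
#include <set>
#include <typeindex>
#include <typeinfo>
#include <utility>

// Hypothetical tracker: remembers every (class, address) pair it has seen.
class Tracker {
    std::set<std::pair<std::type_index, const void*>> seen_;
public:
    // Returns true if the object would be written in full,
    // false if it would be treated as a duplicate.
    // Uses the static type T, which is enough for this non-polymorphic demo.
    template <class T>
    bool first_time(const T* p) {
        return seen_.emplace(std::type_index(typeid(T)),
                             static_cast<const void*>(p)).second;
    }
};

struct Point { int x; };

// Destroys one Point, constructs another in the same storage, and
// reports whether the tracker treats the second object as new.
inline bool second_object_is_written() {
    alignas(Point) unsigned char storage[sizeof(Point)];
    Tracker t;
    Point* a = new (storage) Point{1};
    t.first_time(a);                    // first object is recorded
    a->~Point();
    Point* b = new (storage) Point{2};  // same class, same address
    return t.first_time(b);             // key already present
}
```

Here second_object_is_written() returns false: the second Point is
mistaken for the first because the key cannot tell them apart.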

3) For the moment, set aside the question of references, which
involves some other issues. I believe your example is summarized in
the following code:

class Composite
{
public:
    Element e;       // Element held by value
};

class Composite2
{
public:
    Element *eptr;   // pointer to an Element owned elsewhere
};

int main()
{
    Composite c;
    Composite2 c2;
    c2.eptr = &c.e;

    // create output archive
    ::boost::archive oa;
    oa << c;
    oa << c2;

    // .....

    // create input archive - deserialize in same sequence as above
    ::boost::iarchive ia;
    ia >> c;  // note: reversing the order of these two would create
    ia >> c2; // a duplicate object of type Element
}

I came upon this situation and, after considering it, concluded that
it would never occur. Now I'm having second thoughts. I will look
into this. My conjecture is that it is always possible to avoid such
situations. If this is true, then I can look into adding
assertion/exception code to reveal when it occurs so it can be
addressed.
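
The kind of check mentioned above might look roughly like this. It is
a sketch under assumptions, not what the library does: SaveTracker,
save_value, save_pointer and order_conflicts are hypothetical names,
and objects are keyed by address alone to keep the example short.

```cpp
#include <set>
#include <stdexcept>

// Hypothetical output-side tracker; not the library's interface.
class SaveTracker {
    std::set<const void*> by_pointer_;  // objects first written through a pointer
    std::set<const void*> by_value_;    // objects written in place (e.g. members)
public:
    // Called when an object is written in place.
    void save_value(const void* addr) {
        if (by_pointer_.count(addr))
            throw std::logic_error("object already saved through a pointer");
        by_value_.insert(addr);
    }
    // Called when an object is reached through a pointer. Returns true
    // if the object body must be written, false if a back-reference to
    // the earlier copy is enough.
    bool save_pointer(const void* addr) {
        if (by_value_.count(addr)) return false;
        return by_pointer_.insert(addr).second;
    }
};

struct Element    { int value = 0; };
struct Composite  { Element e; };
struct Composite2 { Element* eptr = nullptr; };

// Returns true if saving in the given order trips the check.
inline bool order_conflicts(bool pointer_first) {
    Composite c;
    Composite2 c2;
    c2.eptr = &c.e;
    SaveTracker t;
    try {
        if (pointer_first) { t.save_pointer(c2.eptr); t.save_value(&c.e); }
        else               { t.save_value(&c.e); t.save_pointer(c2.eptr); }
    } catch (const std::logic_error&) {
        return true;
    }
    return false;
}
```

Saving c before c2 passes; saving the pointer first and the containing
object afterwards is reported, which is the order that would otherwise
produce a second Element on loading.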

I should say I used the MFC serialization for many years and this
never occurred to me even though it includes pointer serialization.
This is the basis of my conjecture that it is not a common problem
and always avoidable.

I have to say this is a very astute observation - I'm sure we haven't
heard the last of this.

4) Serialization of references. This system cannot
serialize/deserialize a reference. Class members that are references
are initialized during object creation and never change during the
life of the object. They aren't really part of the state of the
object, so they shouldn't be changed, which is what serialization
does. It's a good thing too, as it would be a big problem to
implement this.
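
A short illustration of why a reference member cannot be "restored"
like ordinary data: assigning through it changes the object it refers
to and never rebinds it. Config, Holder and reference_was_reseated are
hypothetical names used only for this sketch.

```cpp
struct Config { int level; };

class Holder {
    Config& cfg_;   // bound once, in the constructor
public:
    explicit Holder(Config& c) : cfg_(c) {}
    // Assigning through the reference overwrites the referent;
    // it does not make cfg_ refer to a different Config.
    void assign(const Config& other) { cfg_ = other; }
    const Config* target() const { return &cfg_; }
};

// Reports whether assigning from b made the holder refer to b.
inline bool reference_was_reseated() {
    Config a{1};
    Config b{2};
    Holder h(a);
    h.assign(b);              // copies b's value into a
    return h.target() == &b;  // still refers to a
}
```

reference_was_reseated() returns false: after the assignment the
holder still refers to a, whose level has become 2.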

Robert Ramey

