From: Vladimir Prus (ghost_at_[hidden])
Date: 2004-01-22 04:18:04


Robert Ramey wrote:

> >>Is it possible to always save proper identification when saving a pointer
> >> to polymorphic objects?
> >
> >This looks like an error to me. The intention is that it work as you
> > expected. I'll look into it.
>
> well, I've looked into it.
>
> basically the problem comes down to the following situation
>
>
> class B {};
> class D : public B {};
> ...
>
> D *d;
> ar << d;
> ...
> B *b;
> ar >> b;
>
> That is, the sequence of classes saved is different from the sequence of
> classes loaded. This issue is not related to just pointers but to all
> objects, whether serialized through pointers or directly as objects.

Not exactly. When objects are saved directly, the dynamic type is always the
same as the static type. So, if you save an object of one static type and
load it as another static type, you're making a mistake.

With polymorphic pointers it's quite different. In the above case, I fail to
see any error on the programmer's part.
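
To make the static/dynamic distinction concrete, here is a minimal sketch in
plain C++ (no serialization library involved; the function names are made up
for illustration). A directly held object always has matching static and
dynamic types, while a 'B*' may point to either a 'B' or a 'D':

```cpp
#include <cassert>
#include <typeinfo>

struct B { virtual ~B() {} };   // polymorphic base
struct D : B {};

// True when the object behind a B* is not exactly a B.
// (Hypothetical helper, used only for this illustration.)
bool dynamic_differs_from_static(const B* p) {
    return typeid(*p) != typeid(B);
}

void demo_static_vs_dynamic() {
    D d;
    B b;
    const B* pd = &d;   // static type B*, dynamic type D
    const B* pb = &b;   // static and dynamic type are both B
    assert(dynamic_differs_from_static(pd));
    assert(!dynamic_differs_from_static(pb));
}
```

Both 'pd' and 'pb' have the same static type, so code that saves a 'B*'
cannot know at compile time which case it will meet.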

> The central organizing principle behind the implementation of the
> serialization library is that classes are "seen" in the same sequence
> on saving and loading - thereby permitting the invocation of the appropriate
> serialization code.

IIRC, this principle is the reason why I argued for some other identification
scheme, which we now have in the form of BOOST_CLASS_EXPORT.

> In most cases an attempt to violate this principle is an unrecoverable
> error committed during compile time that isn't noticed until run time.
>
> In the special case of exported pointers, it is possible that such
> an error could be "fixed up" at runtime. The problem with this would be:
>
> a) it would require exported strings to always be included in archives
> regardless of whether or not they are strictly necessary. - thus
> provoking howls from those for whom maximum speed is a main
> consideration.

Is that really true? You only need to include this string in the archive once:
when you're saving a polymorphic pointer and the static type is equal to the
dynamic one. One string is not much, and after it's written, everything works
as usual. Moreover, if one saves polymorphic pointers and uses
BOOST_CLASS_EXPORT, one is likely to have such strings for all derived classes
anyway, so one extra string is not important for performance.
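
A rough sketch of the cost I have in mind, using a toy archive (this is not
the library's actual format; 'ToyOArchive' and its text layout are invented
for the example). The class name is written in full only the first time, and
every later pointer of that class costs just a small id:

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Toy output archive (hypothetical): writes "new <id> <name>" the first
// time a class is seen and "ref <id>" for every later occurrence.
class ToyOArchive {
public:
    void save_class_id(const std::string& name) {
        std::map<std::string, int>::const_iterator it = ids_.find(name);
        if (it == ids_.end()) {
            const int id = static_cast<int>(ids_.size());
            ids_[name] = id;
            out_ << "new " << id << ' ' << name << '\n';
        } else {
            out_ << "ref " << it->second << '\n';
        }
    }
    std::string str() const { return out_.str(); }
private:
    std::map<std::string, int> ids_;
    std::ostringstream out_;
};

void demo_string_written_once() {
    ToyOArchive oa;
    oa.save_class_id("B");   // full name written here
    oa.save_class_id("B");   // only the id from now on
    oa.save_class_id("D");
    assert(oa.str() == "new 0 B\nref 0\nnew 1 D\n");
}
```

Under this assumption the extra cost for the base class is a single string per
archive, no matter how many 'B' pointers are saved.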

> b) it would "fix up" some mistakes but not detect others of a similar
> nature - e.g. when the saved objects don't match loaded objects and
> the objects are not serialized through pointers.

I don't expect that saving an *object* of type A and loading an *object* of
type B should ever work, nor that it be diagnosed.

For pointers, I think not saving identification is just a bug. If I use
BOOST_CLASS_EXPORT for a class, it means I want saving of pointers to that
class to work no matter in what order other classes are serialized or
registered.

Consider:

    B* b = get_b();
    oa << b;

    // and later

    B* b;
    ia >> b;

If 'get_b()' happens to return an actual 'B' and not some derived class, no
identification is saved, and proper restoring depends on the order of saving.
So, if I change the loading code to

    ia.template register_type<A>();
    B* b;
    ia >> b;

I'll read a pointer to class 'A'. See the attachment for a testcase.
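
The effect can be modelled without the library. The sketch below is a toy
model only: I'm assuming ids are handed out in the order classes are first
mentioned, and 'ToyRegistry' and 'load_after_registering' are invented names,
not library API. It shows how an extra registration on the loading side shifts
the meaning of an id when no class name is stored:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy registry (hypothetical): a class receives the next free id the first
// time it is mentioned, by explicit registration or by being serialized.
class ToyRegistry {
public:
    int id_of(const std::string& name) {
        for (std::size_t i = 0; i < names_.size(); ++i) {
            if (names_[i] == name) return static_cast<int>(i);
        }
        names_.push_back(name);
        return static_cast<int>(names_.size()) - 1;
    }
    const std::string& name_of(int id) const {
        return names_.at(static_cast<std::size_t>(id));
    }
private:
    std::vector<std::string> names_;
};

// Returns the class the loader reconstructs for a saved 'B' pointer when
// the archive holds only the order-dependent id and no class name.
std::string load_after_registering(const std::string& registered_first) {
    ToyRegistry saver;
    const int saved_id = saver.id_of("B");   // saver sees B first: id 0

    ToyRegistry loader;
    if (!registered_first.empty()) {
        loader.id_of(registered_first);      // models register_type<A>()
    }
    loader.id_of("B");                       // loader then sees static type B
    return loader.name_of(saved_id);
}

void demo_order_dependence() {
    assert(load_after_registering("") == "B");    // same order: restored as B
    assert(load_after_registering("A") == "A");   // extra registration: A
}
```

Storing the exported name next to the pointer would make the lookup
independent of registration order.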

> c) it "fixes up" what I would personally view as a programmer oversight
> which might well come back to haunt the programmer in an even
> more subtle form.

Can you give an example? Personally, I can't imagine any case where saving a
pointer to a derived type is worse than casting that pointer to a base pointer
before saving.

> I've considered in the past the idea of a "debug option" for archives.
> The most common error in a complex serialization scenario is when
> save/load get out of sync. This can easily happen when
> a class definition is changed and an error is made in handling
> previous class versions. This can be torture to find - as you
> know from first hand experience. The debug option would add extra
> information to the archive - basically begin/end class flags that would
> be checked on load.
>
> I have left this a future idea because - well you can imagine.

Sure ;-)
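
For what it's worth, the begin/end flag idea quoted above could look roughly
like this. It is only a sketch with an invented text format ('save_point',
'load_point' and 'expect' are hypothetical names), meant to show how a loader
can notice that it is out of sync instead of silently reading garbage:

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

struct Point { int x; int y; };

// Writes the fields wrapped in begin/end markers naming the class.
void save_point(std::ostream& os, const Point& p) {
    os << "begin Point " << p.x << ' ' << p.y << " end Point ";
}

// Reads one token and throws if it is not the expected marker.
void expect(std::istream& is, const std::string& expected) {
    std::string tok;
    if (!(is >> tok) || tok != expected) {
        throw std::runtime_error("archive out of sync: expected '" +
                                 expected + "'");
    }
}

Point load_point(std::istream& is) {
    Point p = {0, 0};
    expect(is, "begin");
    expect(is, "Point");
    if (!(is >> p.x >> p.y)) {
        throw std::runtime_error("archive out of sync: bad Point fields");
    }
    expect(is, "end");
    expect(is, "Point");
    return p;
}

void demo_markers() {
    std::stringstream good;
    const Point in = {3, 4};
    save_point(good, in);
    const Point out = load_point(good);
    assert(out.x == 3 && out.y == 4);

    // An archive written by an older class version with a single field.
    std::istringstream stale("begin Point 1 end Point ");
    bool threw = false;
    try {
        load_point(stale);
    } catch (const std::runtime_error&) {
        threw = true;
    }
    assert(threw);
}
```

The markers cost space and time on every object, which is presumably why such
a check would be an opt-in debug mode.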

> Also, the introduction of the "&" serialization operator goes
> a long way to avoiding the whole problem in the first place
> so it occurs much less frequently than before.

But sometimes saving and loading are done in two different places, especially
if you don't store one big object in the archive. With one big object, you can
write a serialize method and be done. When you're saving several small
objects, the code to save them can be scattered over many places, so some
problems appear.
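
As a reminder of why "&" helps in the one-big-object case, here is a toy
sketch (the 'ToyOut' and 'ToyIn' archives and the 'Settings' class are
invented for illustration; only the shape of 'serialize' follows the
library's convention). A single function lists the members once, so the save
and load orders cannot drift apart:

```cpp
#include <cassert>
#include <sstream>

// Toy archives (hypothetical): '&' writes on output and reads on input.
struct ToyOut {
    explicit ToyOut(std::ostream& s) : os(s) {}
    template <class T>
    ToyOut& operator&(const T& v) { os << v << ' '; return *this; }
    std::ostream& os;
};

struct ToyIn {
    explicit ToyIn(std::istream& s) : is(s) {}
    template <class T>
    ToyIn& operator&(T& v) { is >> v; return *this; }
    std::istream& is;
};

struct Settings {
    int width;
    int height;

    // One member list serves both directions.
    template <class Archive>
    void serialize(Archive& ar, const unsigned int /*version*/) {
        ar & width & height;
    }
};

void demo_single_serialize() {
    std::stringstream ss;
    Settings saved = {640, 480};
    ToyOut out(ss);
    saved.serialize(out, 0);

    Settings loaded = {0, 0};
    ToyIn in(ss);
    loaded.serialize(in, 0);
    assert(loaded.width == 640 && loaded.height == 480);
}
```

Nothing of the sort protects separate 'oa << x' and 'ia >> x' statements that
live in different source files.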

- Volodya



