Subject: Re: [boost] [Serialization] Bizarre bug
From: Robert Ramey (ramey_at_[hidden])
Date: 2009-08-09 13:50:09


Jarl Lindrud wrote:
> Robert Ramey <ramey <at> rrsd.com> writes:

> OK, I think we can wrap this up then... My understanding now is :
>
> -----------------------------------------
>
> * If one program writes archives like this:
>
> ar & vec; // vec is a std::vector<char>
>
> , it is then impossible to reliably load that archive in another
> program, unless one is aware of the entire compile-time content of
> the first program.

Nope. The only time a problem can occur is if one does

    ar & vec; // vec is a std::vector<char>

in the saving program and

    ar & p_vec; // p_vec is a std::vector<char> *

in the loading program. But this wouldn't work anyway.

> * When one writes code like this:
>
> ar << pVec; // pVec is a vector<char> *
>
> , it has the compile-time side effect of silently changing the
> archive format for *all* archives produced by that program, that
> happen to include a vector<char>.

As it must. If one is going to serialize through a pointer in one
part of the program, one has to include tracking information
in the archive for ALL ar << t to avoid creating multiple instances
when the objects pointed to are re-created. That is

ar << t; // save a t
...
// in some other module

ar << p_t; // where p_t points to some t saved somewhere else in
           // the program - perhaps even in another source module!

In order to avoid creating an extra t, you have to track ALL
serializations of t.
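
To make the point concrete, here is a minimal toy model of address tracking. It is only a sketch under stated assumptions: `Record`, `ToySaver`, `ToyLoader` and `round_trip_shares_object` are hypothetical names invented for illustration, and the record layout is not the real Boost.Serialization archive format. It only shows why every save of t, by value or through a pointer, has to go through the same bookkeeping.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

// Toy model only: NOT Boost.Serialization's real archive format.
struct Record {
    bool is_new;     // true: payload follows; false: back-reference to an id
    std::size_t id;  // object id assigned the first time the address is saved
    int payload;     // value of the object (unused for back-references)
};

struct ToySaver {
    std::vector<Record> out;
    std::map<const int*, std::size_t> seen;  // address -> object id

    // Every save of an int goes through here, by value or through a pointer.
    void save(const int* p) {
        auto it = seen.find(p);
        if (it != seen.end()) {
            out.push_back({false, it->second, 0});  // already saved: reference it
            return;
        }
        std::size_t id = seen.size();  // ids are handed out in save order
        seen[p] = id;
        out.push_back({true, id, *p});
    }
};

struct ToyLoader {
    std::vector<std::shared_ptr<int>> objects;  // id -> re-created object

    std::shared_ptr<int> load(const Record& r) {
        if (r.is_new) {
            objects.push_back(std::make_shared<int>(r.payload));
            return objects.back();
        }
        return objects.at(r.id);  // back-reference: reuse the existing object
    }
};

// Saves the same int once "by value" and once "through a pointer", then
// reloads both records and checks that only one object was re-created.
bool round_trip_shares_object() {
    int t = 42;
    ToySaver s;
    s.save(&t);  // stands in for: ar << t;
    s.save(&t);  // stands in for: ar << p_t;  with p_t == &t
    ToyLoader l;
    std::shared_ptr<int> a = l.load(s.out[0]);
    std::shared_ptr<int> b = l.load(s.out[1]);
    return a == b && *a == 42 && l.objects.size() == 1;
}
```

If the first save skipped the `seen` map (that is, if t were not tracked), the second save would write a fresh payload and the loader would create two objects.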

It seems that this could be a problem if the schema is changed
so that a type that used to be saved without pointers is now saved
through one. But if the schema is changed, you can't load the old
archives anyway. That's why the problem never comes up in real code.

Besides this, it's a special case because the serialization traits
for collections of primitives have tracking set to "track_selectively"
AND implementation level set to "object_serializable", which doesn't
include the historical tracking behavior inside the archive. The
default settings for user classes include the historical tracking
inside the class information in the archive, so even if it's changed
in the future, old archives will still be read.
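
The difference between the two implementation levels can be sketched with a toy stream of ints. This is an illustration under assumptions, not the library's actual format: the `Level` enum merely borrows the names of the two Boost levels, and `toy_save` / `toy_load` are hypothetical helpers. In the toy, an "object_class_info" archive records whether the writer tracked the type, so the reader follows the archive; an "object_serializable" archive records nothing, so the reader falls back on its own compile-time setting.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: names mimic the Boost levels, the layout is invented.
enum class Level { object_serializable, object_class_info };

// Writer: optionally records its tracking flag, then (if tracked) an
// object id, then the payload.
std::vector<int> toy_save(int value, Level level, bool writer_tracked) {
    std::vector<int> out;
    if (level == Level::object_class_info) {
        out.push_back(writer_tracked ? 1 : 0);  // tracking flag stored in archive
    }
    if (writer_tracked) {
        out.push_back(0);  // object id of this (first) object
    }
    out.push_back(value);
    return out;
}

// Reader: uses the recorded flag when there is one, otherwise its own
// compile-time setting (reader_tracked).
int toy_load(const std::vector<int>& in, Level level, bool reader_tracked) {
    std::size_t pos = 0;
    bool tracked = reader_tracked;
    if (level == Level::object_class_info) {
        tracked = in.at(pos++) != 0;  // trust what the writer recorded
    }
    if (tracked) {
        ++pos;  // skip the object id
    }
    return in.at(pos);
}
```

In this toy, a tracked writer and an untracked reader still agree at the "object_class_info" level, whereas at the "object_serializable" level the reader silently takes the object id for the payload.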

> Making matters worse, the archive
> format change is compatibility-breaking.

Only for collections of primitives.

> IMHO, this is a rather fragile state of affairs, but it's your call
> and I won't argue it further. I'll go with the workaround, and
> document to users of my library, that if they use B.Ser. to serialize
> vector<char> (and presumably a number of other types?), they must
> always write
>
> BOOST_CLASS_TRACKING(vector<char>, track_never)

LOL - what if they want to do

ar << pVec; // pVec is a pointer to vector<char>

and

ar >> pVec; // pVec is a pointer to vector<char>

without accidentally creating pointers to two different objects when
they only saved one? That bug is going to be a major bitch to find.
Maybe you want to use "track_always". Or maybe you want
to change the "implementation_level" to the next higher one,
which stores tracking information in the archive itself.

(Caveat: logically it would make sense, but I don't think the library as
it is currently implemented permits "overriding" previously set
serialization traits.
Maybe someday when someone has nothing else to do.)

> , and that failure to do so is likely to trigger silent
> deserialization errors in other programs.

I would use "conceivable" rather than "likely". In fact, I can't think
of any real program where this could occur without there being
some other error that would also prevent this program from
functioning.

Robert Ramey

