Subject: Re: [boost] [Serialization] Bizarre bug
From: Jarl Lindrud (jarl.lindrud_at_[hidden])
Date: 2009-08-07 19:31:33


Robert Ramey <ramey <at> rrsd.com> writes:

>
> I tried to choose reasonable defaults- it's the best I can do.
>

I've got no problem with the default setting - it's obviously got to be one way
or the other. What I think is unfortunate is that the settings change, based on
compile-time instantiation *anywhere* in the program.

> > On a more fundamental level, I don't understand how this code can
> > *ever* fail:
> >
> > (1)
> > std::ostringstream ostr;
> > boost::archive::text_oarchive(ostr) & v0;
> >
> > std::istringstream istr(ostr.str());
> > boost::archive::text_iarchive(istr) & v1;
> >
> > bool ok = (v0 == v1);
>
> This code can only fail if the saving template for V is
> different from the loading template for V. Such a program can never
> work. Without class information inside the archive, I can't detect
> this type of error.
>

But V is vector<char>, and its serialization definition is known - it is given
by the library itself, in vector.hpp . I haven't touched it.

> > Here the vector is serialized and deserialized as a value. How is it
> > that "tracking" or "implementation" settings for a particular type
> > can be affected by the mere presence of this code?
>
> A minor correction - "implementation_level" is not affected. Tracking
> is affected if it has been set to "track_selectively".
>
> Do you mean how is something like this implemented? The short
> answer is that
>
> T *t
> ...
> ar << t
>
> instantiates code which refers to a special static object which "remembers"
> that this code has been invoked. This static object is initialized at
> pre-main time. So if anywhere else in the program one invokes
>
> T t
> ...
> ar << t
>
> Tracking is turned on for this operation.

Hmmm. This implies that the meaning of the code "ar << t" depends on the
content of the entire program.
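
To make the mechanism concrete, here is a minimal, Boost-free sketch of how merely instantiating pointer-saving code for a type could flip a flag that value-saving code consults. All names (saved_through_pointer, pointer_registrar, toy_oarchive, never_called) are hypothetical and are not the library's actual internals; the sketch only assumes the "static object initialized before main" idea described above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-type flag. A function-local static avoids
// initialization-order problems between the flag and the registrar.
template <class T>
bool& saved_through_pointer() {
    static bool flag = false;
    return flag;
}

// Hypothetical registrar: constructing the static instance sets the flag for T.
template <class T>
struct pointer_registrar {
    pointer_registrar() { saved_through_pointer<T>() = true; }
    static pointer_registrar instance;
};

template <class T>
pointer_registrar<T> pointer_registrar<T>::instance;

struct toy_oarchive {
    // Saving by value: reports whether this save would be tracked.
    template <class T>
    bool save(const T&) { return saved_through_pointer<T>(); }

    // Saving through a pointer: taking the address odr-uses the static
    // member, so instantiating this function also instantiates the
    // registrar for T, whose constructor runs during static initialization.
    template <class T>
    void save_pointer(const T*) { (void)&pointer_registrar<T>::instance; }
};

// Never called anywhere. Compiling it is enough to instantiate
// save_pointer<std::vector<char>> and therefore the registrar.
void never_called() {
    toy_oarchive ar;
    const std::vector<char>* p = nullptr;
    ar.save_pointer(p);
}

void demo_static_registration() {
    toy_oarchive ar;
    // vector<char> is reported as tracked although never_called() never ran.
    assert(ar.save(std::vector<char>{}));
    // vector<int> has no pointer save anywhere, so it stays untracked.
    assert(!ar.save(std::vector<int>{}));
}
```

Calling demo_static_registration() from main exercises both cases. Deleting never_called() flips the first result, which is exactly the "look across" effect: the value save has not changed, but its behaviour has.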

>
> Note that this has the counterintuitive effect of seeming to "look ahead"
> in the code.

What's worse is that it also seems to "look across" :)

> Just remember this: each load must use the exact same type
> as each corresponding save.
>
> If you follow this rule, you will never have a problem of this nature.
>

Let me sum up what it is that worries me. Say I have program A, with this kind
of code:

    std::ostringstream ostr;
    boost::archive::text_oarchive(ostr) & v0;

    std::istringstream istr(ostr.str());
    boost::archive::text_iarchive(istr) & v1;

    bool ok = (v0 == v1);
    
, and nothing else. Say that I also have another program B, with the same kind
of code, that reads archives produced by program A.

Everything is working well. And then in program A, at a later stage, some more
code is added, containing these lines:

    ar << pVec; // pVec is a vector<char> *

, and

    ar >> pVec; // pVec is a vector<char> *

Perhaps the code is not even executed. Now, program A still functions. It can
read the archives that it has produced *but* the archives now, IIUC, have a
different format. So when program B tries to read the archives, it will fail,
and may well fail silently.

So essentially addition or modification of code, *anywhere* in program A, and
regardless of whether it is executed, can cause a runtime failure in program B!
You can imagine how difficult this would be to debug.
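
The failure mode can be illustrated with a toy text format. This is a sketch under stated assumptions: toy_save and toy_load are hypothetical helpers, and the layout (an optional object id, then a count, then the elements) is invented for illustration and is not the actual Boost text archive layout. The point is only that a reader built without tracking can accept a tracked archive and return wrong data without reporting an error.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical writer: emits "[id ]count e1 e2 ...".
// The object id is present only when tracking is on.
std::string toy_save(const std::vector<char>& v, bool tracked) {
    std::ostringstream os;
    if (tracked) os << 0 << ' ';
    os << v.size();
    for (char c : v) os << ' ' << static_cast<int>(c);
    return os.str();
}

// Hypothetical reader: must be told whether an object id is expected.
// Returns false only if the stream runs out of parseable numbers.
bool toy_load(const std::string& data, bool tracked, std::vector<char>& out) {
    std::istringstream is(data);
    if (tracked) {
        int id;
        if (!(is >> id)) return false;
    }
    std::size_t n;
    if (!(is >> n)) return false;
    out.clear();
    for (std::size_t i = 0; i < n; ++i) {
        int c;
        if (!(is >> c)) return false;
        out.push_back(static_cast<char>(c));
    }
    return true;
}

void demo_format_drift() {
    const std::vector<char> v0{'a', 'b', 'c'};
    std::vector<char> v1;

    // Program A after the change: writes and reads with tracking on.
    const std::string archive = toy_save(v0, true);
    assert(toy_load(archive, true, v1) && v1 == v0);

    // Program B, unchanged: reads the same bytes with tracking off.
    // The id 0 is taken as the element count, so the load "succeeds"
    // and silently yields an empty vector.
    assert(toy_load(archive, false, v1));
    assert(v1.empty());
}
```

In this toy, program B gets no error at all, only different data, which is the silent failure described above.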

Wouldn't it be easier if serialization settings like tracking_level were set by
the user, explicitly at runtime? That way one could set it exactly as one wants
it, rather than having to guess at the side effects of compile-time
instantiations throughout the entire program.
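
As a rough sketch of what I mean by explicit runtime settings: the tracking_settings class below is purely hypothetical and is not an existing Boost.Serialization API. It only shows the shape of the idea, namely a per-type setting that the user sets and that does not depend on what else is instantiated in the program.

```cpp
#include <cassert>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>
#include <vector>

// Hypothetical runtime settings object, keyed by type.
class tracking_settings {
public:
    // Explicitly turn tracking on or off for T.
    template <class T>
    void set_tracked(bool on) {
        per_type_[std::type_index(typeid(T))] = on;
    }

    // Types that were never configured default to untracked.
    template <class T>
    bool is_tracked() const {
        auto it = per_type_.find(std::type_index(typeid(T)));
        return it != per_type_.end() && it->second;
    }

private:
    std::unordered_map<std::type_index, bool> per_type_;
};

void demo_runtime_settings() {
    tracking_settings settings;

    // Nothing is tracked until the user says so.
    assert(!settings.is_tracked<std::vector<char>>());

    settings.set_tracked<std::vector<char>>(true);
    assert(settings.is_tracked<std::vector<char>>());

    // Other types are unaffected.
    assert(!settings.is_tracked<std::vector<int>>());
}
```

An archive constructed with such an object would consult it on every save and load, so programs A and B agree as long as they are given the same settings, whatever other code they happen to contain.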

Regards,
Jarl.

