Subject: Re: [boost] [Serialization] Bizarre bug
From: Robert Ramey (ramey_at_[hidden])
Date: 2009-08-07 13:20:32


Jarl Lindrud wrote:
> Robert Ramey <ramey <at> rrsd.com> writes:
>
>>
>> I'm still intrigued as to how such a situation could come up where
>> something is saved through a pointer but in another program it isn't.
>> I looked at the above link and didn't find this. Basically, my view
>> is that anyone tripped up by this error can't be writing anything
>> that makes sense anyway. I'm looking for a counter example to
>> this.
>>
>> Robert Ramey
>>
>
>
> I don't know how this arose originally - there may well have been a
> situation where the vector<char> was serialized through a pointer in
> one program, and deserialized as a value, in another program. That is
> indeed an error, although in this case it does not seem that any
> exception was thrown to signify the error (the vector just silently
> failed to deserialize properly).

There are two options:

a) don't include class information in the archive
    advantage: faster performance, since class info doesn't have to be
looked up for each instance
    disadvantage: no way to verify that the type being loaded is the same as
that being stored

b) include class information in the archive
    advantage: the library can verify that the type being loaded is the same
as that being stored
    disadvantage: slower performance, since class info has to be
looked up for each instance

I tried to choose reasonable defaults- it's the best I can do.
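
The trade-off between (a) and (b) can be sketched with a toy archive that uses only the standard library. This is an illustration, not the Boost.Serialization format: `save`, `load`, `mismatch_detected` and the `with_class_info` flag are hypothetical names, and `typeid(T).name()` stands in for real class information.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <typeinfo>

// Toy save: optionally prefix the value with a type tag (hypothetical format).
template <class T>
void save(std::ostream& os, const T& value, bool with_class_info) {
    if (with_class_info) os << typeid(T).name() << ' ';
    os << value << ' ';
}

// Toy load: when a tag is present, check it against the requested type.
template <class T>
T load(std::istream& is, bool with_class_info) {
    if (with_class_info) {
        std::string tag;
        is >> tag;  // extra work on every load: the cost of option (b)
        if (tag != typeid(T).name()) throw std::runtime_error("type mismatch");
    }
    T value{};
    is >> value;
    return value;
}

// Save an int, then try to load it back as a double.
// Returns true if the mismatch was reported, false if it went unnoticed.
bool mismatch_detected(bool with_class_info) {
    std::stringstream ss;
    save(ss, 42, with_class_info);
    try {
        load<double>(ss, with_class_info);
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

In this sketch `mismatch_detected(true)` reports the error, while `mismatch_detected(false)` reads the data as the wrong type without complaint, which is the price of leaving class information out.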

> On a more fundamental level, I don't understand how this code can
> *ever* fail:
>
> (1)
> std::ostringstream ostr;
> boost::archive::text_oarchive(ostr) & v0;
>
> std::istringstream istr(ostr.str());
> boost::archive::text_iarchive(istr) & v1;
>
> bool ok = (v0 == v1);

This code can only fail if the saving template for V is
different from the loading template for V. Such a program can never
work. Without class information inside the archive, I can't detect
this type of error.

> Here the vector is serialized and deserialized as a value. How is it
> that "tracking" or "implementation" settings for a particular type
> can be affected by the mere presence of this code.

A minor correction - "implementation_level" is not affected. Tracking
is affected if it has been set to "track_selectively".

Do you mean how is something like this implemented? The short
answer is that

T *t
...
ar << t

instantiates code which refers to a special static object which "remembers"
that this code has been invoked. This static object is initialized at
pre-main time. So if anywhere else in the program one invokes

T t
...
ar << t

tracking is turned on for this operation.

Note that this has the counterintuitive effect of seeming to "look ahead"
in the code.
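
A minimal sketch of this kind of pre-main registration, using only the standard library, might look as follows. It is not the library's actual implementation: `saved_through_pointer`, `pointer_save_marker`, `save_pointer`, `is_tracked`, `Widget`, `Gadget` and `unused_code_path` are all hypothetical names chosen for illustration. It also assumes the usual behaviour of mainstream compilers, which run such static initializers before main().

```cpp
#include <ostream>

// One flag per type T (hypothetical stand-in for the library's bookkeeping).
template <class T>
bool& saved_through_pointer() {
    static bool flag = false;
    return flag;
}

// A static object per type. Its initializer sets the flag for T.
template <class T>
struct pointer_save_marker {
    static const bool registered;
};
template <class T>
const bool pointer_save_marker<T>::registered =
    (saved_through_pointer<T>() = true);

// Saving through a pointer refers to the static object, which forces the
// compiler to instantiate it and emit its initializer.
template <class T>
void save_pointer(std::ostream& os, const T* p) {
    (void)&pointer_save_marker<T>::registered;  // the reference that matters
    os << (p ? "object " : "null ");
}

// A by-value save could consult the flag to decide whether to track.
template <class T>
bool is_tracked() {
    return saved_through_pointer<T>();
}

struct Widget { int x; };
struct Gadget { int x; };

// Never called. Its mere presence instantiates save_pointer<Widget>.
void unused_code_path(std::ostream& os, const Widget* w) {
    save_pointer(os, w);
}
```

Under these assumptions `is_tracked<Widget>()` is already true at the start of main(), even though `unused_code_path` never runs, while `is_tracked<Gadget>()` stays false because no pointer save for `Gadget` appears anywhere in the program.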

> (2)
> ar << pVec; // pVec is a vector<char> *
>
> , or this code
>
> (3)
> ar >> pVec; // pVec is a vector<char> *
>
> , in a completely different part of the program?

> Snippet (2), if it appears in a program without snippet (3) also
> occurring, makes the program incorrect,

No it doesn't. The program is still correct. A valid
archive is created. This program or any other can
read this archive without problem by including (3). What
cannot be done is

> ar >> Vec; // Vec is a vector<char> (note missing *)

since saving and loading types must match.
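
Why a mismatched load can fail silently can be shown with a toy format, again using only the standard library. The layout here is an assumption made for illustration and is not what Boost.Serialization writes: a value is stored as a count followed by its elements, and a pointer save prepends a hypothetical object id. All function names are invented for the sketch.

```cpp
#include <cstddef>
#include <sstream>
#include <vector>

// Toy value format: "<count> <elements...>"
void save_value(std::ostream& os, const std::vector<int>& v) {
    os << v.size() << ' ';
    for (int x : v) os << x << ' ';
}

// Toy pointer format: a hypothetical object id, then the value format.
void save_pointer(std::ostream& os, const std::vector<int>* p, int object_id) {
    os << object_id << ' ';
    save_value(os, *p);
}

std::vector<int> load_value(std::istream& is) {
    std::size_t n = 0;
    is >> n;
    std::vector<int> v(n);
    for (int& x : v) is >> x;
    return v;
}

// Returns the pointee by value to keep the sketch short.
std::vector<int> load_pointer(std::istream& is) {
    int object_id = 0;
    is >> object_id;  // consume the extra field written by save_pointer
    return load_value(is);
}

// Save through a pointer, then load either as a pointer or as a value.
bool round_trips(bool load_as_pointer) {
    const std::vector<int> original{7, 8, 9};
    std::stringstream ss;
    save_pointer(ss, &original, 0);
    const std::vector<int> loaded =
        load_as_pointer ? load_pointer(ss) : load_value(ss);
    return loaded == original;
}
```

Here `round_trips(true)` succeeds. With `round_trips(false)` the value loader takes the object id for the element count and returns an empty vector, and no error is raised.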

> And even if snippet (3) is also present somewhere in the program,
> thus making the program correct in your view, IIUC the archives
> produced in (1) will then have a different format than those
> produced, when snippets (2) and (3) were absent. That would imply
> that archive formats can change silently, through mere addition of
> source code, anywhere else in the program. And the format change
> wouldn't be noticed until some *other* program tries to read the
> archives and fails to do so. Have I understood that correctly?

Sort of. The format change will be noticed when someone tries
to load a type different from the type saved. In this case, loading
a non-pointer type where a pointer type was saved.

Just remember this: each load must use the exact same type
as its corresponding save.

If you follow this rule, you will never have a problem of this nature.

Robert Ramey

