Subject: Re: [Boost-users] [serialization] class versioning changes in boost 1.42
From: Robert Ramey (ramey_at_[hidden])
Date: 2010-02-25 12:18:38


Jarl Lindrud wrote:
>> The version # was envisioned as a small integer. All the examples,
>> tests and demos used this. The problem comes about because it was
>> unanticipated that someone would like to include actual data (i.e. a
>> date in this case) in a version #. Note that this is the first time
>> in 9 years that this has come up. So I think it's a little much to
>> characterize it as a bug in the library. It would better be called
>> an unanticipated usage of the version #.
>
> IIUC, in 1.41.0 and earlier, the version number was an int. In
> 1.42.0, it is now 16 bits, which is a breaking change on just about
> every platform.

The version # has always been 16 bits. The binary archive has
always stored 16 bits for the version #. The code used an int -
whose size varies from 16 to 64 bits depending on the platform.
Text archives convert the int to a string, and this conversion doesn't
trap when the number exceeds 16 bits.
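
To make the mismatch concrete, here is a minimal sketch in plain C++
(no Boost involved). The two functions are hypothetical stand-ins for
a binary archive that keeps 16 bits and a text archive that writes the
number as a string; the value 20100225 is only an illustrative
date-like number, not the one from the original report.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical stand-in for a binary archive: the version is stored
// in a 16 bit field, so only the low 16 bits survive.
inline unsigned int roundtrip_binary16(unsigned int version) {
    std::uint16_t stored = static_cast<std::uint16_t>(version);
    return stored;
}

// Hypothetical stand-in for a text archive: the version is written as
// decimal text and parsed back, with no range check at all.
inline unsigned int roundtrip_text(unsigned int version) {
    std::ostringstream out;
    out << version;
    std::istringstream in(out.str());
    unsigned int loaded = 0;
    in >> loaded;
    return loaded;
}
```

With a small version such as 7 both functions return the input. With
20100225u the text round trip still returns the input, while the 16 bit
round trip returns only the low 16 bits, which is why the problem
stayed hidden as long as only text archives were used.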

> The responsibility of dealing with this archive
> format change surely lies with Boost.Serialization itself?

There is no format change in the library.

> Or do
> Boost.Serialization users need to know that archives they write are
> not necessarily readable by later versions?

Hmmm - storing a 32 bit integer in a field saved as a 16 bit
value (binary_archive) is not a good idea. I recognize that it
was not obvious when one did that, and that it could work in
some cases - such as this user's. That's exactly what the level
4 warning was telling me. So I fixed the code to suppress the
warning - and here we are.

> I can't see much middle ground here - either you're backwards
> compatible, or you're not.

lol - no question about that.

>>> If not, how do you make changes
>>> to the archive format (e.g. the change David found in 1.42.0)
>>> without breaking old archives?
>>
>> This is described in the documentation. The version # is maintained
>> on a class basis and is completely independent of any other number
>> such as program or boost version. A little reflection should make
>> it clear why it pretty much has to be this way.

> I'm talking about changes within Boost.Serialization itself, not
> changes to user-defined types. The 32-bit-to-16-bit change that
> triggered this discussion, is a good example. How will
> Boost.Serialization in the future, know whether to read a 16 or 32
> bit version number, from an archive?

In this particular case, the situation is not that bad. This particular
code has only been tested with text archives. (It would break
immediately with binary ones). So the only issue is what
size the version # should be read into. Even here it's a specific
case, as on a machine with a 16 bit int the user's code would
have already failed. I'm still thinking about this, but I can
see that reading the version # into an int rather than an int_least16_t
would solve his problem - though it wouldn't address the
other issues I've mentioned. I'll consider this for version 1.43.
This would permit him to load old archives.

1.42 will trap when a version # exceeds 16 bits. I wouldn't
expect this to change though. So the problem of how to
use the version # will have to be dealt with.
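
As an illustration of what "trapping" means here, the following is a
rough sketch of a checked narrowing in plain C++. It is not the code
in 1.42; the function name and the choice of exception are assumptions
made for the example.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical helper: refuse any version that does not fit in 16 bits
// instead of silently dropping the high bits.
inline std::uint16_t checked_version(unsigned int version) {
    if (version > std::numeric_limits<std::uint16_t>::max()) {
        throw std::out_of_range("class version does not fit in 16 bits");
    }
    return static_cast<std::uint16_t>(version);
}
```

A small version such as 3 passes through unchanged, while something
like 70000 throws at the point where the version is recorded rather
than producing an archive that cannot be read back correctly.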

> If it always reads a 16 bit
> version number, then you've broken compatibility with all pre-1.42.0
> archives. If it always reads a 32 bit version number, then you've
> broken compatibility with 1.42.0.
>
> To deal with this, you really need to know which version of Boost was
> used to create the archive.

There is a mechanism for addressing these kinds of issues - it's the
library version # as described in the documentation. So far, that #
is up to 4.
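
For readers unfamiliar with the idea, here is a sketch of how a
library version # stored at the front of an archive could let a reader
choose how wide a later field is. Everything in it is hypothetical:
the byte layout, the helper names and the threshold of 5 are invented
for the example and do not describe the actual archive format.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Decode `width` bytes as a little-endian unsigned value.
inline std::uint32_t read_le(const std::vector<std::uint8_t>& bytes,
                             std::size_t offset, std::size_t width) {
    if (offset + width > bytes.size()) {
        throw std::out_of_range("truncated archive");
    }
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < width; ++i) {
        value |= static_cast<std::uint32_t>(bytes[offset + i]) << (8 * i);
    }
    return value;
}

// Hypothetical threshold: library versions from this one on are assumed
// to store the class version in 32 bits, earlier ones in 16 bits.
const std::uint32_t kWideVersionSince = 5;

// Assumed toy layout: 2 bytes of library version, then the class version.
inline std::uint32_t read_class_version(const std::vector<std::uint8_t>& archive) {
    const std::uint32_t library_version = read_le(archive, 0, 2);
    const std::size_t width = (library_version >= kWideVersionSince) ? 4 : 2;
    return read_le(archive, 2, width);
}
```

The point is only that the reader consults the library version # from
the archive itself before deciding how to interpret what follows, so it
does not need to know which Boost release wrote the file.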

>
>>>
>>> How would an application ever be able to exchange data with older,
>>> deployed, versions of itself, without this capability?
>>
>> Again, a little reflection will make it clear that an older version
>> of a program can't anticipate changes in a subsequent version. I'm
>> sorry - it's just logically not possible. Think about it.
>
> Do you realize that e.g. Microsoft Word 2007 can be instructed to
> save files in such a way that they can be loaded with Word 2003? What
> is logically impossible about that?

Can Microsoft Word 2003 load files created with Microsoft Word 2007?
That is what we're talking about here.

The question of being able to create previous versions has been
discussed. In fact, there is a section of the documentation in
which this is discussed as a possible extension. It wouldn't be
all that hard to implement - but no one has shown any interest
in doing it.
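
To show the general shape of such an extension, here is a toy sketch
in plain C++. It is not a feature of the library; the Record type, the
text layout and the function names are all hypothetical. The class
version is a small integer, the loader branches on it, and the writer
can be asked to produce the older layout.

```cpp
#include <sstream>
#include <string>

// Hypothetical user type. `note` is assumed to have been added in
// class version 1; class version 0 only had `id`.
struct Record {
    int id = 0;
    std::string note;
};

// Write the record using the layout of `target_version` (0 or 1).
// Fields that the target version does not know about are left out.
inline std::string save_as(const Record& r, unsigned int target_version) {
    std::ostringstream out;
    out << target_version << ' ' << r.id;
    if (target_version >= 1) {
        out << ' ' << r.note;  // toy format: note must be a single word
    }
    return out.str();
}

// Read the stored class version first, then load only the fields that
// this version is known to contain.
inline Record load(const std::string& text) {
    std::istringstream in(text);
    unsigned int version = 0;
    Record r;
    in >> version >> r.id;
    if (version >= 1) {
        in >> r.note;
    }
    return r;
}
```

A newer program can read both layouts through load, and an older
program that only knows version 0 can read what the newer one writes
with save_as(r, 0). What the older program still cannot do is read a
version 1 archive, which is the case under discussion.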

>>> Your response to David seemed to be essentially "too bad, maybe you
>>> can find a way around it yourself", so I can't see that (1) is being
>>> taken very seriously.
>>
>> If I had an easy answer, honestly I would share it. Really. I
>> don't. Sorry.
>>
>
> Fair enough, but then it should be stated clearly in the
> documentation: "Archives created by one version of
> Boost.Serialization are *not* guaranteed to be readable by subsequent
> versions of Boost.Serialization.".

Hmmm - I might be willing to say
a) that the intention is to make such a guarantee
b) and every effort has been made to that end
c) and that every attempt has been made to anticipate the
usage of the library
d) and that the library has been in usage for many years
e) and that versioning is a widely used facility
f) that has had very few problems from users
g) and that continual efforts are being made to make that
guarantee stronger
h) but that it's possible that there is something I haven't
anticipated which will create a problem.

But I suppose that goes without saying.

>>> Of course... The point is that with a robust versioning scheme in
>>> place, archive format changes can be implemented without breaking
>>> older software.
>>
>> A robust (and efficient) versioning scheme has been in
>> place since the beginning. It was never designed to be able to hold
>> extra data. It's unfortunate that I didn't trap such an unintended
>> usage. I try really hard - but I haven't been able to trap every
>> case where something is used in a way that doesn't occur to me.
>>
>
> How can you call it robust? It is evidently not providing
> compatibility in either the backwards, or the forwards, direction.

Honestly, I can't help but wonder if you've read the documentation
or used the library.

Robert Ramey
