Boost Users :

Subject: Re: [Boost-users] [serialization] class versioning changes in boost 1.42
From: David Raulo (david.raulo_at_[hidden])
Date: 2010-02-22 16:15:32


On Mon, 22 Feb 2010 09:57:58 -0800
"Robert Ramey" <ramey_at_[hidden]> wrote:

> The change was made to make the system more robust and portable.
> Other changes were made to suppress warnings which indicated
> potential problems.

I understand that ints as such are not portable.

> Your system presumes that an int is 32 bits.

Well, in this day and age, we assumed it was at least 32 bits.
An unfortunate mistake...

> This assumption would
> make otherwise portable archives not portable to C implementations
> whose int is 16 bits. The version # was always assumed to be exactly
> that.

Are such platforms so widespread nowadays? And using boost libraries?
Besides, I would assume making version_type a typedef for uint32_t
instead would work everywhere, even if it induces some overhead on
16-bit CPUs?
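
To make the size concern concrete, here is a minimal sketch in standard
C++ (not Boost code) that checks whether a date-style version number such
as 20080517 fits in an unsigned integer of a given width. The helper name
fits_in_bits is hypothetical and exists only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if `value` is representable as an unsigned
// integer that is exactly `bits` wide (1 <= bits <= 64).
inline bool fits_in_bits(std::uint64_t value, unsigned bits) {
    assert(bits >= 1 && bits <= 64);
    if (bits == 64) return true;  // avoid shifting by the full width
    return value < (std::uint64_t(1) << bits);
}
```

With this helper, fits_in_bits(20080517, 16) is false and
fits_in_bits(20080517, 32) is true: a YYYYMMDD number of this size needs
25 bits.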

> I didn't anticipate that this might be overloaded with some
> other data - like a date. Had this occurred to me, I would have
> advised against doing this in the documentation.

I would not call our scheme "overloading". We still use these as
version numbers, increasing with each new alteration of the classes,
independently for each class. Instead of dates, we could have used
code-wide version numbers, à la Subversion. All we needed was numbers
representing a single point in time globally for all of our classes.
Using this kind of numbering has several maintenance advantages, or so
I think.

For example, our early releases did not use versioning at all, since
this simply was not a requirement initially. So we have old archives
specifying 0 for the version of all their classes. A couple of releases
later, we introduced backward compatibility at our clients' demand (some
of them clung to old releases just to be able to reuse their old
archives). To this end, our load() methods contain code similar to
this:

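  template<class Archive>
  void load(Archive& ar, SomeClass& o, const unsigned int v) {
    unsigned int version = v;
    // Archives written before we used versioning report 0 for every
    // class; fall back to the release date of the writing software.
    if (version == 0)
      version = archive_creator_release_version;
    // Special case for archives written within this date range.
    if (version >= 20080517 && version <= 20080903)
    ...
  }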

where archive_creator_release_version is the date at which we released
the software that generated the archive we're trying to reload (this is
saved at the beginning of every archive). To write the compatibility
reloading code, we only need to look at our code repository history,
and add some of these special-cases to our load() methods.
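
The fallback described above can be isolated into two small functions. This
is a minimal sketch in standard C++, not the actual load() code: the names
effective_version and in_2008_special_case are hypothetical, and the date
range is simply the one from the snippet above.

```cpp
#include <cassert>

// Date range (YYYYMMDD, inclusive) taken from the example above.
const unsigned int kRangeFirst = 20080517;
const unsigned int kRangeLast = 20080903;

// Hypothetical helper: archives written before versioning was introduced
// report 0 for every class, so fall back to the release date recorded at
// the beginning of the archive.
inline unsigned int effective_version(unsigned int class_version,
                                      unsigned int archive_release_date) {
    // Assumption: an unversioned archive always carries a release date.
    assert(class_version != 0 || archive_release_date != 0);
    return class_version == 0 ? archive_release_date : class_version;
}

// Hypothetical helper: does this version need the special-case path?
inline bool in_2008_special_case(unsigned int version) {
    return version >= kRangeFirst && version <= kRangeLast;
}
```

An unversioned archive written by a release dated 20080601 resolves to
20080601 and takes the special-case path, while a class saved with an
explicit version of 20090101 keeps that version and does not.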

If we had used "classical" version numbers, this would have been very
difficult to do (at the very least, we would need to store a map of the
versions used for each class for each of our releases, and use that
when encountering an archive with a 0 version). And a single mistake in
a release would have been a lot more difficult to fix afterwards. What
do you do when you discover that there are in fact archives which used
a variant of SomeClass between version 4 and 5?
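
For comparison, the per-release table that "classical" version numbers
would require might look like the sketch below. Everything in it is
hypothetical (release ids, class names, version values); it only
illustrates the extra bookkeeping, not any existing code.

```cpp
#include <cassert>
#include <map>
#include <string>

// release id -> (class name -> class version shipped in that release)
typedef std::map<std::string, unsigned int> ClassVersions;
typedef std::map<unsigned int, ClassVersions> ReleaseTable;

// Hypothetical contents: one entry per class for every release ever shipped.
inline ReleaseTable make_release_table() {
    ReleaseTable table;
    table[1]["SomeClass"] = 1;
    table[1]["OtherClass"] = 1;
    table[2]["SomeClass"] = 3;
    table[2]["OtherClass"] = 1;
    return table;
}

// Version to assume for a class stored with version 0 by the given release.
// Returns 0 when the release or the class is not in the table.
inline unsigned int lookup_version(const ReleaseTable& table,
                                   unsigned int release,
                                   const std::string& class_name) {
    assert(!class_name.empty());
    ReleaseTable::const_iterator r = table.find(release);
    if (r == table.end()) return 0;
    ClassVersions::const_iterator c = r->second.find(class_name);
    return c == r->second.end() ? 0 : c->second;
}
```

Every release adds a row that has to be kept correct by hand, whereas a
release date is a single number that orders all classes at once.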

Anyway, using the release date as the default version for all classes
when reading back old archives proved very useful. There are other
cases where using dates as versions helped maintain our code, but this
message is too long already, and I'm not sure I explained the first
case clearly enough ...

> I realise that this is of cold comfort to you. Here are some ideas.
>
> a) You could tweak your copy of the library to use a 32 bit integer.

Doable, but it would impose difficult constraints. We'd really prefer that
our clients be able to link to both official boost releases, and our
libraries at the same time. We link dynamically to boost for that
reason.

> b) in the longer run, you could decline to use the versioning built
> into the system and substitute your own based on dates. This would
> entail tweaking all your serialization implementations to save your
> "date version" as part of the data.

That would be an awful lot of code rewriting, and would not help
with reading back old archives unfortunately.
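
A minimal sketch of what option (b) implies, using plain standard streams
as a stand-in for a real archive (this is not Boost.Serialization code).
The struct layout, the constants, and the function names are all
hypothetical.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical class: field b is assumed to have been added on 20080904.
struct SomeClass { int a; int b; };

const unsigned int kCurrentDateVersion = 20100222;  // hypothetical
const unsigned int kFieldBAdded = 20080904;         // hypothetical

// The "date version" is written as the first piece of the object's data.
inline void save(std::ostream& os, const SomeClass& o) {
    assert(kCurrentDateVersion >= kFieldBAdded);  // writer always emits b
    os << kCurrentDateVersion << ' ' << o.a << ' ' << o.b << ' ';
}

// Reads the date version first, then decides which fields to expect.
inline bool load(std::istream& is, SomeClass& o) {
    unsigned int date_version = 0;
    if (!(is >> date_version >> o.a)) return false;
    if (date_version >= kFieldBAdded) {
        if (!(is >> o.b)) return false;
    } else {
        o.b = 0;  // field did not exist yet; use a default
    }
    return true;
}
```

Existing archives do not contain this leading field, which is why the
approach does not help with reading them back.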

> Sorry I can't be of more help. Moral of the story is "Overloading
> data leads to unanticipated consequences"

I did not realize the unsigned int version was intended to be 8 bits,
or I would not have done that. Too late now. Hopefully there is
another way out that we have not thought of yet...

Thanks for your response in any case,

-- 
david
