Subject: [boost] [serialization] automation of class version sanity checking
From: troy d. straszheim (troy_at_[hidden])
Date: 2009-03-04 09:39:36


Hey (Robert),

Use case. It is possible (and does happen) that a user tries to
deserialize version N+1 of a class C with software that only understands
versions up to N. What typically happens is that C appears to
deserialize OK, but it has eaten too many (or few) bytes from the
archive. This means that the next class that gets loaded from the
archive will fail in some unpredictable way.

To deal with this we've started adding the following to all of our classes:

struct C
{
   double x; // this was present in version 0
   double y; // this arrived with version 1

   template <typename Archive>
   void
   serialize(Archive& ar, const unsigned int version);
};

BOOST_CLASS_VERSION(C, 1);

template <typename Archive>
void
C::serialize(Archive& ar, const unsigned int version)
{
   if (version > 1)
     throw unsupported_version(version, 1);

   ar & x;              // present in every version
   if (version >= 1)
     ar & y;            // only present from version 1 onward
}

(Note the two arguments to unsupported_version: the version found in the
archive and the greatest version this code knows about.)

This has the pleasant side effect of often catching errors in *other*
classes. E.g. if I'm trying to deserialize C and that exception is
thrown, I can generate a message "trying to read version 3243856293 of
class C, but I only know about versions 0-1". Now I know that something
has gone quite wrong, for instance someone has removed a member from my
enclosing class without correctly incrementing and handling the version
there, and what I've been given as 'version' is garbage. If the message
is instead "trying to read version 4 of class C but I only know about
versions 0-3", it looks like I simply need to update my software,
nothing more sinister. This greatly simplifies these situations, which
otherwise could spiral into a long, painful back-and-forth over a
segfault in a serialization routine involving questions about what
software the user has, where the data came from, what software was used
to write it, etc.
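
A minimal sketch of what such a two-argument exception might look like
follows. It is not the exception that currently exists in the library; the
accessor names attempted() and max_known(), the helper check_version(), and
the exact message wording are all assumptions made for illustration.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical exception carrying both versions (not the library's own).
class unsupported_version : public std::runtime_error
{
public:
   unsupported_version(unsigned int attempted, unsigned int max_known)
     : std::runtime_error(make_message(attempted, max_known)),
       attempted_(attempted), max_known_(max_known)
   {
      // Only meaningful when the archive is newer than the code.
      assert(attempted > max_known);
   }

   unsigned int attempted() const { return attempted_; } // version in archive
   unsigned int max_known() const { return max_known_; } // newest known here

private:
   static std::string
   make_message(unsigned int attempted, unsigned int max_known)
   {
      std::ostringstream os;
      os << "trying to read version " << attempted
         << ", but only versions 0-" << max_known << " are known";
      return os.str();
   }

   unsigned int attempted_;
   unsigned int max_known_;
};

// Hypothetical helper: throws if the archive's version is newer than
// anything this code understands, otherwise does nothing.
inline void
check_version(unsigned int version, unsigned int max_known)
{
   if (version > max_known)
      throw unsupported_version(version, max_known);
}
```

A handler can then compare attempted() with max_known(): a gap of one or
two suggests out-of-date software, while a huge value suggests the archive
stream is already out of sync.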

So:
1. I can't think of a situation where having file_version greater than
  BOOST_CLASS_VERSION isn't a catastrophic error.

1a. It'd be nice to configurably auto-check for this kind of versioning
error.

2. As far as I can tell, these checks could be done cleanly inside the
library, in serialize_adl() (?).

3. We don't actually use the library's existing unsupported_version
exception, which takes no arguments and is thrown from serialization of
variant (I wrote
that and now find it confusing). I think the variant error should be
specific to variant (or just totally generic), and unsupported_version
should be used for the case currently under consideration... and to
enable the 'pleasant side effect' above, I think the exception should
carry the greatest version known to the code and the attempted-to-read
version.
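
To make points 1a and 2 concrete, here is a rough sketch of a check done
once, in a wrapper, instead of in every serialize(). Nothing here is the
library's real code: class_version<T> is a stand-in for whatever
BOOST_CLASS_VERSION records, checked_serialize() stands in for the place
inside the library where the check might live, and toy_iarchive is a toy
archive so the example is self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Stand-in for the per-class version that BOOST_CLASS_VERSION would record.
template <typename T>
struct class_version { static const unsigned int value = 0; };

// Toy input archive: hands out doubles from a vector, in order.
struct toy_iarchive
{
   explicit toy_iarchive(const std::vector<double>& d) : data(d), pos(0) { }

   toy_iarchive& operator&(double& d)
   {
      assert(pos < data.size()); // reading past the end means we are out of sync
      d = data[pos++];
      return *this;
   }

   std::vector<double> data;
   std::size_t pos;
};

struct C
{
   double x; // present since version 0
   double y; // added in version 1

   template <typename Archive>
   void serialize(Archive& ar, const unsigned int version)
   {
      ar & x;
      if (version >= 1)
         ar & y;
   }
};

template <>
struct class_version<C> { static const unsigned int value = 1; };

// Hypothetical library-side wrapper. The check runs before any bytes are
// consumed, and can be switched off via the CheckVersion parameter.
template <bool CheckVersion = true, typename Archive, typename T>
void
checked_serialize(Archive& ar, T& t, const unsigned int file_version)
{
   if (CheckVersion && file_version > class_version<T>::value)
      throw std::runtime_error("file version is newer than class version");
   t.serialize(ar, file_version);
}
```

Because the check happens before serialize() is called, the archive
position is untouched when the exception is thrown. User classes would not
need to change at all under this scheme.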

Sanity check, please; say so if you want me to try implementing this...

-t

