Subject: Re: [boost] boost serialization backwards compatibility of binary archives broken
From: Markus Henschel (markus.henschel_at_[hidden])
Date: 2013-02-21 07:19:25


> -----Original Message-----
> From: Boost [mailto:boost-bounces_at_[hidden]] On Behalf Of Robert
> Ramey
> Sent: Mittwoch, 20. Februar 2013 23:01
> To: boost_at_[hidden]
> Subject: Re: [boost] boost serialization backwards compatibility of binary
> archives broken
>
> Markus Henschel wrote:
> > Hello,
> >
> > in its current state boost serialization isn't able to read binary
> > archives from older versions. The documentation states that there is a
> > problem with version 1.42-1.44 but this is not the whole story. My
> > archives from 1.34 don't work either. There are currently open bugs
> > for this:
> >
> > https://svn.boost.org/trac/boost/ticket/4903
> > https://svn.boost.org/trac/boost/ticket/5567
> > https://svn.boost.org/trac/boost/ticket/4660
> >
> > This situation has been like this for a very long time now. Comments
> > of users even suggest not using binary archives or switching to
> > another serialization library because of maintenance problems. I'm
> > posting to this list as it seems like no one cares about these bug
> > reports anymore. The least that should be done, I think, is to
> > update the release notes. There are even patches in the bug reports
> > that provide fixes although I don't know if they work for all
> > situations. The current situation is quite uncomfortable for me as I
> > have to patch every new boost version to get serialization working.
> >
> > What can be done to improve this situation?
>
> I've left these open because I don't have a definitive fix.
>
> This came about when I made some changes just to eliminate some warnings
> emitted by some compilers. I considered these changes inconsequential
> (just fixing warnings after all) so I didn't bother incrementing the serialization
> library layout index in the header record. Well, fixing these warnings had
> some unintended side effects which changed the archive format. This
> prevented reliable loading of binary archives from previous versions. Of course
> it took a while to see where this was coming from, so even another version
> went by - this time with a number.
> Serialization library testing wasn't sufficiently exhaustive to detect this in time
> and it took a while to pin down. Naturally I expect to be much more careful
> in the future.
>
> I'm very much aware that this creates a problem for some users but I've
> concluded that I can't make any change in the library that will really fix it.
>
> It's not totally hopeless however. Would the following procedure work for
> you?
>
> a) using boost 1.34 make a small program which
> i) loads the binary archives into memory using boost 1.34
> ii) saves the data loaded above using text_archive or
> portable_binary_archive.
>
> b) using boost 1.49+ make a small program which
> i) loads the text_archive created in step a) above
> ii) saves the data loaded above to a new binary_archive.
>
> I believe this would permanently fix the problem.
>
> And I promise to be more careful in the future.
>
> A couple of misc notes on this subject.
>
> a) this situation occurred because I made a mistake while updating the boost
> serialization library. But it could occur from other sources - for example a
> compiler upgrade which changes the properties of some primitive datatype.
> So I would recommend using text or portable binary archives for storage to
> disk and reserving binary_archives for temporary archives.
>
> b) I would like to enhance and make more robust the archive aspects of the
> boost serialization library but I'm a little "gun-shy" after this experience.
> The main problem is that I don't have a good set of tests for backward
> compatibility, making such tests would require a non-trivial effort, and I feel
> the serialization library already places a disproportionately large burden on
> the boost testing infrastructure.
>
> This discourages me from adding some necessary enhancements like yaml,
> json archives and archives which create xml dtd/schema to permit editing
> with xml tools and archives which save/load to/from GUIs like mfc, qt,
> wxWindows, html5, etc.
>
> I do maintain the library and have relatively few issues because it's not hard
> to do. Unfortunately, the best I can do for your issue is what I've suggested
> above.
>
> I hope you appreciate that it is embarrassing for me to confess to all this.
>
> Robert Ramey

Thank you for explaining this. The current documentation makes it very clear why archives from 1.42-1.44 cannot be loaded: the archive format changed but the library version didn't, so there is no way to know which version an archive actually comes from. Nothing can be done about this other than the workarounds you suggested. I also understand that binary archives strongly depend on the compiler and its settings, and that there is no guarantee of compatibility between different compiler versions (although we have had no problems so far). I am considering switching to a different archive type anyway.

But that doesn't explain why binary archives created by boost serialization versions prior to 1.42 don't work anymore. I spent some hours with the svn logs and it seems like there is a bug in the code that tries to fix the compatibility issues in:
http://svn.boost.org/svn/boost/trunk/boost/archive/basic_binary_iarchive.hpp

Please have a look at change 64156: "Fix? for error in library version 6 - version types and class id types"

From what I can tell this introduces a bug that breaks compatibility with older archives:

void load_override(class_id_type & t, int version){
    library_version_type lvt = this->get_library_version();
    if(boost::archive::library_version_type(7) < lvt){
        this->detail_common_iarchive::load_override(t, version);
    }
    else
    if(boost::archive::library_version_type(6) < lvt){
        int_least16_t x=0;
        * this->This() >> x;
        t = boost::archive::class_id_type(x);
    }
    else{
        int x=0;
        * this->This() >> x;
        t = boost::archive::class_id_type(x);
    }
}

It basically says that for archive versions smaller than 7 class ids should be read as int. I think you did this because you changed the typedef for class_id_type from int to int_least16_t. But you missed that basic_binary_oarchive.hpp already had a save_override for the class id type, which wrote it as int_least16_t:
    void save_override(const class_id_type & t, int){
        // upto 32K classes
        int_least16_t x = t.t;
        * this->This() << x;
    }
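
To illustrate why the width matters, here is a minimal, self-contained sketch. It is not the Boost.Serialization API: ToyBuffer, make_old_style_record and next_field_after_id are hypothetical names, and std::int16_t / std::int32_t stand in for int_least16_t and int on the assumption of a typical platform where those are 2 and 4 bytes. It writes a class id as 16 bits and then reads it back with either the matching width or the wider one:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Toy stand-in for a binary archive: a flat byte buffer plus a read cursor.
// Hypothetical type, not part of Boost.
struct ToyBuffer {
    std::vector<unsigned char> bytes;
    std::size_t pos = 0;

    // Append the raw bytes of value, as a native binary archive would.
    template <class T>
    void write(const T& value) {
        const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
        bytes.insert(bytes.end(), p, p + sizeof(T));
    }

    // Consume sizeof(T) raw bytes from the cursor and reinterpret them as T.
    template <class T>
    T read() {
        assert(pos + sizeof(T) <= bytes.size());
        T value;
        std::memcpy(&value, bytes.data() + pos, sizeof(T));
        pos += sizeof(T);
        return value;
    }
};

// Write a class id as 16 bits (as the old save_override did), followed by
// two 32-bit fields standing in for whatever comes next in the archive.
inline ToyBuffer make_old_style_record(std::int16_t class_id,
                                       std::int32_t next_field,
                                       std::int32_t trailing_field) {
    ToyBuffer buf;
    buf.write(class_id);
    buf.write(next_field);
    buf.write(trailing_field);
    return buf;
}

// Read the class id using IdType, then return the field that follows it.
// buf is taken by value so each call starts from the beginning.
template <class IdType>
std::int32_t next_field_after_id(ToyBuffer buf) {
    (void)buf.read<IdType>();
    return buf.read<std::int32_t>();
}
```

With a record built by make_old_style_record, next_field_after_id<std::int16_t> returns the field that was written, while next_field_after_id<std::int32_t> consumes two bytes too many for the class id and returns a value assembled from the wrong bytes. Everything after that point in a real archive would be misread in the same way.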

I patched basic_binary_iarchive.hpp to look like this:

void load_override(class_id_type & t, int version){
    library_version_type lvt = this->get_library_version();
    if(boost::archive::library_version_type(7) < lvt){
        this->detail_common_iarchive::load_override(t, version);
    }
    else{
        int_least16_t x=0;
        * this->This() >> x;
        t = boost::archive::class_id_type(x);
    }
}

This works perfectly for me. I can load archives from 1.34 and from current versions. Is this fix wrong for any reason? If so, why?

Thanks,

Markus

P.S.:
I really appreciate your honesty. I think you have no reason to be embarrassed because of a bug; bugs are unavoidable. But what I find strange is that you leave the bug reports open. If you have the impression they cannot be fixed, you could have just closed them with a comment saying that nothing can be done.

