From: brangdon_at_[hidden]
Date: 2001-12-07 09:57:13


In-Reply-To: <9un3ec+ljqf_at_[hidden]>
On Thu, 06 Dec 2001 06:32:44 -0000 emacsuser (f-boost_at_[hidden]) wrote:
> So but anyhow... those of you who feel that you have or might have
> need of an object serialization system, please speak up with your
> requirements.

* It should provide for evolution of the data structure.

This is hard to do well, and potentially open-ended. Most of my experience
is with Microsoft's MFC, which gets it very wrong. For example, MFC
identifies class versions with schema numbers, but stores one schema per
class instead of one per object. This means base classes can't evolve
independently of derived classes.
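
One way to let a base class evolve on its own is for every class in the hierarchy to write its own version tag in front of its own fields. The following is only a sketch under assumptions: `Archive`, the `flags` field and the `current_version` constants are hypothetical names for illustration, not MFC or Boost API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical in-memory archive: a flat sequence of ints.
struct Archive {
    std::vector<int> data;
    std::size_t pos = 0;
    void write(int v) { data.push_back(v); }
    int read() { assert(pos < data.size()); return data[pos++]; }
};

struct Base {
    static constexpr int current_version = 2;  // version 2 added 'flags'
    int id = 0;
    int flags = 0;

    void save(Archive& ar) const {
        ar.write(current_version);  // Base's own tag, stored with each object
        ar.write(id);
        ar.write(flags);
    }
    void load(Archive& ar) {
        const int version = ar.read();
        id = ar.read();
        flags = (version >= 2) ? ar.read() : 0;  // default for old records
    }
};

struct Derived : Base {
    static constexpr int current_version = 1;
    int extra = 0;

    void save(Archive& ar) const {
        Base::save(ar);
        ar.write(current_version);  // Derived's tag, independent of Base's
        ar.write(extra);
    }
    void load(Archive& ar) {
        Base::load(ar);
        const int version = ar.read();
        (void)version;  // only one Derived version exists so far
        extra = ar.read();
    }
};
```

Here Base moved from version 1 to version 2 without any change to Derived, and a record written while Base was at version 1 still loads. The cost is one tag per class per object, which is the space trade-off discussed further down.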

There also need to be facilities to mess with the objects when loading.
For example, I might start with a Circle class and later rename it to
Ellipse. I need a mechanism to convert Circles to Ellipses when loading
old files.
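
A rough sketch of such a mechanism, assuming a text format in which each object is stored as a class name followed by its fields: the loader looks the stored name up in a table, and the entry for the old name builds the new class. The format, `load_shape` and `make_loaders` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <functional>
#include <istream>
#include <map>
#include <memory>
#include <sstream>
#include <stdexcept>
#include <string>

struct Ellipse {
    double rx, ry;
};

// Hypothetical registry: class name as stored in the file -> loader.
using Loader = std::function<std::unique_ptr<Ellipse>(std::istream&)>;

std::map<std::string, Loader> make_loaders() {
    std::map<std::string, Loader> loaders;
    loaders["Ellipse"] = [](std::istream& in) -> std::unique_ptr<Ellipse> {
        std::unique_ptr<Ellipse> e(new Ellipse{0.0, 0.0});
        in >> e->rx >> e->ry;
        return e;
    };
    // Old files contain "Circle <radius>"; convert while loading.
    loaders["Circle"] = [](std::istream& in) -> std::unique_ptr<Ellipse> {
        double r = 0.0;
        in >> r;
        return std::unique_ptr<Ellipse>(new Ellipse{r, r});
    };
    return loaders;
}

std::unique_ptr<Ellipse> load_shape(std::istream& in) {
    static const std::map<std::string, Loader> loaders = make_loaders();
    std::string name;
    in >> name;
    auto it = loaders.find(name);
    if (it == loaders.end())
        throw std::runtime_error("unknown class: " + name);
    return it->second(in);
}

// Convenience wrapper for loading from a string.
std::unique_ptr<Ellipse> load_shape_from_text(const std::string& text) {
    std::istringstream in(text);
    return load_shape(in);
}
```

With this, `load_shape_from_text("Circle 2.5")` yields an Ellipse with both radii equal to 2.5, so the rest of the program never sees the retired class.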

Another example is reference counting. The MFC CArchive class holds
references to the objects it has loaded, but it doesn't know about
reference counting, so its references are uncounted. This naturally
causes problems.
There is no easy way to fix it.
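
For contrast, a loader designed with counting in mind could keep its table of loaded objects as counted pointers. This is a minimal sketch, not a fix for CArchive: `LoadedObjects` and `Node` are hypothetical, and `std::shared_ptr` stands in for whatever counted pointer the application uses.

```cpp
#include <cassert>
#include <map>
#include <memory>

struct Node {
    int value;
};

// Hypothetical loader-side table: object id in the file -> loaded object.
class LoadedObjects {
    std::map<int, std::shared_ptr<Node>> table_;
public:
    // Returns the object already loaded for 'id', or creates and records it.
    // The table's own reference is counted like any other.
    std::shared_ptr<Node> get_or_load(int id, int value) {
        auto it = table_.find(id);
        if (it != table_.end()) return it->second;
        auto obj = std::make_shared<Node>(Node{value});
        table_[id] = obj;
        return obj;
    }
};
```

Because the table holds a counted reference, an object stays alive while the loader still refers to it, and it is released normally once both the loader and the application have let go.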

* Const correctness

We should be able to stream-out const objects, and ideally stream them in
as well. This means that the operator>>() approach should not be required.
I should be able to write:

    class Derived: public Base {
        const int data;
    public:
        Derived( boost::iserialise &is ) :
            Base(is), data(is.read<int>()) {
        }
        //...
    };

    const Derived example(is);

or similar.
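
To make the idea concrete, here is a self-contained sketch of constructor-based loading. The `iserialise` class below is a hypothetical stand-in written for this example (it is not an existing Boost class), and `append` is a helper invented only to build test input. It reads raw bytes in native byte order, which a real archive would probably not do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical input archive reading raw bytes from a buffer.
class iserialise {
    std::vector<unsigned char> bytes_;
    std::size_t pos_ = 0;
public:
    explicit iserialise(std::vector<unsigned char> bytes)
        : bytes_(std::move(bytes)) {}

    // Returns the next value by value, so it can initialise a const member.
    template <typename T>
    T read() {
        assert(pos_ + sizeof(T) <= bytes_.size());
        T value;
        std::memcpy(&value, bytes_.data() + pos_, sizeof(T));
        pos_ += sizeof(T);
        return value;
    }
};

// Helper for building input: appends the raw bytes of 'value'.
template <typename T>
void append(std::vector<unsigned char>& out, const T& value) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
    out.insert(out.end(), p, p + sizeof(T));
}

class Base {
    const int id_;
public:
    explicit Base(iserialise& is) : id_(is.read<int>()) {}
    int id() const { return id_; }
};

class Derived : public Base {
    const int data_;
public:
    // Base is constructed first, so its fields are read first.
    explicit Derived(iserialise& is) : Base(is), data_(is.read<int>()) {}
    int data() const { return data_; }
};
```

The point of the design is that `read<T>()` returns a value instead of assigning into an existing object, so const members and const objects can be initialised directly from the archive.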

> * Doesn't waste bytes.

Alas, support for data evolution consumes bytes. The Java approach
consumes *lots* of bytes; it includes a fullish description of the class
layout, with the names of instance variables etc. This is probably going
too far. We can get a long way with just one extra byte per class.

If you really care about bytes, I have found it can be worth including
support for variable-length encoding of integers at the serialisation
level. For example, if an int holds the value 100 it can be stored in a
single byte rather than 4. Doing this at the serialisation level tends to
be quicker and give better results than compressing the archive as a
separate step. Often the variable-length encoding is quicker than a
fixed-length encoding, because it leaves fewer bytes to transfer around
later.
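
As an illustration of variable-length integers, the sketch below stores 7 bits per byte and uses the high bit to mean "more bytes follow". This particular scheme and the names `write_varint` and `read_varint` are assumptions chosen for the example; the text above does not say which encoding is meant. It handles unsigned values only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends 'value' using 7 data bits per byte, low-order group first.
// The high bit of each byte is set when another byte follows.
void write_varint(std::vector<std::uint8_t>& out, std::uint32_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
}

// Reads one value starting at 'pos' and advances 'pos' past it.
std::uint32_t read_varint(const std::vector<std::uint8_t>& in,
                          std::size_t& pos) {
    std::uint32_t value = 0;
    int shift = 0;
    for (;;) {
        assert(pos < in.size() && shift < 35);  // at most 5 bytes for 32 bits
        const std::uint8_t byte = in[pos++];
        value |= static_cast<std::uint32_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) return value;
        shift += 7;
    }
}
```

Under this scheme any value below 128, such as 100, takes one byte, while a full 32-bit value takes five, so the saving depends on small values being common.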

Dave Harris

