From: Robert Ramey (ramey_at_[hidden])
Date: 2002-06-03 10:16:18


Robert Ramey wrote:
> Fellow programmers:
>
> I have just uploaded the third release of my proposed serialization
> library.

Hi, Robert!
It's good to know that you continue working on the serialization library.
I tried it yesterday in several ways, and found only one significant
non-interface problem, but it might require some discussion. Below are my
observations.

1. Class registration and numbers.

Archives store class numbers, and the mapping from class names to numbers is
local to each archive. Moreover, class names are never written to the archive.
This causes problems with polymorphic classes.

Suppose I write a vector of polymorphic pointers. Each concrete class is
assigned a number, but it is necessary to ensure that the numbers will be the
same when reading. The only way to achieve this in the current system is

    out_ar << static_cast<Derived1*>(0) << ... << static_cast<Derived15*>(0);
    .....
    Derived1* d1;
    Derived15* d15;
    in_ar >> d1 >> ... >> d15;

Here, saving/loading of a dummy pointer creates a consistent numbering.
This approach is not optimal:
   - it is not clear who should store/load dummy pointers (if we don't want
     to do it manually each time we work with an archive).
   - it might not be possible to do this storing/loading in one place: it is
     possible that no piece of code knows the full list of derived classes.
   - it is hard to guarantee that dummy pointers are written in the same
     order, especially in different applications.
(Please see the attached program for illustration of what happens if ordering
is not consistent)
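
The ordering problem described above can be illustrated with a toy model. This is only a sketch under stated assumptions: ClassTable and decode_with_order are hypothetical names, not part of the proposed library, and the only assumption carried over from the discussion is that each class receives the next free number the first time it is seen and that only the number reaches the archive.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for an archive's class table (NOT the library's real code):
// each class gets the next free number the first time it is seen.
class ClassTable {
public:
    // Return the number for `name`, assigning a new one on first sight.
    int number_for(const std::string& name) {
        auto it = numbers_.find(name);
        if (it != numbers_.end()) return it->second;
        int n = static_cast<int>(names_.size());
        numbers_[name] = n;
        names_.push_back(name);
        return n;
    }
    // Look up the class that a stored number refers to on this side.
    const std::string& name_for(int number) const { return names_.at(number); }

private:
    std::map<std::string, int> numbers_;
    std::vector<std::string> names_;
};

// The writer registers Derived1 then Derived2 and stores a Derived2 object
// as a bare number. The reader registers classes in `reader_order` and
// decodes that number with its own table.
std::string decode_with_order(const std::vector<std::string>& reader_order) {
    ClassTable writer;
    writer.number_for("Derived1");
    int stored = writer.number_for("Derived2");  // only this number is archived

    ClassTable reader;
    for (const auto& name : reader_order) reader.number_for(name);
    return reader.name_for(stored);
}

// Same order on both sides decodes correctly; a swapped order silently
// selects the wrong class.
void check_ordering() {
    assert(decode_with_order({"Derived1", "Derived2"}) == "Derived2");
    assert(decode_with_order({"Derived2", "Derived1"}) == "Derived1");
}
```

In this model nothing in the archive reveals the mismatch: the reader simply constructs an object of the wrong class.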
   
So, I conclude that this approach is very problematic. I see several levels
at which serialization might work:
1) For one application
2) Between several applications on one platform
3) Between several applications on different platforms.

The current approach has problems with 1). I suggest that when a class is
written for the first time, its typeid(...).name() is also written. I believe
that MFC and (now forgotten) OWL (from Borland) used this approach. This will
make 1) and 2) work. As for 3), I'm not sure we can do anything without
portable type_info.
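
A minimal sketch of that suggestion, assuming a line-based text format invented here for illustration (NamedWriter and NamedReader are hypothetical, not library classes): the writer emits the class number and, on first occurrence only, the typeid name on the following line, so the reader can rebuild the table from the archive itself.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <typeinfo>
#include <vector>

struct Base { virtual ~Base() {} };
struct Derived1 : Base {};
struct Derived2 : Base {};

// Hypothetical writer: one line with the class number, followed by one line
// with the typeid name the first time that class appears.
class NamedWriter {
public:
    explicit NamedWriter(std::ostream& os) : os_(os) {}
    void write_class(const Base& obj) {
        const std::string name = typeid(obj).name();  // dynamic type of obj
        auto it = numbers_.find(name);
        if (it == numbers_.end()) {
            int n = static_cast<int>(numbers_.size());
            numbers_[name] = n;
            os_ << n << '\n' << name << '\n';
        } else {
            os_ << it->second << '\n';
        }
    }

private:
    std::ostream& os_;
    std::map<std::string, int> numbers_;
};

// Hypothetical reader: learns each name from the archive, so it does not
// depend on any registration order in the reading program.
class NamedReader {
public:
    explicit NamedReader(std::istream& is) : is_(is) {}
    std::string read_class() {
        std::string line;
        std::getline(is_, line);
        std::size_t n = std::stoul(line);
        assert(n <= names_.size());   // numbers are handed out sequentially
        if (n == names_.size()) {     // first occurrence: the name follows
            std::getline(is_, line);
            names_.push_back(line);
        }
        return names_.at(n);
    }

private:
    std::istream& is_;
    std::vector<std::string> names_;
};

// Write three objects, read them back, and compare the recovered names.
bool round_trip_matches() {
    std::stringstream archive;
    NamedWriter out(archive);
    Derived2 a; Derived1 b; Derived2 c;
    out.write_class(a);
    out.write_class(b);
    out.write_class(c);

    NamedReader in(archive);
    return in.read_class() == typeid(Derived2).name()
        && in.read_class() == typeid(Derived1).name()
        && in.read_class() == typeid(Derived2).name();
}
```

Because the strings returned by typeid(...).name() are implementation-defined, a scheme like this can only be expected to work between programs built with the same compiler, which matches the caveat about case 3).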

I believe that you are wrong here. I originally thought that I had to save the
typeid in the archive. I came to the conclusion that it was not necessary.
Not specifying it overcame a significant obstacle to archive portability.
 
The current system works for all known scenarios except circular pointer
references. This includes all the cases cited above.

The system is based on the fact that all components are saved and loaded
in exactly the same sequence. This will always be true unless an error
has been committed.

Also, as I understand, if a class never seen by the serialization code is
written, then type_info_extended::find will return 0 and assert will fail.
Prior versions had a way to register a class. I'd like to have that way again
in the library.

Writing a NULL pointer of the derived class to the archive is equivalent to
"registering" the class in the previous system.

2. archive_exception should really be derived from std::exception

I considered this and saw no benefit to doing so. How would this be an improvement?
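
One benefit usually given for such a derivation, sketched here under assumptions: generic code that knows nothing about archives can still catch and report the error through std::exception. The class body and error codes below are hypothetical and are not taken from the draft library.

```cpp
#include <cassert>
#include <exception>
#include <string>

// Hypothetical sketch; the draft library's archive_exception may differ.
class archive_exception : public std::exception {
public:
    enum exception_code { unregistered_class, stream_error };

    explicit archive_exception(exception_code c) : code(c) {}

    // Human-readable description, available through the base class.
    const char* what() const noexcept override {
        switch (code) {
            case unregistered_class: return "unregistered class";
            case stream_error:       return "stream error";
        }
        return "unknown archive error";
    }

    exception_code code;
};

// Generic code with no knowledge of archives: it handles any standard
// exception and reports its message.
std::string run_and_report(void (*operation)()) {
    try {
        operation();
    } catch (const std::exception& e) {
        return e.what();
    }
    return "ok";
}

// Stand-ins for archive operations that fail or succeed.
void failing_load() {
    throw archive_exception(archive_exception::unregistered_class);
}
void successful_load() {}

// The generic handler sees the archive error without naming its type.
void check_generic_handler() {
    assert(run_and_report(failing_load) == "unregistered class");
    assert(run_and_report(successful_load) == "ok");
}
```

Code that does care about archives can still catch archive_exception directly and inspect the code member.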

3. I'd prefer that the documentation describe the format of text archives. The
description is already in the code, so I guess it would be easy to add it to
the docs.

What benefit would this give us? It turns out that the text archives are
pretty much indecipherable to humans except for embedded strings. This is
due in large part to the filtering out of redundancy. I believe it would be very
hard to describe in English.

I hope that these issues can be resolved soon. I have some interface
criticism ready ;-)

- Volodya

Currently I have five pending issues to address before requesting a formal
review.

a) the shared_ptr example in the reference is wrong. It will be replaced with
a similar example based on std::auto_ptr.

b) functions will be added to basic_iarchive and basic_oarchive to reinitialize
the archive and request deletion of all objects created by the archive. This
is needed to better handle cleanup when an exception is encountered
during archive loading.

c) Documentation will be altered to better qualify the term "portable" as it
applies to archives. An objection was raised about the fact my usage
of the term "portable" s

d) I want to build on more compilers. I currently use MSVC 7.0 and have
built the library with gcc 2.95. I have demo copies of the Intel and Borland
compilers and want to build with gcc 3.1 as well. I am having trouble getting
all these compilers installed. I would much appreciate it if people using
these compilers would attempt to build the current version of the library,
i.e. Draft # 3.

e) the next source version will have tabs expanded to spaces.

I anticipate uploading my final version in a matter of days and requesting a
formal review.

Robert Ramey

