From: Robert Ramey (ramey_at_[hidden])
Date: 2002-07-02 10:47:10


-----Original Message-----
From: Vladimir Prus <ghost_at_[hidden]>
Sent: Mon, 1 Jul 2002 14:35:11 +0400
To: boost_at_[hidden]
Subject: Re: [boost] serialization - request for formal review

Robert Ramey wrote:
> I did reply to your message when it was posted. I don't know what
> happened. (Actually I have difficulty replying to messages when they come in
> a batch). Anyway, below you will find your original post along with my
> original response. Sorry for the mix up.

Oh... looking again at your messages I realize that you actually responded,
but without quoting my message. I naturally read only the part below my
signature. Sorry.

> > 1. Class registration and numbers.
> >
> > Archives store class numbers, and mapping from class names to numbers is
> > local to archive. Moreover, class names are never written to archive.
> > This causes problems with polymorphic classes.
> >
> > Suppose I write a vector of polymorphic pointers. Each concrete class is
> > assigned a number, but it is required to assure that the numbers will be
> > the same when reading. The only way to achieve this in the current system is
> >
> > out_ar << static_cast<Derived1* >(0) << ... <<
> > static_cast<Derived15* >(0) .....
> > Derived1* d1;
> > Derived15* d15;
> > in_ar >> d1 >> ... >> d15
> >
> > Here, saving/loading of a dummy pointer creates a consistent numbering.
> > This approach is not optimal:
> > - it is not clear who should store/load dummy pointers (if we don't
> > want to do it manually each time we work with an archive).
> > - it might not be possible to do this storing/loading in one place:
> > it is possible that no piece of code knows the full list of derived
> > classes.
> > - it is hard to guarantee that dummy pointers are written in the
> > same order, especially in different applications.
> > (Please see the attached program for illustration of what happens if
> > ordering is not consistent)
> >
> > So, I conclude that this approach is very problematic. I see several
> > levels at which serialization might work.
> > 1) For one application
> > 2) Between several applications on one platform
> > 3) Between several applications on different platforms.
> >
> > The current approach has problems with 1). I suggest that when a class is
> > written for the first time, its typeid(...).name() is also written. I believe
> > that MFC and (now forgotten) OWL (from Borland) used this approach. This
> > will make 1) and 2) work. As for 3), I'm not sure we can do anything
> > without portable type_info.
>
> I believe that you are wrong here. I originally thought that I had to save
> the typeid in the archive. I came to the conclusion that it was not
> necessary. Not specifying it overcame a significant obstacle to archive
> portability.
> The current system works for all known scenarios except circular pointer
> references. This includes all the cases cited above.
>
> The system is based on the fact that all components are saved and loaded
> in exactly the same sequence. This will always be true unless an error
> has been committed.
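
The dependence on sequence can be illustrated with a toy model. This is only a sketch under assumptions: ClassTable and round_trip are hypothetical names, not part of the library, and string keys stand in for the classes purely so the demo can report which class a number resolves to. Each side numbers a class the first time it sees it, so a number written by the saving side resolves correctly only if the loading side saw the classes in the same order.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an archive's class table: a class receives the
// next free number the first time it is seen. The string keys exist only for
// this demo; the archive under discussion stores numbers, not names.
class ClassTable {
public:
    // Returns the number for `cls`, assigning a new one on first sight.
    std::size_t number_of(const std::string& cls) {
        auto it = numbers_.find(cls);
        if (it != numbers_.end()) return it->second;
        const std::size_t n = order_.size();
        numbers_[cls] = n;
        order_.push_back(cls);
        return n;
    }
    // Returns the class that holds number `n` on this side.
    const std::string& class_of(std::size_t n) const {
        assert(n < order_.size());
        return order_[n];
    }
private:
    std::map<std::string, std::size_t> numbers_;
    std::vector<std::string> order_;
};

// The writer sees classes in `writer_order` and stores the number of `saved`.
// The reader sees classes in `reader_order` and resolves that stored number.
std::string round_trip(const std::vector<std::string>& writer_order,
                       const std::vector<std::string>& reader_order,
                       const std::string& saved) {
    ClassTable writer, reader;
    for (const auto& c : writer_order) writer.number_of(c);
    for (const auto& c : reader_order) reader.number_of(c);
    return reader.class_of(writer.number_of(saved));
}
```

With the order {Derived1, Derived15} on both sides, a saved Derived15 comes back as Derived15. If the reader sees the two classes in the opposite order, the same stored number resolves to Derived1.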

I disagree here. Let's consider polymorphic class B and

vector<B*> data;

In order to correctly save this data you'd need to register *all* classes
derived from B. As I said before, there are two problems:
1. you have to repeat the registration for each place where you create
archive object.

true - but typically archives would be created in very few places.

2. the code which creates archive objects might not know the list of *all*
derived classes.

at some level, code will know what is the most derived object created.
this can be at a level much higher than the class that uses the polymorphic
pointer. It's hard to discuss this without examining specific instances.

I understand that it is inconvenient to sometimes have to "register" classes
not otherwise seen by the archive. The decision to do it this way was a
pragmatic one based on the following considerations.

a) typeid(...).name() is not guaranteed to be portable. In fact each compiler
I looked at had a different way of rendering the name of a class. It wasn't
even always obvious how to do it. This would mean that the archive would
have to do one of two things

        i) know how typeid(...).name() is handled by all other compilers that
        might create archives and maintain code that translates those names to
        the form used by the compiler loading the archive.

        or

        ii) write a routine to render the class name in some canonical form. This
        would mean doing some sort of "registration" for ALL classes used, not
        just some of those used by polymorphic pointers.

        Of course if there is a third alternative, I would be interested in hearing
        about it. But given these two alternatives, the choice for me is easy, as
        I think it would be for most boost members.
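
To make alternative ii) concrete, here is a minimal sketch of what a canonical-name table might look like. NameRegistry, add and name_of are hypothetical names invented for this illustration, not part of the library. The C++ standard leaves the string returned by typeid(T).name() implementation-defined, so the canonical name has to be supplied by hand for every class that will appear in an archive.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <typeindex>
#include <typeinfo>

// Hypothetical registry mapping a type to a user-chosen canonical name.
// Every class to be archived must be added explicitly; a class that was
// never added has no name that could be written to the archive.
class NameRegistry {
public:
    template <class T>
    void add(const std::string& canonical) {
        assert(!canonical.empty());
        names_[std::type_index(typeid(T))] = canonical;
    }

    // Looks up the canonical name by the dynamic type of `obj`
    // (Base must be polymorphic). Returns "" if the class was never added.
    template <class Base>
    std::string name_of(const Base& obj) const {
        auto it = names_.find(std::type_index(typeid(obj)));
        return it == names_.end() ? std::string() : it->second;
    }

private:
    std::map<std::type_index, std::string> names_;
};

// Example hierarchy, used only for this demonstration.
struct Shape { virtual ~Shape() {} };
struct Circle : Shape {};
struct Square : Shape {};
```

After add<Circle>("Circle"), a Circle referenced through a Shape& resolves to "Circle", while a Square that was never added resolves to the empty string. That is the cost described above: a registration step for every class, not only for the ones reached through polymorphic pointers.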

b) the need to "register" a class is a rare occurrence in practical problems.
The map demo doesn't require it because classes are "seen" by the archive before
the polymorphic collection is used. This is a common (though admittedly not
a universal) situation.

c) the fact that this situation is detected when the archive is created - even in
a release build - guarantees that no unreadable archives will be released.

By the way, the first version used typeid in the archive. The problems that
this engendered were pointed out by the usual suspects - astute boost members.
It was in addressing these objections that I came to the current system. I believe
it is the best solution and, in spite of this minor inconvenience, quite a good one.

So, how do you propose to solve these problems, and why can't the solution be
incorporated in the library?

> > Also, as I understand, if a class never seen by the serialization code
> > is written, then type_info_extended::find will return 0 and assert will
> > fail. Prior versions had a way to register a class. I'd like to have
> > that way again in the library.
>
> writing a NULL pointer of the derived class to the archive is equivalent to
> "registering" the class in the previous system.

Yea, except that this method is not explicit and has the disadvantages I've
explained above.

would you be satisfied if something like the following were added back in?

// "register" is a reserved word in C++, so the members are named register_type
template<class T>
void serialization::basic_oarchive::register_type()
{
        // writing a NULL pointer of type T makes the archive "see" class T
        *this << static_cast<T *>(NULL);
}
template<class T>
void serialization::basic_iarchive::register_type()
{
        T *t;
        *this >> t;     // reads back the NULL pointer written when saving
        assert(NULL == t);
}

This would permit one to use

        save(...)
                ar.register_type<T>();

rather than just

        save(...)
                ar << static_cast<T *>(NULL);

Personally I don't think it adds anything. It might be more explicit though. In my
view this is a very minor question of aesthetics.
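
For illustration only, here is a toy output archive with an explicit registration member. It is called register_type in the sketch because register is a C++ keyword. ToyOArchive, save_pointer and written are hypothetical names, and throwing an exception for an unregistered class is an assumption made for the demo, not a description of what the library does.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <typeindex>
#include <typeinfo>
#include <utility>
#include <vector>

// Toy output archive: classes are numbered in the order they are registered,
// and only the class number is "written" when a pointer is saved.
class ToyOArchive {
public:
    // Explicit registration: assigns the next free number to T.
    // Registering the same class twice has no further effect.
    template <class T>
    void register_type() {
        const std::type_index key(typeid(T));
        if (ids_.find(key) == ids_.end())
            ids_.insert(std::make_pair(key, ids_.size()));
    }

    // Saves a pointer to a polymorphic Base by writing the number of its
    // dynamic class. Assumption for this sketch: an unregistered class is
    // reported with an exception, so the check also runs in release builds.
    template <class Base>
    void save_pointer(const Base* p) {
        assert(p != nullptr);
        auto it = ids_.find(std::type_index(typeid(*p)));
        if (it == ids_.end())
            throw std::runtime_error("class was never registered");
        written_.push_back(it->second);
    }

    // The class numbers written so far, in order.
    const std::vector<std::size_t>& written() const { return written_; }

private:
    std::map<std::type_index, std::size_t> ids_;
    std::vector<std::size_t> written_;
};

// Example hierarchy, used only for this demonstration.
struct B { virtual ~B() {} };
struct D1 : B {};
struct D2 : B {};
```

After ar.register_type<D1>(), saving a B* that points at a D1 writes class number 0, while saving one that points at a D2 throws. The problem shows up when the archive is written rather than when someone later tries to read it.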
        

> > 2. archive_exception should really be derived from std::exception
>
> I considered this and saw no benefit to doing so. How would this be an
> improvement?

I have exactly the same opinion as Kostya Altukhov: no need to have separate
handler for archive_exception.

See my response to his post.

I hope this helps explain my position.

Robert Ramey

