From: Matthias Troyer (troyer_at_[hidden])
Date: 2002-12-17 16:02:35


Dear Robert,

Thanks for your comments on my posting.

On Tuesday, December 17, 2002, at 06:58 PM, Robert Ramey wrote:

> From: Matthias Troyer <troyer_at_[hidden]>
>
>> 5) "Versioning": [snip]
> overhead for version number is 1 or 2 bytes per class definition.
> tracking the classes so far serialized is not expensive. What
> is expensive is tracking all the objects serialized so that pointers
> can be correctly handled.

I was more concerned with runtime overhead, but this needs to be timed.

>
>> I see a two-pronged approach as the best solution:
>> a) per-archive versioning, per-class versioning and no versioning
>> should all be supported for compatibility with other formats
>> (issue ii) above)
>
> per-archive versioning can easily be handled by appending to a
> default preamble.
>
>> b) if per-class versioning is used, it should be possible to turn it
>> off for some classes by a traits class - this will get rid of the
>> overhead (issue i) above) when versioning is turned off for a UDT.
>
> hmmm - I will have to think about this. Using MFC one has to take
> extra steps to include versioning. On my last commercial
> project using MFC serialization I didn't do this on some
> classes because I was assured that "that class will never change".
> Of course it did after the first version of the application shipped
> and ended up creating a lot of extra work. So I resolved that
> I would just "spend" the one byte per class definition and be done
> with it.
>
> similar logic applies to the archive preamble. My original motivation
> was the concern that existing archives never become obsolete
> by improvements in code - including the archiving systems. So
> I needed a version for the archive system itself - hence the preamble.

I agree, but I want to make these optional for backward compatibility with
older archive formats.
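
A minimal sketch of the traits-based opt-out from point b), not the submitted library's interface. The names `version_traits`, `ByteArchive` and `save_class_header` are hypothetical, and the toy archive records only the bytes that versioning would add:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical trait: by default every class writes a one-byte version.
template <class T>
struct version_traits {
    static constexpr bool versioned = true;
    static constexpr unsigned char version = 0;
};

struct Point { double x, y; };     // "will never change": opts out below
struct Account { long balance; };  // versioned, currently at version 2

template <>
struct version_traits<Point> {
    static constexpr bool versioned = false;
    static constexpr unsigned char version = 0;
};

template <>
struct version_traits<Account> {
    static constexpr bool versioned = true;
    static constexpr unsigned char version = 2;
};

// Toy archive: stores only the version bytes, nothing else.
struct ByteArchive {
    std::vector<unsigned char> bytes;

    // Intended to be called once per class definition, not once per object.
    template <class T>
    void save_class_header() {
        if (version_traits<T>::versioned) {
            // Local copy avoids odr-using the static member before C++17.
            const unsigned char v = version_traits<T>::version;
            bytes.push_back(v);
        }
    }
};

// Usage: only the versioned class contributes a byte.
inline std::size_t versioning_example() {
    ByteArchive ar;
    ar.save_class_header<Point>();
    ar.save_class_header<Account>();
    assert(ar.bytes.size() == 1);
    return ar.bytes.size();
}
```

Because the trait is a compile-time constant, a compiler can drop the branch entirely for classes that opt out, so the byte is simply never written for them.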

>
>> 6) "Advanced functionality": [snip]
> This analysis is in general correct. Bookkeeping for objects that may
> be serialized as pointers is inherently expensive. And the current
> system doesn't provide a clean way to skip this bookkeeping for
> objects that are known never to be serialized as pointers.
>
> Lately, I have been cleaning up the implementation along the lines
> suggested by G. Rozenthal. My intention was to make the library
> more "provably correct" and "logically transparent". I didn't foresee
> any change of functionality. However, as things get moved around
> to a more logical organization, certain things sort of mysteriously
> appear. In particular, the current library skips pointer bookkeeping
> for fundamental types. In the future the types for which the
> bookkeeping will be skipped will be alterable by the user, in a manner
> similar to the one you suggest. I believe that you will find that this
> addresses your concern in a natural and complete way.

Thanks
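
One way the user-alterable bookkeeping described above might look, as a sketch only. The trait `is_tracked` and the class `TrackingArchive` are hypothetical names, and the archive here only records addresses rather than writing any data:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <type_traits>
#include <utility>

// Hypothetical trait: track everything except fundamental types by default.
template <class T>
struct is_tracked
    : std::integral_constant<bool, !std::is_fundamental<T>::value> {};

struct Pixel { unsigned char r, g, b; };  // never serialized via a pointer
struct Node { int value; };               // may be shared through pointers

// User opt-out: skip the bookkeeping for Pixel.
template <>
struct is_tracked<Pixel> : std::false_type {};

class TrackingArchive {
public:
    // Returns true if the object's data must be written: either it is
    // untracked, or it is tracked and seen here for the first time.
    template <class T>
    bool save(const T& obj) {
        return save_impl(obj, is_tracked<T>());
    }

    std::size_t tracked_count() const { return ids_.size(); }

private:
    template <class T>
    bool save_impl(const T& obj, std::true_type) {
        const void* addr = static_cast<const void*>(&obj);
        return ids_.insert(std::make_pair(addr, ids_.size())).second;
    }

    template <class T>
    bool save_impl(const T&, std::false_type) {
        return true;  // no map lookup, no insertion
    }

    std::map<const void*, std::size_t> ids_;
};

inline void tracking_example() {
    TrackingArchive ar;
    Node n{1};
    Pixel p{0, 0, 0};
    assert(ar.save(n));    // first visit: data must be written
    assert(!ar.save(n));   // second visit: a reference would suffice
    assert(ar.save(p));    // untracked: always written, never recorded
    assert(ar.tracked_count() == 1);
}
```

The expensive part (the map insertion) is selected by tag dispatch, so untracked types never touch the address table.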

>
> A really, really fundamental issue in the submitted library is
> the usage of "Archive" as a virtual base class. This is the
> traditional way of separating interface from implementation.
>
> Advantages
> ========
> a) we're used to it
> b) it permits total separation of UDT serialization specification
> from archive implementation. UDT serialization specifications don't
> even have to be recompiled for different archives.
> c) logically decouples UDT serialization concept from archive
> implementation concept.
> d) permits any UDT serialization implementation to work with
> any archive implementation
> e) less compile time dependency - implies simpler code and
> faster compilations.
>
> Disadvantages
> ===========
> a) Does not permit archive implementation and UDT serialization
> to be coupled. This is the fundamental obstacle to serialization
> in XML format.
> b) virtual functions incur some extra overhead in calling

The overhead of b) should be important only when large collections of
data are serialized. As these are often array-like, I propose
additional virtual functions for contiguous arrays of basic data types.
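
A sketch of what such an array function could look like, assuming a hypothetical `OArchive` base class with a `save_array` member (neither name is taken from the submitted library). The default falls back to one virtual call per element; a binary archive overrides it with a single bulk copy:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

class OArchive {
public:
    virtual ~OArchive() {}
    virtual void save(double x) = 0;

    // Default: one virtual call per element, correct for any archive.
    virtual void save_array(const double* data, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) save(data[i]);
    }
};

class BinaryOArchive : public OArchive {
public:
    BinaryOArchive() : scalar_calls(0) {}

    void save(double x) override {
        ++scalar_calls;
        append(&x, 1);
    }

    // Override: one virtual call and one memcpy for the whole array.
    void save_array(const double* data, std::size_t n) override {
        append(data, n);
    }

    std::vector<unsigned char> buffer;
    std::size_t scalar_calls;  // counts per-element calls, for illustration

private:
    void append(const double* data, std::size_t n) {
        if (n == 0) return;
        const std::size_t old_size = buffer.size();
        buffer.resize(old_size + n * sizeof(double));
        std::memcpy(&buffer[old_size], data, n * sizeof(double));
    }
};

inline std::size_t array_example() {
    std::vector<double> v(1000, 1.5);
    BinaryOArchive bin;
    OArchive& ar = bin;                 // used through the virtual interface
    ar.save_array(v.data(), v.size());
    assert(bin.scalar_calls == 0);      // the per-element path was never taken
    return bin.buffer.size();
}
```

Archives that cannot do better (a text format, say) simply inherit the default loop, so the extra function costs them nothing.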
>
> A newer way would be to use template specialization rather than
> virtual base class to implement the interface / implementation
> paradigm
>
> Advantages
> ========
> a) Permits archive implementation and UDT serialization to be coupled
> thereby permitting archives to be "smarter" and facilitating
> implementation
> of something like XML.
> b) no virtual function call overhead
>
> Disadvantages
> ==========
> a) we're really not used to it yet
> b) requires coupling of archive and UDT specification. This can make
> the system harder to understand and use in simple cases. System
> requires recompilation of everything for every combination
> of UDT and archive used in a program.
> c) significantly larger executables
> d) much longer compile/build times

I think we have one more disadvantage:
e) it will not be able to deal with polymorphic objects, since there are
no virtual template functions.

Or is there a way?

> In the submitted library, I chose option 1 primarily because of a).
> Whether or not this is the best choice really depends on the other
> factors mentioned above, so I don't see an obvious answer here. In
> fact, for most situations either would work just as well.

I agree with that choice unless somebody shows me how a polymorphic
class can be serialized by calling a "save" function of the base class
in the second approach.
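
To make the concern concrete, here is a sketch of what the first approach allows. All class names (`Archive`, `TextArchive`, `Shape`, `Circle`, `Square`) are made up for illustration. Because `save` takes the virtual base `Archive&`, it can itself be virtual; the commented-out line shows the declaration the template approach would need, which C++ rejects:

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

class Archive {
public:
    virtual ~Archive() {}
    virtual void save(const std::string& s) = 0;
    virtual void save(double x) = 0;
};

class TextArchive : public Archive {
public:
    void save(const std::string& s) override { out_ << s << ' '; }
    void save(double x) override { out_ << x << ' '; }
    std::string str() const { return out_.str(); }
private:
    std::ostringstream out_;
};

class Shape {
public:
    virtual ~Shape() {}
    virtual void save(Archive& ar) const = 0;
    // template <class Ar> virtual void save(Ar& ar) const;
    //   ill-formed: a member function template cannot be virtual
};

class Circle : public Shape {
public:
    explicit Circle(double r) : r_(r) {}
    void save(Archive& ar) const override {
        ar.save(std::string("circle"));
        ar.save(r_);
    }
private:
    double r_;
};

class Square : public Shape {
public:
    explicit Square(double side) : side_(side) {}
    void save(Archive& ar) const override {
        ar.save(std::string("square"));
        ar.save(side_);
    }
private:
    double side_;
};

inline std::string polymorphic_example() {
    std::vector<std::unique_ptr<Shape> > shapes;
    shapes.push_back(std::unique_ptr<Shape>(new Circle(2.0)));
    shapes.push_back(std::unique_ptr<Shape>(new Square(3.0)));

    TextArchive ar;
    for (const auto& s : shapes) s->save(ar);  // dispatches on dynamic type
    assert(!ar.str().empty());
    return ar.str();  // "circle 2 square 3 "
}
```

This only illustrates why the virtual-base design handles base-class pointers naturally; it does not settle whether the template approach could reach the same result by other means.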

Best regards,

Matthias

