From: Matthew Vogt (mvogt_at_[hidden])
Date: 2004-04-18 19:28:26


Robert Ramey <ramey <at> rrsd.com> writes:

> > I'm interested in
> > being able to save to proprietary formats (which often means that the
> > applications involved never got around to specifying a standard
> > format...). There must be thousands of ad-hoc binary formats in use
> > today.
>
> Hmmm - I'm a little wary of this. Though I'm not sure I know what you mean.

Yes, I had better explain here. In the case that I'm addressing, the format
is not a means to an end, but an end in itself. The goal of this effort is
to use the serialization library framework not to produce reversible
transformations between arbitrary C++ and a bytestream, but to take
targeted C++ objects and produce known binary representations.

> Some persons interested in the library have hoped it can be used to generate
> some specific format. But that format doesn't accommodate all the
> information required to rebuild an arbitrary C++ data structure, so in order
> to do so one ends up coupling the serialization of classes to the archive
> format - just exactly what the serialization library is designed to avoid.

I don't think it is a case of coupling, merely one of limitation. An object
that can be serialized into XDR format must use only a limited range of
C++ types in its composition - but having accepted that limitation, it can
still be serialized into a less limiting archive type with the same code.
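A minimal sketch of that point, assuming two toy archive classes (toy_xdr_oarchive and toy_text_oarchive are hypothetical names, not part of the serialization library or of the classes under discussion). The class restricts itself to 32-bit unsigned members, and the same serialize() member then works with both the limited and the less limiting archive:

```cpp
#include <cstdint>
#include <sstream>
#include <vector>

// Hypothetical limited archive: accepts only 32-bit unsigned values and
// writes each one as four big-endian bytes, the way XDR lays out an integer.
class toy_xdr_oarchive {
public:
    std::vector<unsigned char> bytes;

    toy_xdr_oarchive& operator&(std::uint32_t v) {
        for (int shift = 24; shift >= 0; shift -= 8)
            bytes.push_back(static_cast<unsigned char>((v >> shift) & 0xFFu));
        return *this;
    }
};

// Hypothetical less limiting archive: writes anything streamable as text.
class toy_text_oarchive {
public:
    std::ostringstream os;

    template <class T>
    toy_text_oarchive& operator&(const T& v) {
        os << v << ' ';
        return *this;
    }
};

// A class that accepts the limitation: only 32-bit unsigned members.
struct packet_header {
    std::uint32_t sequence;
    std::uint32_t length;

    // One serialize() member, shared by every archive type.
    template <class Archive>
    void serialize(Archive& ar, const unsigned int /*version*/) {
        ar & sequence & length;
    }
};
```

Calling serialize() on a packet_header with a toy_xdr_oarchive yields eight bytes; calling it with a toy_text_oarchive yields the two numbers as text, with no change to packet_header.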

> Actually, my current thinking is to add a section to the documentation (I
> love my documentation! Many people have contributed to it by careful
> reading and criticism) which suggests a transition path from a proprietary
> format to usage of the serialization library. This transition would be
> basically
>
> a) make a program which loads all your "old" data in the "old" way.
> b) serialize the structures.
>
> So I'm skeptical of trying to adjust to "old" proprietary formats with the
> serialization library.

This is a transition from an old format to the brave new world, but in
the context of persistence. It doesn't apply in the case of serialization
for marshalling.

> >> 1) I wonder why you derived ordered_oarchive from
> >> basic_binary_oprimitive<Archive, OStream>
>
> > I'm using the save_binary / load_binary functions from
> > basic_binary_oprimitive and basic_binary_iprimitive.
>
> Maybe we'll consider factoring this out into a standalone function. We'll
> keep an eye on this for now. In fact, if the salient feature of XDR is
> endian awareness, alignment, etc., I'm wondering if some of the functionality
> of your classes shouldn't be moved into your own version of load_binary,
> thereby making inheriting the native one unnecessary.

I don't really see any need for this at this point. Once the data is in the
correct binary format, one save_binary function is as good as another. I
don't think that inheriting extraneous 'save' members from the
basic_binary_oprimitive class is a major concern. Also, as you point out
later, I want to inherit any work regarding issues with streams, locales
and whatever else is going on way down low.

> > 2) there's some stuff in boost that addresses alignment in a guaranteed(?)
> > portable manner that may be relevant here. See #include
> > <boost/aligned_storage.hpp>. BTW - the best way to make your code portable
> > without cluttering up with #ifdef etc... is to use more boost stuff - let
> > other people clutter up their code with #ifdef all over the place.
>
> > Yes, but the aligned_storage template helps with platform-specific
> > alignment within the machine. I don't see how it helps with
> > platform-independent alignment within the content of the archive...
>
> OK - I would like to see that made a little more transparent and better
> explained with comments. It's an important part of the issues being
> addressed.

Ok, no problem.
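To illustrate what alignment within the content of the archive means: XDR encodes every item in a multiple of four bytes, in big-endian order, zero-padding where needed, regardless of the host platform. A minimal sketch under that assumption (put_u32 and put_opaque are hypothetical helper names, not taken from the classes under discussion):

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using byte_buffer = std::vector<unsigned char>;

// Write a 32-bit value as four big-endian bytes, independent of the
// byte order of the machine running the code.
void put_u32(byte_buffer& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<unsigned char>((v >> shift) & 0xFFu));
}

// Write variable-length opaque data XDR-style: a 32-bit length, the raw
// bytes, then zero bytes so that the next item starts on a 4-byte boundary.
void put_opaque(byte_buffer& out, const std::string& data) {
    put_u32(out, static_cast<std::uint32_t>(data.size()));
    out.insert(out.end(), data.begin(), data.end());
    const std::size_t padding = (4 - data.size() % 4) % 4;
    out.insert(out.end(), padding, static_cast<unsigned char>(0));
}
```

Here the padding is a property of the archive layout itself, which is why a facility for in-memory alignment such as aligned_storage does not address it.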

> > 3) I'm curious about the override for the saving of vector. ...
>
> > I need to override the serialization of vector, because the vector must be
> > serialized with a known policy to yield a required layout in the archive.
> > The fact that this serialization happens (at this point in time) to be
> > exactly the same as the default implementation is not relevant - that can
> > be changed at any time, but the CDR, XDR and other binary formats must
> > not change.
>
> I'm still not convinced - it seems to me that it shouldn't need to be
> overridden for XDR and CDR which is what these classes do. What about list,
> deque, set, etc.?

Well, I never thought these were necessary, remembering that I am providing
for classes that are designed to be serialized into a particular binary
format. In a binary format, all collections will boil down to a group of
repetitions, which are either preceded by a length/count argument, or whose
length is a defined property of the format itself.

For me, vector has always sufficed, in either fixed_length<>, or
variable_length<> guise. Perhaps others have differing experience.
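The two layouts can be sketched as follows. This is only an illustration: save_fixed_length and save_variable_length are hypothetical free functions standing in for the fixed_length<> and variable_length<> wrappers, whose real interface is not shown in this thread, and the elements are assumed to be 32-bit unsigned values written big-endian.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using byte_buffer = std::vector<unsigned char>;

// Append a 32-bit value as four big-endian bytes.
void append_be32(byte_buffer& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<unsigned char>((v >> shift) & 0xFFu));
}

// Fixed-length layout: the count N is a property of the format, so only
// the elements themselves are written.
template <std::size_t N>
byte_buffer save_fixed_length(const std::vector<std::uint32_t>& items) {
    assert(items.size() == N);  // the format dictates the length
    byte_buffer out;
    for (std::uint32_t item : items) append_be32(out, item);
    return out;
}

// Variable-length layout: a 32-bit count precedes the elements.
byte_buffer save_variable_length(const std::vector<std::uint32_t>& items) {
    byte_buffer out;
    append_be32(out, static_cast<std::uint32_t>(items.size()));
    for (std::uint32_t item : items) append_be32(out, item);
    return out;
}
```

For a two-element vector, the fixed-length form occupies 8 bytes and the variable-length form 12, the extra four being the count.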

> > Yes, I see what you're saying. I'll have a think about this - but the
> > term 'marshalling' is not particularly prevalent in the code. Even if
> > the term does have broad application, I think I am using it in the
> > traditional sense.
>
> Your library does marshalling (as I understand the term is usually used).
> My complaint is that it's too modest. Your library does more than that.
> When I started this library there was strong usage of the term "persistence"
> which led to the misconception that the library had nothing to do with
> "marshalling". I see serialization as useful in a number of things -
> persistence, marshalling and who knows what else? (e.g. generating a crc on
> the whole data state of the program to detect changes). That's why I went
> to much effort to avoid this characterization of the library. Your addition
> will gain strength from leveraging on this and by fitting in with the
> established pattern will be found easier to use. This will make it more
> successful. Also by following such a pattern it will almost entirely
> eliminate the need for special documentation.

But there are inherent limitations in marshalling (per my usage). Pointers,
which are adeptly handled by the general library, are not valid elements
of a marshalled data set. Not all types can be represented in all formats.
These are aspects that need documentation, because they render the archives
fit only for marshalling, not for the more general 'serialization'. As is,
the library certainly supports the most general concept, but my ambitions
are more mundane.
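One way such a limitation could be made visible at compile time is sketched below. The trait is_marshallable and the class toy_marshalling_oarchive are hypothetical, and restricting the archive to integral types of at most 32 bits is a simplification chosen only for illustration:

```cpp
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical trait naming the types this toy format can represent.
// Pointers, floating-point types and class types are all rejected.
template <class T>
struct is_marshallable
    : std::integral_constant<bool,
                             (std::is_integral<T>::value && sizeof(T) <= 4)> {};

class toy_marshalling_oarchive {
public:
    std::vector<unsigned char> bytes;

    template <class T>
    toy_marshalling_oarchive& operator&(const T& value) {
        // A pointer (or any other unsupported type) fails here, at compile
        // time, instead of producing bytes the receiver cannot interpret.
        static_assert(is_marshallable<T>::value,
                      "this archive cannot represent the given type");
        const std::uint32_t v = static_cast<std::uint32_t>(value);
        for (int shift = 24; shift >= 0; shift -= 8)
            bytes.push_back(static_cast<unsigned char>((v >> shift) & 0xFFu));
        return *this;
    }
};
```

With this in place, writing a std::uint32_t to the archive compiles, while writing an int* is rejected by the static_assert.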

Of course, higher level constructs can be implemented on marshalling
base archives. IIRC, 'IIOP' is the layer above CDR in CORBA, which provides
for remote object references, etc.

(I realise the need for documentation; this conversation would have been
simplified had it existed earlier.)
 
Matt

