From: quendezus (quendez_at_[hidden])
Date: 2002-01-28 07:50:35


--- In boost_at_y..., Matthias Troyer <troyer_at_i...> wrote:
>
> I agree completely. We need to support a variety of data formats
> to interface with other applications. In our case we need
> text-based formats for testing and debugging, binary formats
> based e.g. on XDR for interplatform-compatibility, and other
> formats such as HDF-5 for interoperability with other programs.
> In addition we use it for message passing. Thus support of a
> variety of formats is essential. Versioning is also important,
> as formats do indeed evolve over the years.
>
> On the other hand a default format for binary files will also
> be an excellent idea.
>

Are we still talking about type info formats? Here is my
understanding of how serialization and persistence interoperate with
data formats:

1- (static) Serialization

Each object loads and saves itself. Thus, the choice of format is
up to the user and can be different for each class. The user can save
binary data, text data, XDR, raw "linear" data, chunks, serialized
map to indexed object fields, data with version info, etc.

All the framework can do is provide helper tools, for example to
solve the big-endian/little-endian problem:

   in_storage& operator <<( in_storage& s, int i );
   out_storage& operator >>( out_storage& s, int& i );

But it is also fine if the user chooses the following raw data
functions to save their integers:

   void write( in_storage& s, const void* data, int size );
   void read( out_storage& s, void* data, int size );

After all, there are many cases where you don't need your binary to
be cross-platform.
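
To make the difference between the two kinds of helpers concrete, here
is a minimal sketch. The bodies of in_storage and out_storage are
assumptions (simple in-memory byte buffers, invented for illustration),
and the operators take std::int32_t rather than int so that the width
on the wire is fixed at 4 bytes. The choice of big-endian as the wire
order is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical stand-ins for the storage types named above:
// in_storage is what data goes into, out_storage is what it comes out of.
struct in_storage {
    std::string bytes;
};

struct out_storage {
    std::string bytes;
    std::size_t pos = 0;
};

// Raw helpers: bytes are copied verbatim, so the layout on the wire
// is whatever the host machine uses (not cross-platform).
void write(in_storage& s, const void* data, int size) {
    assert(size >= 0);
    s.bytes.append(static_cast<const char*>(data),
                   static_cast<std::size_t>(size));
}

void read(out_storage& s, void* data, int size) {
    assert(size >= 0);
    const std::size_t n = static_cast<std::size_t>(size);
    assert(s.pos + n <= s.bytes.size());
    std::memcpy(data, s.bytes.data() + s.pos, n);
    s.pos += n;
}

// Portable helpers: the value is always written most significant byte
// first, whatever the byte order of the host.
in_storage& operator<<(in_storage& s, std::int32_t i) {
    const std::uint32_t u = static_cast<std::uint32_t>(i);
    for (int shift = 24; shift >= 0; shift -= 8)
        s.bytes.push_back(static_cast<char>((u >> shift) & 0xFFu));
    return s;
}

out_storage& operator>>(out_storage& s, std::int32_t& i) {
    assert(s.pos + 4 <= s.bytes.size());
    std::uint32_t u = 0;
    for (int k = 0; k < 4; ++k)
        u = (u << 8) | static_cast<unsigned char>(s.bytes[s.pos++]);
    // Assumes two's complement conversion (guaranteed since C++20).
    i = static_cast<std::int32_t>(u);
    return s;
}
```

A class that must be read back on another platform would use the
operators; a class that only ever lives on one machine can call
write/read directly and skip the byte shuffling.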

2- Simple use of (static) serialization: basic file format

When I want a simple load/save mechanism, I can choose a very simple
file format. For example, given that CComputer is a terminal concrete
class:
   Number of CComputer objects (int)
   Serialized CComputer object 1
   Serialized CComputer object 2
   ...

The file format does not say anything about what is in 'Serialized
CComputer object n', it is only about the overall structure of the
file.
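
As an illustration of that separation, the following sketch writes the
count-then-objects layout over standard streams. The members of
CComputer (name, ram_mb) and the functions save_all/load_all are
invented for the example, and the text representation is just one
choice the class could make for itself.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// CComputer is the class named in the text; its fields are hypothetical.
struct CComputer {
    std::string name;  // assumed to contain no spaces
    int ram_mb = 0;

    // The object alone decides how it is represented (plain text here).
    void save(std::ostream& os) const {
        assert(name.find(' ') == std::string::npos);
        os << name << ' ' << ram_mb << '\n';
    }
    void load(std::istream& is) { is >> name >> ram_mb; }
};

// The file format: the number of objects, then each serialized object.
// It knows nothing about what CComputer::save actually writes.
void save_all(std::ostream& os, const std::vector<CComputer>& computers) {
    os << computers.size() << '\n';
    for (const CComputer& c : computers) c.save(os);
}

std::vector<CComputer> load_all(std::istream& is) {
    std::size_t count = 0;
    is >> count;
    std::vector<CComputer> computers(count);
    for (CComputer& c : computers) c.load(is);
    return computers;
}
```

Changing CComputer::save to a binary or XDR encoding would not touch
save_all or load_all, as long as load still consumes exactly what save
produced.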

'File format' is not a well-chosen name, because data can go to
destinations other than files. What about flow format? Stream format?

3- Persistence (dynamic serialization)

Now we want to dynamically create the objects before deserialization.
Thus, we need a type info format in addition to the file format. If
the type info format is simply a string, an example of basic file
format can be:
   Number of objects (int)
   Type info of object 1 (string)
   Serialized object 1
   Type info of object 2 (string)
   Serialized object 2
   ...

There can be version info in the header, but this version info will
describe the file format or the type info format, not the
serialization algorithm used by each object (they can all be
different).
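
A rough sketch of this layout, assuming the type info is a plain class
name and that a table maps each name to a function creating an empty
object. Persistent, CPrinter, Factory, make_registry, save_all and
load_all are all names invented for the example, and the text encoding
of each object is arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <istream>
#include <map>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical common base: each object still loads and saves itself.
struct Persistent {
    virtual ~Persistent() = default;
    virtual std::string type_info() const = 0;
    virtual void save(std::ostream& os) const = 0;
    virtual void load(std::istream& is) = 0;
};

struct CComputer : Persistent {
    int ram_mb = 0;
    std::string type_info() const override { return "CComputer"; }
    void save(std::ostream& os) const override { os << ram_mb << '\n'; }
    void load(std::istream& is) override { is >> ram_mb; }
};

struct CPrinter : Persistent {
    std::string model;  // assumed to contain no spaces
    std::string type_info() const override { return "CPrinter"; }
    void save(std::ostream& os) const override { os << model << '\n'; }
    void load(std::istream& is) override { is >> model; }
};

using Factory = std::function<std::unique_ptr<Persistent>()>;
using Objects = std::vector<std::unique_ptr<Persistent>>;

// Maps a type info string to a function that creates an empty object.
std::map<std::string, Factory> make_registry() {
    std::map<std::string, Factory> r;
    r["CComputer"] = [] { return std::unique_ptr<Persistent>(new CComputer); };
    r["CPrinter"] = [] { return std::unique_ptr<Persistent>(new CPrinter); };
    return r;
}

// File format: count, then (type info, serialized object) pairs.
void save_all(std::ostream& os, const Objects& objects) {
    os << objects.size() << '\n';
    for (const auto& o : objects) {
        os << o->type_info() << '\n';
        o->save(os);
    }
}

Objects load_all(std::istream& is,
                 const std::map<std::string, Factory>& registry) {
    std::size_t count = 0;
    is >> count;
    Objects objects;
    for (std::size_t k = 0; k < count; ++k) {
        std::string type;
        is >> type;
        auto it = registry.find(type);
        assert(it != registry.end());  // unknown type info in the stream
        std::unique_ptr<Persistent> o = it->second();  // dynamic creation
        o->load(is);                                   // then deserialization
        objects.push_back(std::move(o));
    }
    return objects;
}
```

Only save_all, load_all and the registry depend on the file and type
info formats; the save and load members of each class stay free to use
whatever encoding they like.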

4- Final word

When we say that a persistence library should be able to deal with
many formats (given that it is a requirement), we are talking about
file and type info formats, not about how objects serialize
themselves. Do we all
agree with that or am I missing something?

Sylvain

