From: Robert Ramey (ramey_at_[hidden])
Date: 2002-11-21 10:45:55


From: Beman Dawes <bdawes_at_[hidden]>
>
>At 10:01 PM 11/18/2002, Robert Ramey wrote:

>>>Is there a reason you sent this to me privately?
>>> From: David Abrahams <dave_at_[hidden]>
>
>>>I believe your assessment that some
>>>data structures can't be represented using XML is incorrect, and
>>>that's easy to prove. A serialization library which makes generation
>>>of XML output difficult is severely handicapped in the modern world.
>>
>>Well, I have conceded that it was preliminary. All I know about XML
>>is from a small book containing a concise description of XML.
>>
>>My skepticism is based on the following thought experiment:
>>Suppose one is given a list of polymorphic pointers, some of which
>>correspond to the bottom node of a diamond inheritance structure
>>and some of which are repeated in the list and serialized
>>somewhere else as well.
>>
>>a) How would such a thing be represented in XML?
>>b) Could it be loaded back to create an equivalent structure?
>>c) Would it be useful for anything other than this serialization system?
>>
>>If someone can assure me that the answers to all three of the above
>>are yes then it should be possible - otherwise not. Given that it's
>>"easy to prove" these questions should be easy to answer in
>>a convincing way.

>Robert,

>I think you may be missing several points with your thought experiment:

>* The serialization library doesn't have to figure out how all C++ data
>structures (such as in your thought experiment) would be represented in XML

My question is whether XML can capture an arbitrary C++ structure in a
meaningful and useful way. So far no one has presented any XML that
captures that one proposed example.
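
To make the thought experiment concrete, here is a minimal C++ sketch of the
kind of structure it describes. All class and function names are hypothetical
and chosen only for illustration; nothing here is taken from the library.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <set>
#include <vector>

// Hypothetical diamond hierarchy: Bottom reaches Base through two paths.
struct Base { int id = 0; virtual ~Base() = default; };
struct Left : virtual Base { int left_data = 1; };
struct Right : virtual Base { int right_data = 2; };
struct Bottom : Left, Right { int bottom_data = 3; };

// A list of polymorphic pointers, plus a second place that refers to one
// of the same objects (the "serialized somewhere else" part).
struct Document {
    std::vector<std::shared_ptr<Base>> items;
    std::shared_ptr<Base> elsewhere;
};

inline Document make_example() {
    auto shared = std::make_shared<Bottom>();  // bottom node of the diamond
    auto plain = std::make_shared<Base>();
    Document d;
    d.items.push_back(shared);
    d.items.push_back(plain);
    d.items.push_back(shared);  // repeated in the list
    d.elsewhere = shared;       // and referenced outside the list
    return d;
}

// Counts distinct objects reachable from the document. A faithful reload
// would have to reproduce this count, not create one object per pointer.
inline std::size_t distinct_objects(const Document& d) {
    std::set<const Base*> seen;
    for (const auto& p : d.items) seen.insert(p.get());
    if (d.elsewhere) seen.insert(d.elsewhere.get());
    return seen.size();
}
```

An equivalent structure after loading would need the same dynamic types, the
same sharing (three list entries and one outside reference, but only two
objects), and a single Base subobject inside each Bottom.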

>or any other format. Instead, all that serialization has to supply is a
>base class with the default hooks for prolog, epilog, separator, data, and
>similar functions.

Well, I don't know that. In general it is extremely difficult to know ahead of
time what facilities a serialization library would need in order to permit an XML
archive to be generated. One would have to take the library, make the
changes needed to produce the desired result, and then see
what those changes turned out to be.

>It is up to the user to customize for a particular
>format, beyond a few basic ones supplied by the implementation.

>* Some approaches, including XML, allow a practically unlimited number of
>different ways to represent the same data. The user rather than the
>serialization library should choose the particular design.

>* Some formats may not be able to support all C++ data structures, and that
>is okay. For example, the comma separated value (CSV) format used by many
>desktop tool programs won't extend much beyond arrays of simple structures.
>That doesn't mean the format is useless or that it shouldn't be
>supported. It just means it isn't suitable for all tasks.

It does mean that it's not suitable for capturing the structure of an arbitrary C++
data structure in a way that it can be restored. It may be useful, but it's not serialization.

One of the main features of this system that has permitted it to cover and even
surpass the original set of requirements is the decoupling of different aspects.

In the current system the following concepts are orthogonal:

a) description of which data should be saved for each class (save/load/version)
b) composition of the above to handle arbitrary C++ data structures (serialization)
c) description of how fundamental types should be encoded as a byte stream
into a storage medium (archive)
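
The separation of a), b) and c) above can be sketched roughly as follows. This
is an illustration under assumed names (text_oarchive_sketch, save_object,
Point, Segment and so on are all hypothetical), not the interface of the
library under discussion. Only int is handled, to keep it short.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

// (c) Archive: knows only how to encode fundamental types as bytes.
class text_oarchive_sketch {
public:
    explicit text_oarchive_sketch(std::ostream& os) : os_(os) {}
    void put(int v) { os_ << v << ' '; }
private:
    std::ostream& os_;
};

class text_iarchive_sketch {
public:
    explicit text_iarchive_sketch(std::istream& is) : is_(is) {}
    void get(int& v) { is_ >> v; }
private:
    std::istream& is_;
};

// (b) Composition: records the class version, then delegates to the class.
// It knows nothing about the archive format or about any particular class.
template <class Archive, class T>
void save_object(Archive& ar, const T& t) {
    ar.put(T::version);
    t.save(ar);
}

template <class Archive, class T>
void load_object(Archive& ar, T& t) {
    int file_version = 0;
    ar.get(file_version);
    t.load(ar, file_version);
}

// (a) Per-class description: which members are saved, and the version.
struct Point {
    int x = 0;
    int y = 0;
    static constexpr int version = 1;
    template <class Archive> void save(Archive& ar) const { ar.put(x); ar.put(y); }
    template <class Archive> void load(Archive& ar, int) { ar.get(x); ar.get(y); }
};

// A class built from other serializable classes reuses (b) for its members.
struct Segment {
    Point a;
    Point b;
    static constexpr int version = 1;
    template <class Archive> void save(Archive& ar) const {
        save_object(ar, a);
        save_object(ar, b);
    }
    template <class Archive> void load(Archive& ar, int) {
        load_object(ar, a);
        load_object(ar, b);
    }
};

// Saves a Segment to an in-memory text stream and loads it back.
inline Segment round_trip(const Segment& in) {
    std::stringstream ss;
    text_oarchive_sketch oa(ss);
    save_object(oa, in);
    text_iarchive_sketch ia(ss);
    Segment out;
    load_object(ia, out);
    return out;
}
```

Replacing the two archive classes with ones that use a different encoding
would not require touching Point, Segment, or the composition functions,
which is the sense in which the three concepts are independent.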

Assuming that the questions in my "thought experiment" could be answered
in the affirmative, what would have to be added to this system to permit it to handle
XML?

Another concept has to be added - that of reflection. A useful XML representation
needs the name of the variable. So some system needs to be designed to
hold that information and keep it related to each serializable member. Presumably
this would be an orthogonal concept d)
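
One way such a concept d) might look is a small wrapper that pairs each
member with its name, so that an archive is free to use the name or ignore it.
The sketch below is only an assumption about how this could be done; the
names named_value, make_named, plain_oarchive_sketch and xml_oarchive_sketch
are invented for the example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical wrapper tying a member's name to a reference to its value.
template <class T>
struct named_value {
    const char* name;
    T& value;
};

template <class T>
named_value<T> make_named(const char* name, T& value) {
    return {name, value};
}

// An archive that ignores the name and writes only the value.
class plain_oarchive_sketch {
public:
    template <class T>
    plain_oarchive_sketch& operator<<(const named_value<T>& nv) {
        os_ << nv.value << ' ';
        return *this;
    }
    std::string str() const { return os_.str(); }
private:
    std::ostringstream os_;
};

// An archive that uses the name to wrap the value in begin/end tags.
class xml_oarchive_sketch {
public:
    template <class T>
    xml_oarchive_sketch& operator<<(const named_value<T>& nv) {
        os_ << '<' << nv.name << '>' << nv.value << "</" << nv.name << '>';
        return *this;
    }
    std::string str() const { return os_.str(); }
private:
    std::ostringstream os_;
};

// The class states its member names once; it is unaware of the format.
struct Pixel {
    int x = 3;
    int y = 4;
    template <class Archive>
    void save(Archive& ar) const {
        ar << make_named("x", x) << make_named("y", y);
    }
};

// Saves any object with a save() member into a fresh archive of the given type.
template <class Archive, class T>
std::string to_text(const T& t) {
    Archive ar;
    t.save(ar);
    return ar.str();
}
```

With these definitions, to_text<plain_oarchive_sketch>(Pixel{}) yields
"3 4 " and to_text<xml_oarchive_sketch>(Pixel{}) yields "<x>3</x><y>4</y>".
The names cover flat members only; they say nothing yet about pointers,
inheritance, or versions.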

Given this, without too much effort and maybe adding some virtual functions to
archive one could add begin/end tags to archive. Of course many would object
to this on efficiency grounds but it would be possible. But problems start to appear.
What about versioning? Where does that fit into XML? And what about
pointers, inheritance, etc.? To properly capture these in XML one would have
to start altering b). It's the automatic composition that guarantees that this
system can serialize/deserialize any C++ structure. I doubt this would be
worth it.
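
The "virtual functions for begin/end tags" idea can be sketched as a base
archive with no-op style hooks that a derived archive overrides. This is a
guess at the shape of such a change, with hypothetical names throughout
(oarchive_sketch, tagged_oarchive_sketch, begin_tag, end_tag); it is not a
description of any existing archive class.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Base archive: writes values, calling overridable hooks around each one.
class oarchive_sketch {
public:
    virtual ~oarchive_sketch() = default;

    // Every saved value now costs two virtual calls, which is the
    // efficiency objection mentioned in the text.
    void save(const char* name, int value) {
        begin_tag(name);
        os_ << value;
        end_tag(name);
    }
    std::string str() const { return os_.str(); }

protected:
    // Defaults reproduce a plain space-separated text format.
    virtual void begin_tag(const char*) {}
    virtual void end_tag(const char*) { os_ << ' '; }
    std::ostringstream os_;
};

// Derived archive: overrides the hooks to emit XML-style tags.
class tagged_oarchive_sketch : public oarchive_sketch {
protected:
    void begin_tag(const char* name) override { os_ << '<' << name << '>'; }
    void end_tag(const char* name) override { os_ << "</" << name << '>'; }
};

// Saves two named integers through whichever archive is passed in.
inline std::string save_point(oarchive_sketch& ar, int x, int y) {
    ar.save("x", x);
    ar.save("y", y);
    return ar.str();
}
```

The hooks handle tagging of individual values, but they leave open exactly
the questions raised above: where a class version goes, and how shared
pointers and base classes would be expressed without changing the
composition layer.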

At the heart of the matter is that serialization, XML, CSV, Databases, etc. are
designed for different and conflicting purposes. After looking at XML and understanding
as well as anyone what it takes to make a serialization system, I made
a preliminary determination that trying to include XML would result in a system
that compromised the other goals. The discussion on the list
in reference to this topic confirms my preliminary conclusion.

Of course, anyone is free to take the current serialization system and experiment
to see what it would really take to accommodate XML. (After all, it should be easy
if I'm wrong.) But it won't be me.

Robert Ramey


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk