From: Andreas Pokorny (andreas.pokorny_at_[hidden])
Date: 2005-11-07 18:09:03

On Mon, Nov 07, 2005 at 09:46:24AM -0800, Robert Ramey <ramey_at_[hidden]> wrote:
> > but I would like to see at least these different APIs:
> > * on-demand parsing, a parser driven by a cursor, that allows one to
> > navigate through a document without loading it completely (I don't see
> > a need for prior validation here)
> The serialization library does exactly this with the spirit xml parser.
> ...

I always assumed that spirit has full control over the parsing process,
so the parse() function itself is the driving force that walks towards the
end of input. The above really was about jumping through the file
(provided that the file is well-formed XML) and only examining the
chunks around the interesting data fields. Maybe I am a bit too
optimistic about parsing XML files :).
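
To make the cursor idea concrete, here is a minimal sketch of what I mean.
It is not spirit and not an existing library: the class xml_cursor and its
members seek() and attribute() are names made up for this mail, it needs
C++17, and it assumes well-formed input without comments, CDATA sections,
or '>' inside attribute values.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical cursor: scans forward for a start tag by name and exposes
// its attributes without building a tree. The caller drives the walk.
class xml_cursor {
public:
    explicit xml_cursor(std::string_view doc) : doc_(doc) {}

    // Advance to the next start tag called `name`; false if none remains.
    bool seek(std::string_view name) {
        assert(!name.empty());
        std::size_t from = pos_;
        for (;;) {
            const std::size_t lt = doc_.find('<', from);
            if (lt == std::string_view::npos) return false;
            const std::size_t gt = doc_.find('>', lt);
            if (gt == std::string_view::npos) return false;
            const std::string_view tag = doc_.substr(lt + 1, gt - lt - 1);
            // The tag name ends at whitespace or '/'. An end tag starts
            // with '/', so its name is empty here and never matches.
            const std::string_view tag_name =
                tag.substr(0, tag.find_first_of(" \t\r\n/"));
            if (tag_name == name) {
                tag_ = tag;
                pos_ = gt + 1;
                return true;
            }
            from = gt + 1;
        }
    }

    // Value of attribute `key` on the current tag, if it is present.
    std::optional<std::string> attribute(std::string_view key) const {
        const std::string needle = " " + std::string(key) + "=\"";
        const std::size_t at = tag_.find(std::string_view(needle));
        if (at == std::string_view::npos) return std::nullopt;
        const std::size_t begin = at + needle.size();
        const std::size_t end = tag_.find('"', begin);
        if (end == std::string_view::npos) return std::nullopt;
        return std::string(tag_.substr(begin, end - begin));
    }

private:
    std::string_view doc_;  // the document; never copied or fully parsed
    std::string_view tag_;  // contents of the tag the cursor sits on
    std::size_t pos_ = 0;   // where the next seek() starts scanning
};
```

The caller repeatedly calls seek("data") and reads attribute("value") until
seek() returns false; everything between the interesting tags is skipped
without being interpreted.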

> > * direct mapper of c++ structures to a certain format, so a kind of
> > xml serialization,
> I'm not sure how this differs from the xml serialization already in the
> serialization library.

The boost::serialization archives try to encode C++ objects in data streams,
which might be XML documents as well. So the archive defines the format
and adds sufficient meta information to be able to recreate the objects
with the aid of the metadata found in the serialize functions.

I was talking about a use case in which a certain known XML format has to be
mapped onto C++ structures. Thus the C++ structures were already designed
to represent the model that is described by that XML format.
Still, the binding between the format and the structures ought to be separate.

The purpose of that archive is to provide persistence for the objects,
while the purpose of the binding library is to provide a high level
document parsing system.

I really considered boost::serialization for that binding library, but I
found the NVP system not flexible enough to represent all XML
possibilities. As far as I understood, every complex type gets converted
into an XML node, while every primitive leaf type gets converted into an
attribute. That restriction is too strict to represent the full XML format
space. E.g. the model
struct some_node { std::vector<int> data; };
could be encoded like this:
<some_node>
  <data value="1"/>
  <data value="41"/>
  <data value="2"/>
  <data value="9"/>
</some_node>

Both variants are representable in the binding library I talked about.
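
As a rough illustration of keeping the binding separate from the model,
here is a small self-contained sketch for the element form of some_node.
The functions to_xml() and from_xml() are hypothetical names for this
mail, not the API of my binding library or of boost::serialization, and
the reader only handles exactly the layout that the writer produces.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// The model from the text. It knows nothing about XML.
struct some_node { std::vector<int> data; };

// Hypothetical binding, kept outside the struct: each vector element
// becomes one <data value="..."/> child of <some_node>.
std::string to_xml(const some_node& n) {
    std::ostringstream out;
    out << "<some_node>\n";
    for (int v : n.data) out << "  <data value=\"" << v << "\"/>\n";
    out << "</some_node>\n";
    return out.str();
}

// Inverse mapping: collect the value attribute of every <data> child.
// Assumes well-formed input in the layout written by to_xml().
some_node from_xml(const std::string& xml) {
    some_node n;
    const std::string open = "<data value=\"";
    std::size_t pos = 0;
    while ((pos = xml.find(open, pos)) != std::string::npos) {
        pos += open.size();
        const std::size_t end = xml.find('"', pos);
        assert(end != std::string::npos);  // closing quote must exist
        n.data.push_back(std::stoi(xml.substr(pos, end - pos)));
        pos = end;
    }
    return n;
}
```

Because the mapping lives in free functions, a different encoding of the
same struct would only need a second pair of functions; some_node itself
stays untouched.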

To extend the binding idea, one might add the possibility to map only a certain
view of the XML format onto these C++ structures.

> I'm not exactly sure what you have in mind, but I can see the need for a program
> which reads an xml schema and generates C++ data structures which
> can be navigated with previously compiled code modules. I think this
> would be fairly easily achieved using spirit xml parsing.
> As you can see, I think spirit is underrated. It IS hard to learn - and I'm
> in no way an expert but I have managed to find it very useful.

Agreed. My binding library used libxml2 and later expat for parsing,
but I initially planned to use spirit as the backend.


Andreas Pokorny
