From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2007-07-12 11:15:28


Peter Dimov wrote:
>Phil Endecott wrote:
>> overhead of about 150 bytes per node.

> After such an experience my thoughts would be less oriented towards changing
> the XML in-memory class and more towards refactoring the application to not
> build the entire XML document in memory.

Yes - but...
> this would likely mean no XSLT.
..which was exactly what I needed to do.

That brings up another question: which of these approaches do people prefer?

- Use a C++ XML library that runs on a libxml2 backend, so that it can
also use libxslt to do XSLT transformations.

- Use a standalone C++ XML library that is incompatible with libxslt,
and instead do XSLT-like transformations in C++.

This question brings us back a bit closer to the features of Stefan's
proposal (which I think could be extended to do XSLT using libxslt). I
think that if I were starting another project of the sort that I
described before I would probably avoid XSLT - with hindsight I
overstretched it. But what would ideal C++ XML transforming (or just
XML reading) code look like? As gchen writes: "creating xml does't
seem a big problem, but writing xml-reader [..] is a time-consuming
task". Boost.Spirit can match things; can we use something vaguely
like Spirit syntax to match XML fragments, and define actions to apply
to them?

rule input = * person;

rule person = element("person")(name, birth, father, mother)
                  // meaning all needed but in any order
               [  // this is a Spirit-style "action" [].
                 return h::html[ h::body [
                     // these are declarative-XML [].
                     // h:: is an html element namespace
                   h::h1("Timeline for "+_1),
                     // _1 refers to 'name' above; did Spirit 2 add this?
                   _2, _3, _4
                 ] ]
               ];

rule name = element("name")(firstname, surname)
             [
               return x::textnode(firstname+" "+surname);
             ];

etc. etc.

Maybe that could be made to work, but writing out the example above has
made me a bit less optimistic about it. How would it compare with XSLT
in terms of capabilities, performance, syntax, and so on?

Here's another approach. Say I have

<library>
   <document>
     <author>..interesting stuff..</author>
     ...lots and lots of uninteresting stuff...
   </document>
   ...more documents...
</library>

and I just want to extract the authors' names. So, I start off by
parsing it into a tree of generic xml elements, and I then (somehow)
convert those element objects into element-name-specific subclasses.
Or maybe I parse directly into the subclasses, it doesn't matter.
These subclasses implement an extract_authors virtual method: for the
library and document classes it recurses into the children, for
author it returns the content, and for all other subclasses it returns
without doing anything. So I can just call root.extract_authors().
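
A minimal sketch of that idea, to make it concrete. All the names here (Element, Library, Document, Author, Other, demo_authors) are hypothetical and not taken from any existing library, the tree is built by hand rather than parsed, and the results are collected into an output vector instead of being returned directly:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical generic element; by default it contributes nothing.
struct Element {
    std::vector<std::unique_ptr<Element>> children;
    std::string text;
    virtual ~Element() = default;
    virtual void extract_authors(std::vector<std::string>&) const {}
protected:
    // Shared helper for the element types that only forward to children.
    void recurse(std::vector<std::string>& out) const {
        for (const auto& c : children) c->extract_authors(out);
    }
};

// <library> and <document> recurse into their children.
struct Library : Element {
    void extract_authors(std::vector<std::string>& out) const override { recurse(out); }
};
struct Document : Element {
    void extract_authors(std::vector<std::string>& out) const override { recurse(out); }
};

// <author> yields its content.
struct Author : Element {
    void extract_authors(std::vector<std::string>& out) const override { out.push_back(text); }
};

// Everything else inherits the do-nothing default.
struct Other : Element {};

// Builds a two-document library by hand and extracts the authors.
std::vector<std::string> demo_authors() {
    const std::vector<std::string> names{"Ann", "Bob"};
    auto lib = std::make_unique<Library>();
    for (const std::string& name : names) {
        auto doc = std::make_unique<Document>();
        auto author = std::make_unique<Author>();
        author->text = name;
        auto junk = std::make_unique<Other>();
        junk->text = "lots of uninteresting stuff";
        doc->children.push_back(std::move(author));
        doc->children.push_back(std::move(junk));
        lib->children.push_back(std::move(doc));
    }
    assert(lib->children.size() == names.size());  // sanity check on the tree
    std::vector<std::string> out;
    lib->extract_authors(out);
    return out;
}
```

The obvious cost is that every new query needs another virtual method on every element class, which is where something visitor-like or XPath-like might be less intrusive.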

Peter Dimov also wrote:
> Doing an XSL transform on a "virtual" document would require an abstract node interface that you
> implement on top of your existing data to provide an XML view for it

I wonder if any serialisation or introspection experts have any
suggestions? I think someone else has also mentioned using XPath-like
expressions for exploring non-XML tree structures.
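
To illustrate the kind of abstract node interface Peter describes, here is a rough sketch under my own assumptions. NodeView, TextLeaf, PersonView and to_xml are all hypothetical names, Person stands in for "existing data", and the serialiser does no escaping; a real XSLT engine would need a much richer interface (attributes, namespaces, parent and sibling axes):

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical read-only node interface that presents data as XML.
struct NodeView {
    virtual ~NodeView() = default;
    virtual std::string name() const = 0;
    virtual std::string text() const = 0;
    virtual std::size_t child_count() const = 0;
    virtual std::unique_ptr<NodeView> child(std::size_t i) const = 0;
};

// Existing application data that knows nothing about XML.
struct Person {
    std::string first;
    std::string last;
};

// A leaf element holding only text.
class TextLeaf : public NodeView {
    std::string name_, text_;
public:
    TextLeaf(std::string n, std::string t) : name_(std::move(n)), text_(std::move(t)) {}
    std::string name() const override { return name_; }
    std::string text() const override { return text_; }
    std::size_t child_count() const override { return 0; }
    std::unique_ptr<NodeView> child(std::size_t) const override { return nullptr; }
};

// Presents a Person as <person><firstname/><surname/></person>,
// creating child views on demand rather than building a whole tree.
class PersonView : public NodeView {
    const Person& p_;
public:
    explicit PersonView(const Person& p) : p_(p) {}
    std::string name() const override { return "person"; }
    std::string text() const override { return ""; }
    std::size_t child_count() const override { return 2; }
    std::unique_ptr<NodeView> child(std::size_t i) const override {
        assert(i < 2);
        if (i == 0) return std::make_unique<TextLeaf>("firstname", p_.first);
        return std::make_unique<TextLeaf>("surname", p_.last);
    }
};

// A generic consumer written only against NodeView (no escaping of text).
std::string to_xml(const NodeView& n) {
    std::string out = "<" + n.name() + ">" + n.text();
    for (std::size_t i = 0; i < n.child_count(); ++i) out += to_xml(*n.child(i));
    return out + "</" + n.name() + ">";
}
```

Anything written against NodeView, whether a serialiser like to_xml or in principle a transform, never needs the whole document materialised in memory, which was the original problem.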

Regards,

Phil.

