From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2007-07-12 11:15:28
Peter Dimov wrote:
>Phil Endecott wrote:
>> overhead of about 150 bytes per node.
> After such an experience my thoughts would be less oriented towards changing
> the XML in-memory class and more towards refactoring the application to not
> build the entire XML document in memory.
Yes - but...
> this would likely mean no XSLT.
..which was exactly what I needed to do.
That brings up another question; which of these approaches do people prefer:
- Use a C++ XML library that runs on a libxml2 backend, so that it can
also use libxslt to do XSLT transformations.
- Use a standalone C++ XML library that is incompatible with libxslt,
and instead do XSLT-like transformations in C++.
This question brings us back a bit closer to the features of Stefan's
proposal (which I think could be extended to do XSLT using libxslt). I
think that if I were starting another project of the sort that I
described before I would probably avoid XSLT - with hindsight I
overstretched it. But what would ideal C++ XML transforming (or just
XML reading) code look like? As gchen writes: "creating xml does't
seem a big problem, but writing xml-reader [..] is a time-consuming
task". Boost.Spirit can match things; can we use something vaguely
like Spirit syntax to match XML fragments, and define actions to apply
to them?
  rule input = * person;
  rule person = element("person")(name, birth, father, mother)
    // meaning all needed but in any order
    [ // this is a Spirit-style "action" [].
      return h::html[ h::body [
        // these are declarative-XML [].
        // h:: is an html element namespace
        // _1 refers to 'name' above; did Spirit 2 add this?
        h::h1("Timeline for "+_1),
        _2, _3, _4
      ] ]
    ];
  rule name = element("name")(firstname, surname)
    [
      return x::textnode(firstname+" "+surname);
    ];
  etc. etc.
Maybe that could be made to work, but writing out the example above has
made me a bit less optimistic about it. How would it compare with XSLT
in terms of capabilities, performance, syntax, and so on?
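
To make the idea a little more concrete, here is a minimal sketch in plain
C++17 of rules that match an element by name, require each sub-rule to match
some child (in any order), and then run an action on the results. It is not
Spirit and not any existing library: node, rule, text_of and
timeline_heading are all hypothetical names, the actions produce plain
strings rather than declarative XML, and the person rule only looks at name
to keep it short.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical generic tree node, standing in for whatever a parser produces.
struct node {
    std::string name;
    std::string text;
    std::vector<node> children;
};

// Hypothetical rule: matches an element called `name` whose children satisfy
// every rule in `parts` (in any order), then applies `action` to the results.
struct rule {
    using action_fn =
        std::function<std::string(const node&, const std::vector<std::string>&)>;

    std::string name;
    std::vector<const rule*> parts;
    action_fn action;

    std::optional<std::string> match(const node& n) const {
        if (n.name != name) return std::nullopt;
        std::vector<std::string> results;  // one entry per part, in rule order
        for (const rule* part : parts) {
            std::optional<std::string> found;
            for (const node& child : n.children) {
                found = part->match(child);
                if (found) break;  // first matching child wins
            }
            if (!found) return std::nullopt;  // a required part is missing
            results.push_back(*found);
        }
        return action(n, results);
    }
};

// Action for leaf elements: just return the character content.
std::string text_of(const node& n, const std::vector<std::string>&) {
    return n.text;
}

// Builds the rules and applies them to one <person> element.
std::optional<std::string> timeline_heading(const node& person_node) {
    const rule firstname{"firstname", {}, text_of};
    const rule surname{"surname", {}, text_of};
    const rule name{"name", {&firstname, &surname},
        [](const node&, const std::vector<std::string>& r) {
            return r[0] + " " + r[1];  // r[0] is firstname, r[1] is surname
        }};
    const rule person{"person", {&name},
        [](const node&, const std::vector<std::string>& r) {
            return "Timeline for " + r[0];  // r[0] plays the role of _1
        }};
    return person.match(person_node);
}
```

This sketch does not track which children have already been consumed, so two
sub-rules with the same element name would both match the same child; a real
design would have to decide how to handle that, along with repetition and
optional parts.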
Here's another approach. Say I have
  <library>
    <document>
      <author>..interesting stuff..</author>
      ...lots and lots of uninteresting stuff...
    </document>
    ...more documents...
  </library>
and I just want to extract the authors' names. So, I start off by
parsing it into a tree of generic xml elements, and I then (somehow)
convert those element objects into element-name-specific subclasses.
Or maybe I parse directly into the subclasses, it doesn't matter.
These subclasses implement an extract_authors virtual method: the
library and document classes recurse into their children, the author
class returns its content, and all other subclasses return without
doing anything. So I can just call root.extract_authors().
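
A minimal C++14 sketch of that virtual-method approach might look like the
following. All class names and the make_element factory are hypothetical;
the factory stands in for the "somehow" step of turning element names into
subclasses, and the method here appends to an output vector rather than
returning a value, which is just one possible signature.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base class for all parsed elements.
struct element {
    std::vector<std::unique_ptr<element>> children;
    std::string text;  // character content; only used by leaf elements here

    virtual ~element() = default;

    // Default behaviour: uninteresting elements add nothing and do not recurse.
    virtual void extract_authors(std::vector<std::string>&) const {}

protected:
    // Helper for container elements: forward the call to every child.
    void recurse(std::vector<std::string>& out) const {
        for (const auto& c : children) c->extract_authors(out);
    }
};

struct library_element : element {
    void extract_authors(std::vector<std::string>& out) const override {
        recurse(out);
    }
};

struct document_element : element {
    void extract_authors(std::vector<std::string>& out) const override {
        recurse(out);
    }
};

struct author_element : element {
    void extract_authors(std::vector<std::string>& out) const override {
        out.push_back(text);
    }
};

// Stands in for every element type we do not care about.
struct other_element : element {};

// Hypothetical factory mapping an element name to its subclass.
std::unique_ptr<element> make_element(const std::string& name,
                                      std::string text = "") {
    std::unique_ptr<element> e;
    if (name == "library") e = std::make_unique<library_element>();
    else if (name == "document") e = std::make_unique<document_element>();
    else if (name == "author") e = std::make_unique<author_element>();
    else e = std::make_unique<other_element>();
    e->text = std::move(text);
    return e;
}
```

Because other_element inherits the do-nothing default, whole uninteresting
subtrees are skipped without being visited, which is the point of the
approach; the cost is one virtual method per query on every element class.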
Peter Dimov also wrote:
> Doing an XSL transform on a "virtual" document would require an abstract node interface that you
> implement on top of your existing data to provide an XML view for it
I wonder if any serialisation or introspection experts have any
suggestions? I think someone else has also mentioned using XPath-like
expressions for exploring non-XML tree structures.
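
As a starting point for discussion, an abstract node interface of the kind
Peter describes could be sketched roughly as below (C++14). Everything here
is an assumption for illustration: node_view, text_view, person_view and
to_xml are hypothetical names, the person struct stands in for "your
existing data", and to_xml does no escaping of special characters.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical read-only XML view that generic code can walk.
struct node_view {
    virtual ~node_view() = default;
    virtual std::string name() const = 0;
    virtual std::string text() const = 0;
    virtual std::size_t child_count() const = 0;
    virtual std::unique_ptr<node_view> child(std::size_t i) const = 0;
};

// Existing application data, not stored as XML anywhere.
struct person {
    std::string first;
    std::string last;
};

// Leaf view: an element with a name and character content, no children.
class text_view : public node_view {
    std::string name_;
    std::string value_;
public:
    text_view(std::string name, std::string value)
        : name_(std::move(name)), value_(std::move(value)) {}
    std::string name() const override { return name_; }
    std::string text() const override { return value_; }
    std::size_t child_count() const override { return 0; }
    std::unique_ptr<node_view> child(std::size_t) const override {
        return nullptr;
    }
};

// Presents a person as <person><firstname/><surname/></person>.
// Child views are created on demand; no document tree is ever built.
class person_view : public node_view {
    const person& p_;
public:
    explicit person_view(const person& p) : p_(p) {}
    std::string name() const override { return "person"; }
    std::string text() const override { return ""; }
    std::size_t child_count() const override { return 2; }
    std::unique_ptr<node_view> child(std::size_t i) const override {
        if (i == 0) return std::make_unique<text_view>("firstname", p_.first);
        if (i == 1) return std::make_unique<text_view>("surname", p_.last);
        return nullptr;
    }
};

// Generic consumer that only knows the interface: serialise any view.
std::string to_xml(const node_view& n) {
    std::string out = "<" + n.name() + ">" + n.text();
    for (std::size_t i = 0; i < n.child_count(); ++i) out += to_xml(*n.child(i));
    return out + "</" + n.name() + ">";
}
```

A transform engine or an XPath-like query written against node_view would
then work on any data structure that can supply such a view, which is where
serialisation or introspection machinery might generate the view classes
automatically.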
Regards,
Phil.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk