Subject: Re: [boost] [GSOC] XML library of Boost
From: Roger Martin (roger_at_[hidden])
Date: 2013-05-09 19:24:31


This is a fun topic: how should C++ play catch-up with other languages
on XML handling?

What applications will develop from such an XML API? XML editors and
XML creators/modifiers? Data flow and communications between apps, web
services?
What can be leveraged in C++ to do something new or faster with XML? If
there were a way to compile a shared library at run time and then load
it dynamically, some pretty nifty things could be achieved with
metaprogramming and expression templates.

I'm not sure there are any strong backend candidates that would satisfy
C++ developers and users at this time, but there have to be needs
besides mine.

Xerces is poor at large XML documents. As far as DOM goes, is
rearranging XML elements/attributes being pursued? http://xalan.apache.org/
is XSLT 1.0, and after 2.0 no one wants to go back to 1.0.

Binding is an important area for me. xmlbeanscxx, which is based on
Xerces, couldn't satisfy my need to bind data into my applications
(because the underlying DOM wasn't helpful for the task of binding).
XML Schema constraints are a must for binding. The
http://sourceforge.net/projects/pion/ project could really use a binder
inside its RESTful web service. In other languages, compact
http://relaxng.org/ is getting addressed too.

I just saw http://code.google.com/p/xplus-xsd2cpp/ recently and have yet
to test it. (If you do try it, do so outside of any of your own code and
in its own folder)

To give examples, I use CML, MathML, GraphML, SVG, BibTeXML and a number
of custom XML formats. Each of these has its quirks and is difficult to
bind.

I haven't tried http://vtd-xml.sourceforge.net/ for a while because its
license doesn't work for my company. With custom code I've been doing
something similar for simply reading data from XML documents.

On 05/09/2013 10:26 AM, Stefan Seefeld wrote:
> Bjorn,
>
> we are going in circles, which is in part because we still are talking
> past each other.
>
> In particular, it seems you aren't distinguishing between users and
> developers.
>
> On 05/09/2013 06:00 AM, Bjorn Reese wrote:
>> On 05/08/2013 02:08 PM, Stefan Seefeld wrote:
>>
>>> You are evading the question. A user may not even care how boost.xml is
>>> implemented, as long as the functionality is there. If I'm such a user,
>>> I don't want to be confronted with the question of what backend to pick.
>> Then create a 'boost-xml-standalone' package without dependencies, and
>> let the 'boost-xml' package depend on the 'boost-xml-standalone' and
>> 'libxml2' packages. Problem solved.
> Sorry, what problem is solved ?
>
>>> Right. But again, I think you are making life much harder than it needs
>>> to be for users. As a user I want to use the boost.xml library in my own
>>> project. Do you really anticipate there to be a bunch of different
>>> backends being offered to end-users to pick from, depending on what
>>> functionality he requires ? What a drag ! Just give him a single
>> I thought that this was part of the GSoC proposal, which states:
> [...]
>
> You are citing out of context. Implementing multiple backends has many
> benefits for *developers*, for example as it helps to guarantee that the
> API isn't tied to a particular backend. It should not affect in any way
> *users*, who will only use the boost.xml API (and library), without any
> concern for any particular implementation choice.
>
>> Having said that, with the proper defaults, the user does not have to do
>> anything. Only if he wants to do something different does he need to
>> include another header, pass an extra argument, or whatever. This is
>> how the rest of Boost handles variation. Why has this suddenly become
>> much harder?
> It hasn't, and when expressed that way, I actually agree. What I don't
> agree with is this:
>
>> Start with an XML lexer. This simply returns the next token (start tag,
>> attribute, data, etc.) when called.
>>
>> Put the XML lexer in a loop, and you get a SAX parser.
>>
>> Pair the XML lexer with a parent stack, and you get an XmlReader.
>>
>> Base the DOM parser on the SAX parser to create its tree. This is how
>> libxml2 does it, and how it reuses the tree generator for parsing other
>> formats such as HTML and DocBook.
>>
>> By default, I would provide our own tree, although this is not terribly
>> important.
> While the layering you describe pretty much matches a typical
> implementation, this doesn't have any consequences for users, as these
> layers can't be exchanged. You can't mix a layer from one backend and
> combine it with another layer from a different backend. So why care, on
> an API level ?
>
> I believe your point was that you want to be able to implement only the
> "XML lexer", but neither the SAX nor DOM APIs, and still be able to call
> the result "boost.xml", yes ? I still think this is a bad idea.
> Otherwise, as long as the full functionality is provided, I don't care
> about the implementation, and in particular, whether someone will fancy
> to rewrite it "natively" instead of building on top of existing
> third-party libs.
>
> Stefan
>
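
To make the layering Bjorn describes above concrete, here is a rough
sketch of a lexer, a SAX-style parser built by putting that lexer in a
loop, and a pull reader built by pairing the lexer with a parent stack.
Every name in it (Lexer, SaxHandler, sax_parse, Reader) is made up for
illustration and is not a proposed boost.xml API. The lexer is
deliberately tiny: it only understands start tags, end tags and text,
and assumes no attributes, comments, entities or self-closing tags.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; none of this is an existing Boost API.
enum class TokenKind { StartTag, EndTag, Text, End };

struct Token {
    TokenKind kind;
    std::string value;  // tag name or character data
};

// Layer 1: the lexer returns the next token each time it is called.
// Simplified: no attributes, comments, entities or self-closing tags.
class Lexer {
public:
    explicit Lexer(std::string input) : input_(std::move(input)) {}

    Token next() {
        assert(pos_ <= input_.size());
        if (pos_ == input_.size()) return {TokenKind::End, ""};
        if (input_[pos_] == '<') {
            std::size_t close = input_.find('>', pos_);
            if (close == std::string::npos) {  // unterminated tag: stop
                pos_ = input_.size();
                return {TokenKind::End, ""};
            }
            bool is_end = close > pos_ + 1 && input_[pos_ + 1] == '/';
            std::size_t start = pos_ + (is_end ? 2 : 1);
            std::string name = input_.substr(start, close - start);
            pos_ = close + 1;
            return {is_end ? TokenKind::EndTag : TokenKind::StartTag, name};
        }
        // Character data runs up to the next '<' or the end of input.
        std::size_t lt = input_.find('<', pos_);
        if (lt == std::string::npos) lt = input_.size();
        std::string text = input_.substr(pos_, lt - pos_);
        pos_ = lt;
        return {TokenKind::Text, text};
    }

private:
    std::string input_;
    std::size_t pos_ = 0;
};

// Layer 2: the lexer in a loop, pushing events to callbacks (SAX style).
struct SaxHandler {
    std::function<void(const std::string&)> on_start;
    std::function<void(const std::string&)> on_end;
    std::function<void(const std::string&)> on_text;
};

void sax_parse(const std::string& xml, const SaxHandler& h) {
    Lexer lexer(xml);
    for (Token t = lexer.next(); t.kind != TokenKind::End; t = lexer.next()) {
        switch (t.kind) {
            case TokenKind::StartTag: if (h.on_start) h.on_start(t.value); break;
            case TokenKind::EndTag:   if (h.on_end) h.on_end(t.value); break;
            case TokenKind::Text:     if (h.on_text) h.on_text(t.value); break;
            case TokenKind::End:      break;
        }
    }
}

// Layer 3: the lexer plus a parent stack, pulled by the caller.
class Reader {
public:
    explicit Reader(std::string xml) : lexer_(std::move(xml)) {}

    // Advances to the next token; returns false at end of input.
    bool read() {
        current_ = lexer_.next();
        if (current_.kind == TokenKind::StartTag) {
            parents_.push_back(current_.value);
        } else if (current_.kind == TokenKind::EndTag && !parents_.empty()) {
            parents_.pop_back();
        }
        return current_.kind != TokenKind::End;
    }

    const Token& current() const { return current_; }
    // Number of elements currently open, including one just started.
    std::size_t depth() const { return parents_.size(); }

private:
    Lexer lexer_;
    Token current_{TokenKind::End, ""};
    std::vector<std::string> parents_;
};
```

A caller would either fill in a SaxHandler and hand it to sax_parse, or
loop on Reader::read() and inspect current() and depth(). Both sit on
the same Lexer, which is the point of the layering. A DOM builder could
be written as just another SaxHandler, though that part is not shown
here.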

