Subject: Re: [boost] Standard c++ XML parser API (Boost.XML)
From: Stefan Seefeld (stefan_at_[hidden])
Date: 2014-03-18 11:46:19


On 03/18/2014 11:19 AM, Bjorn Reese wrote:
> On 03/17/2014 10:03 PM, Stefan Seefeld wrote:
>
>> Just for the record: I have been collaborating with Saksham over the
>> last couple of weeks to refine ideas that could be cast into a formal
>> project submission.
>
> Where is the latest proposal?

I believe Saksham is working on a formal submission right now.

>
>> It would be really great if someone else would provide feedback, too.
>
> I would like to see how the proposal relates to other Boost libraries.
> For instance, how well does it integrate with Boost.Serialization or
> Boost.Fusion? Can it be used to replace the XML parser inside
> Boost.PropertyTree?

The idea is to provide complete but modular XML APIs. Complete in the
sense that it can handle any well-formed XML, and modular in the sense
that orthogonal pieces of functionality are kept separate so users can
pull in only the pieces they need.
I don't see any reason why such an XML API wouldn't be usable by other
Boost libraries.

>
>
> The remaining comments are related to the GitHub code, as I suspect
> that you want it to be used in the GSoC project:
>
> https://github.com/stefanseefeld/boost.xml

I suggest you look at that as a proof of concept, not something
finished. When I originally wrote that library, I used libxml2 as its
only backend. However, doing that carries the risk that the entire API
ends up closely tied to the idiosyncrasies of that one backend, so
adding support for more backends (such as Xerces) will help validate
that the API is in fact backend-agnostic.

>
> The code could be made easier to understand with documentation.

Agreed.
>
> Will it be possible to output streaming XML? (xmlTextWriter)

That's a nice idea.

>
> The DOM and Reader parsers assume that input is in a file. What if I
> want to process a buffer in memory?

Right, that should be possible. (I know libxml2 supports that, so at
least for that backend it seems trivial to add the missing wrapper.)

> What is the purpose of the S template argument?

To keep the choice of Unicode (or any other) string type orthogonal to
the XML library, i.e. to allow Boost.XML to interact with different
Unicode implementations. In the existing demos I restrict content to
ASCII, so I can get away with using std::string. This is a good example
of the "modularity" design goal I mentioned above: don't force anything
on users that they don't actually need.
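
As a rough illustration of the idea (a hypothetical sketch, not the code
in the repository; basic_element and its members are made-up names), a
node type parameterized on the string type S might look like this:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type parameterized on the string type S.
// Illustrative only; this is not the actual Boost.XML class.
template <typename S>
class basic_element
{
public:
  explicit basic_element(S name) : name_(std::move(name))
  {
    assert(!name_.empty()); // an element needs a name
  }

  S const &name() const { return name_; }

  // Adds the attribute, or overwrites its value if the key already exists.
  void set_attribute(S key, S value)
  {
    for (auto &a : attributes_)
      if (a.first == key) { a.second = std::move(value); return; }
    attributes_.emplace_back(std::move(key), std::move(value));
  }

  // Returns a pointer to the value, or nullptr if the attribute is absent.
  S const *attribute(S const &key) const
  {
    for (auto const &a : attributes_)
      if (a.first == key) return &a.second;
    return nullptr;
  }

private:
  S name_;
  std::vector<std::pair<S, S>> attributes_;
};

// ASCII-only users can bind S to std::string ...
using element = basic_element<std::string>;
// ... while others could bind it to a wider or Unicode-aware string type.
using u16element = basic_element<std::u16string>;
```

Nothing in the class knows about encodings; that concern stays entirely
with whatever type is bound to S.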

> What is the purpose of the convert trait?

To allow conversion between the backend's own string representation and
the string type that is used with Boost.XML.
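
Roughly, and only as a sketch under assumptions (the names converter,
in, out and backend_char are invented here and need not match the
repository), such a trait could be specialized per string type like
this. libxml2's xmlChar is a typedef for unsigned char, which is what
backend_char stands in for:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for a backend's native character type (hypothetical name).
using backend_char = unsigned char;

// Primary template left undefined: each string type S has to opt in
// by providing a specialization.
template <typename S> struct converter;

// Specialization for std::string, assuming ASCII or UTF-8 content so
// that a byte-for-byte copy is sufficient.
template <>
struct converter<std::string>
{
  // Backend representation -> user-facing string.
  static std::string in(backend_char const *text)
  {
    assert(text != nullptr);
    return std::string(reinterpret_cast<char const *>(text));
  }

  // User-facing string -> NUL-terminated buffer for a C-style backend.
  static std::vector<backend_char> out(std::string const &s)
  {
    std::vector<backend_char> buffer(s.begin(), s.end());
    buffer.push_back(0);
    return buffer;
  }
};
```

A binding to a real Unicode library would provide its own
specialization and do the actual transcoding in in() and out().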

>
> How are different XML encodings handled?

Can you be more specific? I suspect the answer is that this is
handled by the Unicode component to which Boost.XML gets bound by means
of the string template parameter.

> token_base::get_token() returns information about the current token,
> but it seems to be invalidated (or updated to the new token) after
> calling parser::next(). Is that correct?

Yes.
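
To illustrate the consequence for callers with a toy example (a
hypothetical word_reader over whitespace-separated words, not the
actual Boost.XML reader): anything obtained from get_token() has to be
copied if it is still needed after the next call to next().

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Toy pull reader; it only mimics the token lifetime rule discussed above.
class word_reader
{
public:
  explicit word_reader(std::string input) : input_(std::move(input)) {}

  // Advances to the next token, overwriting the current one.
  // Returns false once the input is exhausted.
  bool next()
  {
    std::size_t begin = input_.find_first_not_of(' ', pos_);
    if (begin == std::string::npos) return false;
    std::size_t end = input_.find(' ', begin);
    if (end == std::string::npos) end = input_.size();
    current_.assign(input_, begin, end - begin);
    pos_ = end;
    valid_ = true;
    return true;
  }

  // Refers to storage owned by the reader; its value changes on the
  // next successful call to next().
  std::string const &get_token() const
  {
    assert(valid_); // next() must have succeeded at least once
    return current_;
  }

private:
  std::string input_;
  std::size_t pos_ = 0;
  std::string current_;
  bool valid_ = false;
};
```

A caller that holds on to the reference returned by get_token() will
see the new token after next(); a caller that copied the string keeps
the old one.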

    Stefan

-- 
      ...ich hab' noch einen Koffer in Berlin...
