Subject: Re: [boost] [GSOC] XML library of Boost
From: Stefan Seefeld (stefan_at_[hidden])
Date: 2013-05-01 11:15:30


On 2013-04-28 14:08, Amos Ji wrote:
> I've scanned the idea page. The ideas in the page are all very interesting
> and challenging. But what I'm interested in most is XML library, which is
> at the bottom of page. I think the XML format is the most popular standard
> for storing information so it's important for Boost to have a good XML
> library.
>
> However, I know Boost includes RapidXML in the property_tree library to
> parse XML files. So what I want to confirm is that this project requires
> implementing a new XML parser rather than enhancing RapidXML. Am I
> correct? If so, I have some ideas to share with you.

I don't think your assumptions are entirely correct. First, I think the
"XML" project on the ideas page is mis-classified, as it implies a
misunderstanding. XML is neither a parser nor a file format; it is
in fact quite a bit more.

As I have argued many times before on this list, I think it would be
foolish to try to reimplement all the functionality needed to support XML.
There are already quite a few decent implementations available, written
in different languages (mostly C and Java), so it would be more
appropriate to reuse them.

I agree with others that in the context of boost this should be about
defining a good XML API and then mapping it to existing libraries. In
fact, I did that a long time ago by wrapping libxml2. You can
still see the code in the sandbox at
http://svn.boost.org/svn/boost/sandbox/xml/.
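
To make the "API first, backend second" idea concrete, here is a minimal
sketch. It is not the sandbox code: element, element_backend and
memory_backend are hypothetical names invented for illustration, and the
in-memory backend only stands in for an adapter around a real library
such as libxml2.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical backend interface. Each existing XML library would get one
// adapter implementing it; none of these names come from the sandbox code.
class element_backend {
public:
    virtual ~element_backend() = default;
    virtual std::string name() const = 0;
    virtual std::string attribute(const std::string& key) const = 0;
    virtual void set_attribute(const std::string& key,
                               const std::string& value) = 0;
};

// Stand-in adapter that keeps its data in memory so the sketch compiles
// without third-party dependencies. A real adapter would forward each call
// to the wrapped library instead.
class memory_backend : public element_backend {
public:
    explicit memory_backend(std::string name) : name_(std::move(name)) {}

    std::string name() const override { return name_; }

    // Returns an empty string when the attribute is absent.
    std::string attribute(const std::string& key) const override {
        auto it = attrs_.find(key);
        return it == attrs_.end() ? std::string() : it->second;
    }

    void set_attribute(const std::string& key,
                       const std::string& value) override {
        attrs_[key] = value;
    }

private:
    std::string name_;
    std::map<std::string, std::string> attrs_;
};

// Public API type. User code sees only this class, never the backend.
class element {
public:
    explicit element(std::unique_ptr<element_backend> impl)
        : impl_(std::move(impl)) {
        assert(impl_ && "element requires a backend");
    }

    std::string name() const { return impl_->name(); }

    std::string attribute(const std::string& key) const {
        return impl_->attribute(key);
    }

    void set_attribute(const std::string& key, const std::string& value) {
        impl_->set_attribute(key, value);
    }

private:
    std::unique_ptr<element_backend> impl_;
};
```

With this split, switching the underlying library means writing another
element_backend adapter, while code written against element stays
unchanged.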

> In my opinion, an XML parser must be able to do these things:
>
> 1. To Iterate over DOM nodes tree;
> 2. To access the values of nodes and their attributes quickly;
> 3. To insert or delete nodes or attribute of an exact node easily;
> 4. To generate new XML from the structure which stores XML in the
> library.
>
> And there are some optional functions too:
>
> 1. To support XPATH;
> 2. To validate whether the XML file is valid;
> 3. To support various encoding;
> 4. To manage memory better;
> 5. To support regular expression.

I agree with all of the above. Still, I think trying to reimplement this
as a "pure" boost library is the wrong approach. Focus on the API, then
map it to an existing library.
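
As a rough illustration of the API surface the four required items imply
(iterate, access, insert/delete, generate), here is a toy sketch. The
node type and the functions count_nodes and to_xml are hypothetical names,
and the in-memory tree exists only so the example runs on its own; it is
not a suggestion to reimplement the storage. Serialization here does no
character escaping.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy in-memory node, used only to illustrate the shape of the API.
struct node {
    std::string name;
    std::string text;
    // (2) access: attribute values are looked up by name.
    std::map<std::string, std::string> attributes;
    std::vector<std::unique_ptr<node>> children;

    explicit node(std::string n) : name(std::move(n)) {
        assert(!name.empty() && "element name must not be empty");
    }

    // (3) insert: append a child element and return a reference to it.
    node& append_child(const std::string& child_name) {
        children.push_back(std::unique_ptr<node>(new node(child_name)));
        return *children.back();
    }

    // (3) delete: remove every direct child with the given name and
    // return how many were removed.
    std::size_t remove_children(const std::string& child_name) {
        auto first = std::remove_if(
            children.begin(), children.end(),
            [&](const std::unique_ptr<node>& c) {
                return c->name == child_name;
            });
        std::size_t removed =
            static_cast<std::size_t>(children.end() - first);
        children.erase(first, children.end());
        return removed;
    }
};

// (1) iterate: depth-first walk that counts the node and its descendants.
std::size_t count_nodes(const node& n) {
    std::size_t total = 1;
    for (const auto& c : n.children) total += count_nodes(*c);
    return total;
}

// (4) generate: write the tree back out as XML text (no escaping).
std::string to_xml(const node& n) {
    std::string out = "<" + n.name;
    for (const auto& a : n.attributes)
        out += " " + a.first + "=\"" + a.second + "\"";
    if (n.children.empty() && n.text.empty()) return out + "/>";
    out += ">" + n.text;
    for (const auto& c : n.children) out += to_xml(*c);
    return out + "</" + n.name + ">";
}
```

In a wrapper-based design, each of these operations would forward to the
underlying library rather than touch a home-grown tree.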

> The ideas are not mature now. I'll improve them in my proposal.
> In fact, it's not easy to implement a perfect XML parser but I'll do my
> best.
>
> And I have another question. Who will mentor this XML library project?

I would be happy to mentor this.

        Stefan

-- 
      ...ich hab' noch einen Koffer in Berlin...
