From: Dan Nuffer (dnuffer_at_[hidden])
Date: 2001-09-29 17:17:00


Hi,

I've had quite a bit of experience with XML. I've written DTDs, used
both SAX and DOM (via Xerces), and written two XML parsers (not
complete, but functional enough).

I'm willing to help out with any effort to create a Boost XML parser. I
like Dietmar's suggestion for a layered tokenizing iterator approach.

A few notes about the two parsers I've worked on:
The first is part of OpenWBEM. It is a pull-based parser that does only
the minimum needed to parse XML: no DTD processing or anything fancy.
It just handles elements and makes sure begin/end tag pairs match.
The relevant files are src/xml/OW_XMLParserSax.hpp and
src/xml/OW_XMLParserSax.cpp, which can be seen at the following URLs:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/openwbem/openwbem/src/xml/OW_XMLParserSax.hpp?rev=1.5&content-type=text/vnd.viewcvs-markup

http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/openwbem/openwbem/src/xml/OW_XMLParserSax.cpp?rev=1.5&content-type=text/vnd.viewcvs-markup

This parser was written to replace Xerces. Our previous code used a SAX
interface, so a SAX layer is implemented as well. It's basically:
while (have more XML elements)
{
    get the next element;
    call the appropriate SAX interface function;
}
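
To make that loop concrete, here is a minimal C++ sketch of a pull
parser driving SAX-style callbacks. It is not the OpenWBEM code: the
names PullParser, Token, SaxHandler, RecordingHandler and parseWithSax
are hypothetical, and the parser handles only <name>, </name> and text
(no attributes, comments, entities or DTDs). Like the parser described
above, it only checks that begin/end tag pairs match.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token kinds; these names are assumptions, not OpenWBEM's.
enum class TokenKind { StartTag, EndTag, Text, EndOfInput };

struct Token {
    TokenKind kind;
    std::string value;  // element name, or character data for Text
};

// Minimal pull parser: the caller asks for one token at a time.
class PullParser {
public:
    explicit PullParser(std::string input) : input_(std::move(input)) {}

    Token next() {
        if (pos_ >= input_.size()) {
            if (!open_.empty())
                throw std::runtime_error("unclosed element: " + open_.back());
            return {TokenKind::EndOfInput, ""};
        }
        if (input_[pos_] != '<') {  // character data up to the next tag
            std::size_t end = input_.find('<', pos_);
            if (end == std::string::npos) end = input_.size();
            std::string text = input_.substr(pos_, end - pos_);
            pos_ = end;
            return {TokenKind::Text, text};
        }
        std::size_t close = input_.find('>', pos_);
        if (close == std::string::npos)
            throw std::runtime_error("unterminated tag");
        bool isEnd = pos_ + 1 < close && input_[pos_ + 1] == '/';
        std::size_t nameStart = pos_ + (isEnd ? 2 : 1);
        std::string name = input_.substr(nameStart, close - nameStart);
        pos_ = close + 1;
        if (name.empty()) throw std::runtime_error("empty tag name");
        if (isEnd) {  // an end tag must match the innermost open element
            if (open_.empty() || open_.back() != name)
                throw std::runtime_error("mismatched end tag: " + name);
            open_.pop_back();
            return {TokenKind::EndTag, name};
        }
        open_.push_back(name);
        return {TokenKind::StartTag, name};
    }

private:
    std::string input_;
    std::size_t pos_ = 0;
    std::vector<std::string> open_;  // stack of currently open elements
};

// Hypothetical SAX-style callback interface.
struct SaxHandler {
    virtual ~SaxHandler() = default;
    virtual void startElement(const std::string& name) = 0;
    virtual void endElement(const std::string& name) = 0;
    virtual void characters(const std::string& text) = 0;
};

// The loop from the pseudocode: pull the next token, call the matching callback.
void parseWithSax(const std::string& xml, SaxHandler& handler) {
    PullParser parser(xml);
    for (Token t = parser.next(); t.kind != TokenKind::EndOfInput; t = parser.next()) {
        switch (t.kind) {
            case TokenKind::StartTag:   handler.startElement(t.value); break;
            case TokenKind::EndTag:     handler.endElement(t.value);   break;
            case TokenKind::Text:       handler.characters(t.value);   break;
            case TokenKind::EndOfInput: break;  // not reached inside the loop
        }
    }
}

// Example handler that records the events it receives.
struct RecordingHandler : SaxHandler {
    std::vector<std::string> events;
    void startElement(const std::string& n) override { events.push_back("start:" + n); }
    void endElement(const std::string& n) override { events.push_back("end:" + n); }
    void characters(const std::string& t) override { events.push_back("text:" + t); }
};
```

Passing a RecordingHandler to parseWithSax collects the callbacks in
document order. Mismatched or unclosed tags surface as a
std::runtime_error thrown from PullParser::next().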

The other parser was written as an example for the Spirit parser
library. It implements the *complete* XML grammar but doesn't check any
semantics; it can only determine whether the input matches the grammar.
Thanks to Spirit, it should be relatively easy to add extra
functionality and checking to the parser.
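
To illustrate what "matches the grammar, but no semantics" means, here
is a small hand-written recursive-descent recognizer in C++. It does not
use Spirit, and its toy grammar (elements, empty-element tags and
character data only) is an assumption made for this sketch, far smaller
than the real XML grammar. The names Recognizer and matchesGrammar are
hypothetical. Note that it accepts <a></b>: in the XML 1.0
specification, matching start and end tag names is a well-formedness
constraint layered on top of the grammar productions, which is exactly
the kind of check a pure recognizer leaves out.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Toy grammar (an assumption for this sketch, not the XML grammar):
//   document ::= element
//   element  ::= '<' Name '/>' | '<' Name '>' content '</' Name '>'
//   content  ::= (CharData | element)*
//   Name     ::= [A-Za-z_] [A-Za-z0-9_.-]*
class Recognizer {
public:
    explicit Recognizer(std::string input) : s_(std::move(input)) {}

    // True if the whole input is exactly one element.
    bool document() {
        pos_ = 0;
        return element() && pos_ == s_.size();
    }

private:
    // Consume lit if the input continues with it.
    bool literal(const std::string& lit) {
        if (s_.compare(pos_, lit.size(), lit) != 0) return false;
        pos_ += lit.size();
        return true;
    }

    bool name() {
        if (pos_ >= s_.size()) return false;
        unsigned char c = static_cast<unsigned char>(s_[pos_]);
        if (!(std::isalpha(c) || c == '_')) return false;
        ++pos_;
        while (pos_ < s_.size()) {
            c = static_cast<unsigned char>(s_[pos_]);
            if (!(std::isalnum(c) || c == '_' || c == '-' || c == '.')) break;
            ++pos_;
        }
        return true;
    }

    bool element() {
        std::size_t save = pos_;
        if (literal("<") && name()) {
            if (literal("/>")) return true;  // empty-element tag
            if (literal(">")) {
                content();
                // Grammar only: the end tag name is not compared with the start tag.
                if (literal("</") && name() && literal(">")) return true;
            }
        }
        pos_ = save;  // backtrack on failure
        return false;
    }

    void content() {
        for (;;) {
            if (pos_ < s_.size() && s_[pos_] != '<') { ++pos_; continue; }  // CharData
            if (!element()) return;  // end tag or end of input stops content
        }
    }

    std::string s_;
    std::size_t pos_ = 0;
};

// Yes/no answer only, as with the grammar-checking example described above.
bool matchesGrammar(const std::string& input) {
    return Recognizer(input).document();
}
```

Adding a semantic check such as tag-name matching would mean recording
the start tag name in element() and comparing it with the end tag name.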

Here is the URL:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/spirit/spirit/libs/example/xml/xml.cpp?rev=1.4&content-type=text/vnd.viewcvs-markup

Both are open source, so they can be used as a starting point for a
more complete parser, if anyone is interested.

Regardless of what people think of these parsers, I am interested in
helping out.

--Dan Nuffer

