From: Damien Fisher (dfisher_at_[hidden])
Date: 2001-09-26 07:31:55


----- Original Message -----
From: <williamkempf_at_[hidden]>
To: <boost_at_[hidden]>
Sent: Tuesday, September 25, 2001 11:01 PM
Subject: [boost] Re: Regression results as XML?

> --- In boost_at_y..., Marshall Clow <marshall_at_i...> wrote:
> > At 12:33 PM +1000 9/25/01, Fisher; Damien Kaine wrote:
> > >On Mon, 24 Sep 2001, Daryle Walker wrote:
> > >
> > >> on 9/24/01 7:29 PM, Jens Maurer at Jens.Maurer_at_g... wrote:
> > >>
> > >> > Damien Fisher wrote:
> > >> >>
> > >> >> Well...why isn't there an XML parser in boost?
> > >> >
> > >> > Because nobody has written one and offered it for inclusion
> > >> > into boost. Feel free to do so.
> > >>
> > >> I was just thinking about this right before reading this message. I heard
> > >> that XML was supposed to be designed so a CS student could write a parser
> > >> within a week. (I guess this means just enough to check for well-formed
> > >> files, and not validation nor all the whiz-bang stuff like XSL, XPath,
> > >> Schema, XQuery, etc.)
> > >>
> > >> Let me think about this a little more....
> > >
> > >I disagree.
> > >
> > >Parsing a simple XML document is easy, and you're right, a CS student could
> > >do it.
> > >
> > >But, as with all W3C specs, doing the whole thing is a little bit
> > >messier. Maybe a 3rd year CS student's level :).
> > >
> > >The main problem is not that the coding is difficult - it is not. But I
> > >have never come across an XML parser whose interface I was really happy
> > >with. They are all a little messy.
> > >
> > >The other thing is the "extra" stuff production XML parsers really have to
> > >include - XSLT transforms, etc. And these have to be pretty high
> > >performance.
> > >
> > >I have done a lot of work with this stuff and I would be quite interested
> > >in contributing to this if it goes ahead.
> >
> > I would be interested as well.
> >
> > Let's start with a description of the interface that people would like to see.
> > Let's nail that down first.
> >
> > P.S. After saying that, I have to point out that I will be unavailable from
> > now until next Monday. ;-) I'll be happy to participate then.
>
> No matter how ugly you think the interface is, any successful parser
> is going to have to include the interfaces for the two current
> standard interfaces: DOM and SAX. Including a third interface that's
> easier to deal with for simple chores may be a great idea, but we
> really do need the standards.

Totally agree. Implementing DOM and SAX isn't difficult (just tedious) and
it will definitely reduce any reluctance to learn about/use the library.

However, I really think that DOM and SAX's most severe flaw is also their
main advantage - their relative language independence. A third interface
which would allow painless (relatively speaking :) ) integration with other
standard C++ libraries is a must. This would mean including simple things,
like standards-conformant iterators of various types over parsed XML
documents. Once we have this in place, DOM is trivial to implement. SAX
can be implemented as a side-effect of the general XML parsing framework,
and is pretty much a piece of cake.
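
To make the iterator idea concrete, here is a rough sketch of what I mean. Everything in it is an assumption for illustration only: the `Element` type and the `count_children` / `find_child` helpers are hypothetical names, not an existing or proposed Boost interface, and the parsing step itself is left out entirely.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical parsed-XML element (illustration only, not a real Boost type).
// The children live in a standard container, so ordinary iterators and
// algorithms work on them without any XML-specific traversal API.
struct Element {
    std::string name;               // tag name, e.g. "test"
    std::string text;               // character data directly inside the element
    std::vector<Element> children;  // child elements in document order

    // Expose the child range so range-based for and <algorithm> just work.
    auto begin() const { return children.begin(); }
    auto end() const { return children.end(); }
};

// Count the direct children of `parent` whose tag name equals `tag`.
inline std::size_t count_children(const Element& parent, const std::string& tag) {
    return static_cast<std::size_t>(std::count_if(
        parent.begin(), parent.end(),
        [&tag](const Element& e) { return e.name == tag; }));
}

// Return the first direct child named `tag`, or nullptr if there is none.
inline const Element* find_child(const Element& parent, const std::string& tag) {
    auto it = std::find_if(parent.begin(), parent.end(),
                           [&tag](const Element& e) { return e.name == tag; });
    return it == parent.end() ? nullptr : &*it;
}
```

With something like this, a caller can write `for (const Element& e : root)` or hand the range straight to any standard algorithm. A DOM-style node class would then mostly be a thin wrapper that forwards to the same structure.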

Thankfully, XML is a conceptually very simple standard. I think that once
several nagging issues are out of the way (Unicode support, the various
encodings supported, etc.) the actual implementation can be done very
quickly. Implementing all the related standards is a drag, but also doable,
although performance is definitely a key goal in any implementation of
stylesheets, etc.

Damien


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk