From: Graham Bennett (graham-boost_at_[hidden])
Date: 2005-11-10 20:13:20

On Wed, Nov 09, 2005 at 09:21:45PM -0500, Stefan Seefeld wrote:
> Graham Bennett wrote:
> > Hi Doug,
> >
> > On Tue, Nov 08, 2005 at 09:28:09PM -0500, Douglas Gregor wrote:
> >>Readers are important for some things, DOM is important for other
> >>things, but there's no reason to tie the two together in one library
> >>or predicate one on the other.
> >
> >
> > Well, there is at least one reason - if the DOM is built on top of a
> > reader interface then the DOM library doesn't have to know how to
> > parse XML, and is not tied to any particular parser. Even if you
> > don't agree with using a reader interface for this separation layer,
> > I'd hope you would agree that some separation is at least necessary.
> I wish people would stop being so parser-focussed. I reiterate: the
> API I suggest is about manipulating a DOM tree. The fact that you
> *might* want to construct it from an XML file by means of a parser is
> almost coincidental.

I agree that the way the DOM is created doesn't really have anything to
do with a parser or anything else. It's perfectly possible to put the
DOM together any way you want. I think people have expressed concern
that the intention might be to ship the library with a libxml2 (or any
specific parser) implementation for building the DOM from text, which I
don't think would be a good idea. I was suggesting that having a way to
build the DOM from a standardised interface, like a reader, would be a
way to separate these concerns.
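As a rough illustration of that separation, here is a minimal C++ sketch in which the DOM builder sees only an abstract pull reader and never a parser. Every name in it (xml_reader, node_kind, dom_node, build_dom, vector_reader) is hypothetical and invented for this example; none of it comes from the submitted library, from libxml2, or from any existing Boost API. It also assumes the reader delivers well-formed, properly nested events.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event kinds a pull reader might report.
enum class node_kind { start_element, end_element, text };

// Hypothetical reader interface: the only thing the DOM builder depends on.
class xml_reader {
public:
    virtual ~xml_reader() = default;
    virtual bool read() = 0;                       // advance; false at end of input
    virtual node_kind kind() const = 0;            // kind of the current node
    virtual const std::string& value() const = 0;  // element name or text content
};

// Deliberately tiny DOM node, for illustration only.
struct dom_node {
    std::string name;  // element name (empty for text nodes)
    std::string text;  // text content (empty for elements)
    std::vector<std::unique_ptr<dom_node>> children;
};

// Builds a tree from any reader. Returns null if no element was seen.
std::unique_ptr<dom_node> build_dom(xml_reader& reader) {
    std::unique_ptr<dom_node> root;
    std::vector<dom_node*> open;  // stack of elements that are still open
    while (reader.read()) {
        switch (reader.kind()) {
        case node_kind::start_element: {
            auto element = std::make_unique<dom_node>();
            element->name = reader.value();
            dom_node* raw = element.get();
            if (open.empty()) {
                assert(!root && "assumes a single root element");
                root = std::move(element);
            } else {
                open.back()->children.push_back(std::move(element));
            }
            open.push_back(raw);
            break;
        }
        case node_kind::end_element:
            assert(!open.empty() && "assumes balanced start/end events");
            open.pop_back();
            break;
        case node_kind::text: {
            assert(!open.empty() && "assumes text appears inside an element");
            auto text_node = std::make_unique<dom_node>();
            text_node->text = reader.value();
            open.back()->children.push_back(std::move(text_node));
            break;
        }
        }
    }
    return root;
}

// Stand-in reader that replays a fixed list of events instead of parsing text.
class vector_reader : public xml_reader {
public:
    using event = std::pair<node_kind, std::string>;
    explicit vector_reader(std::vector<event> events) : events_(std::move(events)) {}
    bool read() override {
        if (next_ >= events_.size()) return false;
        current_ = next_++;
        return true;
    }
    node_kind kind() const override { return events_[current_].first; }
    const std::string& value() const override { return events_[current_].second; }
private:
    std::vector<event> events_;
    std::size_t next_ = 0;
    std::size_t current_ = 0;
};
```

Here build_dom never learns where its events come from. The vector_reader above simply replays a list, and a reader backed by a real parser would only need to implement the same three member functions.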
> Yes indeed, an implementation of such an XML parser will most likely
> use either a SAX or an XmlReader layer beneath, and in fact, libxml2
> does exactly that and it would be quite natural to expose those APIs
> to C++ in a similar way I propose the DOM wrapper.

Ok, I agree.

> >>We can have a XML DOM library that allows reading, traversing,
> >>modifying, and writing XML documents, then later turn the reading
> >>part into a full-fledged streaming interface for those
> >>applications.
> >
> >
> > Can you elaborate on how you would enable a DOM structure to present
> > a streaming interface?
> Not the DOM structure, but the parser! It's exactly what you are
> saying above: Each sensible XML parser will use an API underneath that
> can be used to build a public SAX or XmlReader (or both) on top of.
> But instead of requiring the parser to be built on such a C++ API I
> use a C implementation that already contains multiple APIs, and I wrap
> them *separately* into C++ APIs. For a user of the C++ DOM API it is
> totally irrelevant whether the implementation is based on the C++ SAX
> API or an internal C SAX API, as long as it adheres to the
> specification.
>
> > Are you talking about lazy tree building or something else? In any
> > case, I would think it's inherently difficult to retrofit a
> > streaming interface. Much better to build the streaming interface
> > from the start, and build the DOM on top of it. This can only be
> > good for both sides - the reader gets to just be a reader, and the
> > DOM gets to just be a DOM.
> You haven't talked about the DOM yet, only about a parser.

I think I wasn't clear in my previous mail. I'm not at all concerned
with parsers, there are plenty of them and they do a good job. I'm not
suggesting a parser should be implemented. The only thing I am
concerned about is that Boost define a standard streaming XML reader
API. That is where I think there is a distinct need in C++ at the
moment.

> You still need to provide all the other missing bits, such as an XPath
> lookup mechanism, XInclude processing, http support for URI lookup,
> etc., etc. I can't stress it enough: the parser is really just a tiny
> bit of it all.

Agreed that the parser is a small part, but so is the DOM. All of the
things you mention above can and should be implemented independently of
a DOM model, IMO.

Please don't think that I'm against a Boost DOM implementation, I think
it's a worthy effort and what you have submitted is a good start. I
just think that a standardised reader interface is a much more important
integration point than DOM, and I'm suggesting that it would be
worthwhile putting effort into that area sooner rather than later.



Graham Bennett
