From: Graham Bennett (graham-boost_at_[hidden])
Date: 2005-11-10 20:13:20


On Wed, Nov 09, 2005 at 09:21:45PM -0500, Stefan Seefeld wrote:
> Graham Bennett wrote:
> > Hi Doug,
> >
> > On Tue, Nov 08, 2005 at 09:28:09PM -0500, Douglas Gregor wrote:
>
> >>Readers are important for some things, DOM is important for other
> >>things, but there's no reason to tie the two together in one library
> >>or predicate one on the other.
> >
> >
> > Well, there is at least one reason - if the DOM is built on top of a
> > reader interface then the DOM library doesn't have to know how to
> > parse XML, and is not tied to any particular parser. Even if you
> > don't agree with using a reader interface for this separation layer,
> > I'd hope you would agree that some separation is at least necessary.
>
> I wish people would stop being so parser-focussed. I reiterate: the
> API I suggest is about manipulating a DOM tree. The fact that you
> *might* want to construct it from an XML file by means of a parser is
> almost coincidental.

I agree that the way the DOM is created doesn't really have anything to
do with a parser or anything else. It's perfectly possible to put the
DOM together any way you want. I think people have expressed concern
that the intention might be to ship the library with a libxml2 (or any
specific parser) implementation for building the DOM from text, which I
don't think would be a good idea. I was suggesting that having a way to
build the DOM from a standardised interface, like a reader, would be a
way to separate these concerns.
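
To make the separation concrete, here is a rough sketch of what I have in
mind. Every name in it (XmlReader, NodeKind, Element, build_dom,
ReplayReader) is hypothetical and invented for illustration; none of it is
taken from the submitted library or from libxml2. The point is only that
build_dom sees nothing but the reader interface, so it cannot know or care
which parser (if any) sits behind it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node kinds a pull-style reader could report.
enum class NodeKind { StartElement, EndElement, Text };

// Hypothetical reader interface: the only thing the DOM builder depends on.
struct XmlReader {
    virtual ~XmlReader() = default;
    virtual bool read() = 0;                       // advance; false at end of input
    virtual NodeKind kind() const = 0;             // kind of the current node
    virtual const std::string& value() const = 0;  // element name or text content
};

// Deliberately tiny DOM node, just enough to show the idea.
struct Element {
    std::string name;
    std::string text;
    std::vector<std::unique_ptr<Element>> children;
};

// Builds a tree from any XmlReader; contains no parsing logic at all.
std::unique_ptr<Element> build_dom(XmlReader& reader) {
    std::unique_ptr<Element> root;
    std::vector<Element*> open;  // stack of elements not yet closed
    while (reader.read()) {
        switch (reader.kind()) {
        case NodeKind::StartElement: {
            auto node = std::make_unique<Element>();
            node->name = reader.value();
            Element* raw = node.get();
            if (open.empty())
                root = std::move(node);
            else
                open.back()->children.push_back(std::move(node));
            open.push_back(raw);
            break;
        }
        case NodeKind::EndElement:
            if (!open.empty()) open.pop_back();
            break;
        case NodeKind::Text:
            if (!open.empty()) open.back()->text += reader.value();
            break;
        }
    }
    return root;
}

// Stand-in reader that replays a fixed list of events. A real one would
// wrap a parser; the builder above would not change.
struct ReplayReader : XmlReader {
    std::vector<std::pair<NodeKind, std::string>> events;
    std::size_t pos = 0;  // one past the current event; 0 before the first read()

    explicit ReplayReader(std::vector<std::pair<NodeKind, std::string>> e)
        : events(std::move(e)) {}

    bool read() override {
        if (pos >= events.size()) return false;
        ++pos;
        return true;
    }
    NodeKind kind() const override {
        assert(pos > 0);  // read() must have succeeded first
        return events[pos - 1].first;
    }
    const std::string& value() const override {
        assert(pos > 0);  // read() must have succeeded first
        return events[pos - 1].second;
    }
};
```

Swapping ReplayReader for a reader backed by libxml2, expat or anything
else would leave build_dom untouched, which is the decoupling I'm after.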
 
> Yes indeed, an implementation of such an XML parser will most likely
> use either a SAX or an XmlReader layer beneath, and in fact, libxml2
> does exactly that and it would be quite natural to expose those APIs
> to C++ in a similar way I propose the DOM wrapper.

Ok, I agree.
 
> >>We can have an XML DOM library that allows reading, traversing,
> >>modifying, and writing XML documents, then later turn the reading
> >>part into a full-fledged streaming interface for those
> >>applications.
> >
> >
> > Can you elaborate on how you would enable a DOM structure to present
> > a streaming interface?
>
> Not the DOM structure, but the parser! It's exactly what you are
> saying above: Each sensible XML parser will use an API underneath that
> can be used to build a public SAX or XmlReader (or both) on top of.
>
> But instead of requiring the parser to be built on such a C++ API I
> use a C implementation that already contains multiple APIs, and I wrap
> them *separately* into C++ APIs. For a user of the C++ DOM API it is
> totally irrelevant whether the implementation is based on the C++ SAX
> API or an internal C SAX API, as long as it adheres to the
> specification.
>
>
> > Are you talking about lazy tree building or something else? In any
> > case, I would think it's inherently difficult to retrofit a
> > streaming interface. Much better to build the streaming interface
> > from the start, and build the DOM on top of it. This can only be
> > good for both sides - the reader gets to just be a reader, and the
> > DOM gets to just be a DOM.
>
> You haven't talked about the DOM yet, only about a parser.

I think I wasn't clear in my previous mail. I'm not at all concerned
with parsers; there are plenty of them and they do a good job. I'm not
suggesting a parser should be implemented. The only thing I am
concerned about is that Boost define a standard streaming XML reader
API. That is where I think there is a distinct need in C++ at the
moment.

> You still need to provide all the other missing bits, such as an XPath
> lookup mechanism, XInclude processing, http support for URI lookup,
> etc., etc. I can't stress it enough: the parser is really just a tiny
> bit of it all.

Agreed that the parser is a small part, but so is the DOM. All of the
things you mention above can and should be implemented independently of
a DOM model, IMO.

Please don't think that I'm against a Boost DOM implementation; I think
it's a worthy effort and what you have submitted is a good start. I
just think that a standardised reader interface is a much more important
integration point than DOM, and I'm suggesting that it would be
worthwhile putting effort into that area sooner rather than later.

cheers,

Graham

-- 
Graham Bennett
