From: Anthony Williams (anthony.williamsNOSPAM_at_[hidden])
Date: 2003-06-02 03:49:16


"Reece Dunn" <msclrhd_at_[hidden]> writes:
> Writing an XML parser from scratch for boost should, IMHO, have these
> features:

> [1] It should make use of the Spirit and Regex libraries for XML and XPath
> parsing.

Whilst these libraries might be useful for the parser writer, I don't see any
benefit to requiring their use for a boost XML parser. If a submitted parser
used alternative parsing methods, that should be acceptable provided it worked.

> [2] It should conform to the following W3C standards:
> (a) XML 1.0 SE while looking at supporting XML 1.1 in the future;

Yes

> (b) DOM 1.0/2.0/3.0

Hmm. The DOM standards in particular are very Java oriented, and don't
necessarily make for efficient C++ bindings. I can see that the parser needs
to provide the same set of facilities though, even if it is done in a
different way.

> (c) XPath 1.0 with XPath 2.0 support in the future (switch between
> them??)

Yes.

> (d) Unicode 3.x support (I know this is not a W3C standard, but it is
> related)

Yes.

> while these standards would be useful to have support for:
> (e) XSLT 1.0 with XSLT 2.0 support in the future (select which to use?)

Yes

> (f) XMLSchema 1.0

Yes

> these standards are optional, but should be implementable using the base
> standards:

> (g) XSL:FO 1.0
> (h) MathML 1.0/2.0
> (i) SVG 1.x/2.x - Scalable Vector Graphics

Agreed to all 3.

IMHO, the base parser should provide an API on which other things can be
built. For example, provided the facilities are present to retrieve the
information needed for XPath processing, the core API doesn't need to have an
XPath processor. Likewise for XSLT.

Actually, the same goes for just about everything, including DTD and Schema
validation --- provided the parser makes all the required information
available then the validation scheme can be provided on top of the core API.
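
As a rough illustration of that layering, here is a minimal sketch. It is
only a sketch under stated assumptions: EventHandler, parse_tags,
ChildRuleValidator and find_violations are hypothetical names invented for
this example and are not taken from Axemill or any existing library, and
parse_tags is a toy stand-in that only understands bare <name> and </name>
tags. The point is that the validator sees nothing but the events the core
API reports, so the core needs no knowledge of validation.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical core API: the parser reports what it sees and nothing more.
struct EventHandler {
    virtual ~EventHandler() {}
    virtual void start_element(const std::string& name) = 0;
    virtual void end_element(const std::string& name) = 0;
};

// Toy stand-in for a real parser (assumption: input contains only bare
// <name> and </name> tags; attributes and text are not handled).
void parse_tags(const std::string& xml, EventHandler& handler) {
    std::string::size_type pos = 0;
    while ((pos = xml.find('<', pos)) != std::string::npos) {
        std::string::size_type end = xml.find('>', pos);
        if (end == std::string::npos) break;
        std::string tag = xml.substr(pos + 1, end - pos - 1);
        if (!tag.empty() && tag[0] == '/')
            handler.end_element(tag.substr(1));
        else
            handler.start_element(tag);
        pos = end + 1;
    }
}

// Validation layered on top: checks which children each parent may contain.
// It relies only on the events above, not on parser internals.
class ChildRuleValidator : public EventHandler {
public:
    typedef std::map<std::string, std::set<std::string> > Rules;

    explicit ChildRuleValidator(const Rules& rules) : rules_(rules) {}

    void start_element(const std::string& name) {
        if (!open_.empty()) {
            Rules::const_iterator it = rules_.find(open_.back());
            if (it == rules_.end() || it->second.count(name) == 0)
                errors_.push_back("<" + name + "> not allowed inside <" +
                                  open_.back() + ">");
        }
        open_.push_back(name);
    }

    void end_element(const std::string&) {
        if (!open_.empty()) open_.pop_back();
    }

    const std::vector<std::string>& errors() const { return errors_; }

private:
    Rules rules_;
    std::vector<std::string> open_;    // stack of currently open elements
    std::vector<std::string> errors_;  // violations found so far
};

// Example use: a made-up rule set for a "book" document.
std::vector<std::string> find_violations(const std::string& xml) {
    ChildRuleValidator::Rules rules;
    rules["book"].insert("title");
    rules["book"].insert("chapter");
    rules["chapter"].insert("para");
    ChildRuleValidator validator(rules);
    parse_tags(xml, validator);
    return validator.errors();
}
```

Calling find_violations on a document that nests <para> directly inside
<book> reports one violation, while a document that follows the rules
reports none. A DTD or Schema validator would be far more involved, but it
could plug in at the same point.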

However, I think it is important that the library does include add-on APIs for
as much of the supporting standards as possible, such as DOM-like processing,
XPath node selection, DTD and XMLSchema validation, and XSLT.

> NOTE: The library should be capable of evolving as the web standards evolve.

Of course.

> [3] It should have W3C DOM bindings for the XML document in a way that
> allows support for adding extensions to this, e.g. providing a MathML or SVG
> DOM on top of the XML DOM.

If the DOM API is built on top of the core API, this really comes naturally,
as you can write a new DOM API for specific purposes.
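
To make that concrete, here is a small sketch of a DOM-like tree built
purely from core events. The names are all hypothetical (EventHandler,
Node, TreeBuilder and build_sample_tree are invented for this example and
are not the W3C DOM interfaces or any library's API), and the events are
fired by hand rather than by a real parser. It assumes C++11 for
std::unique_ptr.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical core API: element events reported by the parser.
struct EventHandler {
    virtual ~EventHandler() {}
    virtual void start_element(const std::string& name) = 0;
    virtual void end_element(const std::string& name) = 0;
};

// A deliberately tiny DOM-like node: a name plus owned children.
struct Node {
    std::string name;
    std::vector<std::unique_ptr<Node> > children;
};

// Builds a Node tree from the event stream. A special-purpose DOM
// (for example one with SVG-specific node types) could be written
// the same way, as another handler over the same events.
class TreeBuilder : public EventHandler {
public:
    void start_element(const std::string& name) override {
        std::unique_ptr<Node> node(new Node);
        node->name = name;
        Node* raw = node.get();
        if (open_.empty())
            root_ = std::move(node);  // first element becomes the root
        else
            open_.back()->children.push_back(std::move(node));
        open_.push_back(raw);
    }

    void end_element(const std::string&) override {
        if (!open_.empty()) open_.pop_back();
    }

    // Hands ownership of the finished tree to the caller.
    std::unique_ptr<Node> take_root() {
        open_.clear();
        return std::move(root_);
    }

private:
    std::unique_ptr<Node> root_;
    std::vector<Node*> open_;  // non-owning stack of open elements
};

// Fires the events a parser would report for
// <svg><rect></rect><circle></circle></svg>.
std::unique_ptr<Node> build_sample_tree() {
    TreeBuilder builder;
    builder.start_element("svg");
    builder.start_element("rect");
    builder.end_element("rect");
    builder.start_element("circle");
    builder.end_element("circle");
    builder.end_element("svg");
    return builder.take_root();
}
```

The tree builder is just another client of the core API, so swapping it
for a different DOM flavour does not touch the parser itself.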

> [4] It should provide XPath bindings to the XML DOM in a clean way; I
> personally like the MS selectSingleNode/selectNodes extension to the XML
> node DOM interface.

There is no point in providing XPath support if it's painful to use.

> [5] It should have a clean access to attributes, without the user needing to
> call get/set methods.

I am not sure what you mean here.

> I am aware that something like this would be a huge undertaking, and that
> there are many tools and libraries that support these or some of these. This
> is just a wishlist for a library that I would be interested in using.

I am developing Axemill (http://www.sf.net/projects/axemill) to fulfil most of
these goals, with the eventual goal of submitting it to boost. If you want to
contribute code and/or ideas, please email me. Currently, it requires gcc 3.2
(though it should build with other relatively conforming compilers) and boost
1.29.0 (I intend to move to 1.30.0 shortly).

Anthony

-- 
Anthony Williams
Senior Software Engineer, Beran Instruments Ltd.
Remove NOSPAM when replying, for timely response.
