From: Reece Dunn (msclrhd_at_[hidden])
Date: 2003-05-30 10:59:23


Stefan Seefeld wrote:

>Vladimir Prus wrote:
>>>What is wrong with Xerces-C++ library
>>>(http://xml.apache.org/xerces-c/index.html) ?

>>Probably, the fact that its tarball is comparable in size to the entire
>>Boost?

>And, related, the performance of libxml2 is far better than that of any
>competitor. Of course, it remains to be shown that my C++ wrapper
>doesn't destroy that :-)

I am using my own wrappers around the Microsoft XML4 COM objects, but this
pulls in a lot of my own MS-specific code, so it would not be suitable for
Boost.

There are also many other implementations available that offer different
facilities and come with their own advantages and disadvantages. Different
people will want different things out of an XML library, and the nature of
the XML library may vary depending on the operating system the user has
installed.

>>Another thing is that it's not a big friend of C++ standard library.
>>For example, it does not use std::string, but its own XMLString class.

The Microsoft interfaces use wide-character strings that require specialist
APIs. My wrapper around these (com::bstring) has std::string and
std::wstring conversions, as well as I/O stream support.
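
To make the wrapper idea concrete, here is a minimal, portable sketch. It is
not com::bstring: the class name wide_text and all of its members are
hypothetical, it stores a std::wstring rather than a COM string, and it
assumes the narrow side is 7-bit ASCII only (anything else is rejected),
which a real wrapper would replace with proper encoding conversion.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a wrapper such as com::bstring. It holds a
// std::wstring instead of a COM string so that it compiles on any platform.
class wide_text {
public:
    wide_text() = default;
    explicit wide_text(const std::wstring& w) : data_(w) {}

    // Assumption: narrow input is 7-bit ASCII; other bytes are rejected.
    explicit wide_text(const std::string& s) {
        for (unsigned char c : s) {
            if (c > 0x7F) throw std::range_error("non-ASCII input");
            data_.push_back(static_cast<wchar_t>(c));
        }
    }

    // Wide view of the stored text.
    const std::wstring& wstr() const { return data_; }

    // Narrow copy of the stored text (same ASCII-only assumption).
    std::string str() const {
        std::string out;
        for (wchar_t c : data_) {
            if (static_cast<unsigned long>(c) > 0x7F)
                throw std::range_error("non-ASCII content");
            out.push_back(static_cast<char>(c));
        }
        return out;
    }

private:
    std::wstring data_;
};

// Stream support for both narrow and wide output streams.
inline std::ostream& operator<<(std::ostream& os, const wide_text& t) {
    return os << t.str();
}
inline std::wostream& operator<<(std::wostream& os, const wide_text& t) {
    return os << t.wstr();
}

// Small usage example.
inline void wide_text_example() {
    wide_text t(std::string("node"));
    std::ostringstream os;
    os << t;
    assert(os.str() == "node");
    assert(t.wstr() == L"node");
}
```

The constructors are explicit so that a conversion between narrow and wide
text never happens silently; that is a design choice of this sketch, not
something taken from the wrapper described above.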

>Indeed. As I said, 'the' DOM and SAX APIs are pretty 'Java-like'. There
>is no standardized C++ API for them. One can 'naturally' map the
>interfaces to C++, but that doesn't take C++ idioms and practices into
>account.

I have tried to make my library as intuitive as possible, but as I have
said, it is very MS-specific, with com::bstring creation from a VARIANT, the
use of properties to make get/put usage simple, generation of com::hresult
exceptions and operator[] on com::msxml::XMLDOMNode types as an alias for
selectSingleNode to extract a single node via an XPath expression.
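
The operator[] idea can be sketched portably as follows. This is only an
illustration under stated assumptions: Node, add, select_single_node and
make_sample are hypothetical names, not the com::msxml or MSXML API, and the
lookup handles nothing more than a slash-separated chain of child element
names, which is a tiny stand-in for real XPath. It needs C++14 for
std::make_unique.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, portable stand-in for a DOM node.
struct Node {
    std::string name;
    std::string text;
    std::vector<std::unique_ptr<Node>> children;

    // Append a child element and return a reference to it.
    Node& add(const std::string& child_name, const std::string& child_text = "") {
        children.push_back(std::make_unique<Node>());
        children.back()->name = child_name;
        children.back()->text = child_text;
        return *children.back();
    }

    // Resolve a slash-separated child path such as "book/title".
    // Returns the first match, or nullptr if any step has no match.
    const Node* select_single_node(const std::string& path) const {
        const Node* current = this;
        std::size_t start = 0;
        while (current && start <= path.size()) {
            std::size_t end = path.find('/', start);
            if (end == std::string::npos) end = path.size();
            const std::string step = path.substr(start, end - start);
            const Node* next = nullptr;
            for (const auto& child : current->children) {
                if (child->name == step) { next = child.get(); break; }
            }
            current = next;
            start = end + 1;
        }
        return current;
    }

    // operator[] is just an alias for select_single_node.
    const Node* operator[](const std::string& path) const {
        return select_single_node(path);
    }
};

// Build a small sample tree: <library><book><title>Example Title</title></book></library>
inline Node make_sample() {
    Node root;
    root.name = "library";
    Node& book = root.add("book");
    book.add("title", "Example Title");
    return root;
}

// Small usage example.
inline void node_example() {
    Node root = make_sample();
    const Node* title = root["book/title"];
    assert(title != nullptr && title->text == "Example Title");
    assert(root["book/author"] == nullptr);
}
```

Returning a pointer makes the "no such node" case explicit; a real binding
might prefer a node handle or an exception instead.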

An XML parser written from scratch for Boost should, IMHO, have these
features:

[1] It should make use of the Spirit and Regex libraries for XML and XPath
parsing.
[2] It should conform to the following W3C standards:
   (a) XML 1.0 SE while looking at supporting XML 1.1 in the future;
   (b) DOM 1.0/2.0/3.0;
   (c) XPath 1.0 with XPath 2.0 support in the future (switch between
       them??);
   (d) Unicode 3.x support (I know this is not a W3C standard, but it is
       related).
It would also be useful to have support for these standards:
   (e) XSLT 1.0 with XSLT 2.0 support in the future (select which to use?);
   (f) XMLSchema 1.0.
These standards are optional, but should be implementable using the base
standards:
   (g) XSL:FO 1.0;
   (h) MathML 1.0/2.0;
   (i) SVG 1.x/2.x - Scalable Vector Graphics.

NOTE: The library should be capable of evolving as the web standards evolve.

[3] It should have W3C DOM bindings for the XML document in a way that
allows support for adding extensions to this, e.g. providing a MathML or SVG
DOM on top of the XML DOM.

[4] It should provide XPath bindings to the XML DOM in a clean way; I
personally like the MS selectSingleNode/selectNodes extension to the XML
node DOM interface.

[5] It should provide clean access to attributes, without the user needing
to call get/set methods.
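
One portable way to approximate point [5] is a small proxy object, sketched
below. This swaps in a standard C++ proxy technique rather than
compiler-specific properties, and every name here (Element, AttributeRef,
attr, has_attr) is hypothetical. Reading a missing attribute yields an empty
string in this sketch; a real library would have to decide that policy.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical element type; the names are illustrative only.
class Element {
public:
    // Proxy returned by attr(): assignment writes, conversion reads.
    class AttributeRef {
    public:
        AttributeRef(Element& owner, std::string name)
            : owner_(owner), name_(std::move(name)) {}

        // Writing: elem.attr("id") = "x";
        AttributeRef& operator=(const std::string& value) {
            owner_.attributes_[name_] = value;
            return *this;
        }

        // Reading: std::string s = elem.attr("id");
        operator std::string() const {
            auto it = owner_.attributes_.find(name_);
            return it == owner_.attributes_.end() ? std::string() : it->second;
        }

    private:
        Element& owner_;
        std::string name_;
    };

    // Access an attribute by name without explicit get/set calls.
    AttributeRef attr(const std::string& name) { return AttributeRef(*this, name); }

    // True if the attribute has been set.
    bool has_attr(const std::string& name) const {
        return attributes_.count(name) != 0;
    }

private:
    std::map<std::string, std::string> attributes_;
};

// Small usage example.
inline void attribute_example() {
    Element e;
    e.attr("id") = "chapter-1";          // write through the proxy
    std::string id = e.attr("id");       // read through the proxy
    assert(id == "chapter-1");
    std::string lang = e.attr("lang");   // never set: empty in this sketch
    assert(lang.empty());
}
```

Reading through the proxy does not create the attribute, so has_attr stays
false for names that were only queried.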

I am aware that something like this would be a huge undertaking, and that
there are many tools and libraries that support some or all of these. This
is just a wishlist for a library that I would be interested in using.

Regards,
Reece


