From: Daniel Walker (daniel.j.walker_at_[hidden])
Date: 2006-04-23 16:29:46


On 4/23/06, Thorsten Ottosen <thorsten.ottosen_at_[hidden]> wrote:
> Daniel Walker wrote:
> > On 4/22/06, Joel de Guzman <joel_at_[hidden]> wrote:
> >
> >>Thorsten Ottosen wrote:
> >>
>
> >>>I have not yet understood why xml needs to be so sophisticated, and will
> >>>probably continue to ignore all those weird and advanced xml-features.
>
> >>I agree. I have the same observation. Most practical uses of XML
> >>are actually very simple. I too do not understand why XML needs
> >>to be so sophisticated.
> >
> >
> > All the significant projects I know of that use XML tend to use
> > namespaces and schemas, for example Mozilla, Gnome, OpenOffice.
> > Namespaces are useful in XML for the same reason they're useful in C++:
> > modularity, which is good if you're dealing with a project maintained
> > by more than one author with shared components that encode data as
> > XML.
>
>
> > Schemas give you data types and type checking, which obviously is
> > nice to have when you're dealing with data. I think XML schema
> > validation is one of the most important features of XML for the same
> > reason that I like C++ templates and type-safe compile-time
> > polymorphism: making sure your data types are correct beforehand
> > gives you one less thing to worry about.
>
> Why is that better than a run-time exception when loading the file?

Why is what better? Maybe I wasn't clear. When an XML file includes a
schema and fails validation when loaded, you do get a run-time
exception. I was trying to say that's a good thing. An XML validating
parser is similar to a compiler for a strongly typed language: it
catches type errors (in addition to syntax errors) immediately, before
you actually try to use the file.
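
To make the compiler analogy concrete, here is a minimal sketch of
validate-on-load. It is not property_tree's API and not a real XML
schema validator: FieldType, Config, Schema, conforms and validate are
all names invented for illustration, and the file is assumed to have
been parsed already into plain key/value strings.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical value types a schema can demand; not from any real library.
enum class FieldType { Integer, Boolean, Text };

using Config = std::map<std::string, std::string>;  // parsed key/value pairs
using Schema = std::map<std::string, FieldType>;    // required key -> expected type

// Returns true if `value` is a well-formed instance of `type`.
inline bool conforms(const std::string& value, FieldType type) {
    switch (type) {
    case FieldType::Integer: {
        if (value.empty()) return false;
        std::size_t i = (value[0] == '-') ? 1 : 0;  // allow a leading minus sign
        if (i == value.size()) return false;        // "-" alone is not a number
        for (; i < value.size(); ++i)
            if (value[i] < '0' || value[i] > '9') return false;
        return true;
    }
    case FieldType::Boolean:
        return value == "true" || value == "false";
    case FieldType::Text:
        return true;  // any string is acceptable text
    }
    return false;
}

// Checks every key the schema requires and throws on the first problem,
// so the caller never gets to use a config with a missing or mistyped value.
inline void validate(const Config& config, const Schema& schema) {
    for (const auto& entry : schema) {
        const auto found = config.find(entry.first);
        if (found == config.end())
            throw std::runtime_error("missing key: " + entry.first);
        if (!conforms(found->second, entry.second))
            throw std::runtime_error("type error in key: " + entry.first);
    }
}
```

Calling validate(config, schema) right after loading means a typo such
as port = "eighty" surfaces as a std::runtime_error at load time,
rather than somewhere deep in the program when the value is first
converted.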

> > For anyone interested in becoming convinced of the usefulness of these
> > XML features I would suggest the tutorials at
> > http://www.w3schools.com.
>
> There's like 17-18 tutorials on XML. I rest my case.
>
> > I don't think this has any repercussions for property_tree other than
> > to recognize that for the initial release it won't scale beyond
> > trivial application configurations.
>
> I think it is important that it never scales beyond simple
> things.
>
> > That may be fine to begin with,
> > but at some point Boost users may have higher expectations. We're
> > spoiled rotten by Boost.Regex and others.
>
> That's where a full xml-library comes in handy.

I agree. I'm just saying that when people looking for XML tools see
the words "XML," "parser" and "tree" in a description of a Boost
library, it may give a certain initial impression or raise
expectations. "Boost" has become synonymous with "high quality" for a
lot of C++ programmers. So, some of them may think this library
includes a "high quality XML parser" that generates "high quality
trees." Upon closer examination, potential users will find that
property_tree does nothing of the sort, but it does something else
well.

However, property_tree's users would certainly benefit from a real W3C
standard XML parser. Perhaps, their simple projects will grow into
more complicated projects over the years, and additional standard XML
features would help them manage the complexity. Of course, additional
features mean a steeper learning curve, so additional
tutorials may help them along. Even if the current XML (or is it
really SGML?) parser included with property_tree never matures into a
W3C standard XML parser, I wouldn't rule out the possibility that one
day Boost users may want property_tree or program_options or some
other Boost program configuration system to let them validate their
XML config files, which are after all human editable, susceptible to
human errors, and in need of validation. Pardon the rant.

Actually, is the parser just an SGML parser? Having just argued for
more features, maybe it's actually better to have fewer. If really all
you ever want for property_tree is a simple mark-up format for config
files, maybe all this confusion could be avoided by renaming XML to
SGML (without DTD support). XML is a subset of SGML in the sense that
SGML is less restrictive, so an SGML parser can accept XML files. SGML
(ISO 8879) may be a better mark-up standard for property_tree, or at
least more suited to property_tree's current intended use-cases and
feature set.

Daniel Walker

