From: Robert Ramey (ramey_at_[hidden])
Date: 2006-04-23 12:15:58


David Abrahams wrote:
> "Tom Brinkman" <reportbase_at_[hidden]> writes:
>
>> I suspect that those of you who want the parsers included with this
>> library are just trying to sneek an XML parser through the review
>> process without a full review. Please tell me that I'm wrong on
>> this point.
>
> I think it's best if we all operate as though our fellow Boosters are
> acting honorably and with the best intentions. I don't see how a
> suggestion that someone is trying to "sneek" a component through
> without full review can possibly be productive.

FWIW - When making a library beyond a certain size, it's almost impossible
to avoid "sneeking" (sic) some other new components along besides.
Once something is needed, and you don't find it, you make your own. But
given that you're already set up with testing and documentation, and you
want to make this new thing orthogonal to the central issue you're
interested in, you end up with a new additional library. This happened to
me and I ended up with extended_type_info, dataflow iterators, STRONG_TYPE
and maybe others. These never got formally reviewed separately; maybe
they should have been or should be. But I do know that at least on occasion
some of them have been used separately. So I could be considered guilty
of "sneeking" (sic) unreviewed components into boost. I will in fact confess
that I am sort of guilty of doing this intentionally. I packaged them as
separate "libraries" but included them as part of the serialization library.
There is no way I could find time to handle 3 or 4 more separate reviews.
I was surprised that during the review no one really examined or criticised
these components on their own. I don't have a real point here; I just
wanted to add a little perspective on what really happens in cases like
this.

XML parser. - A ripe subject. I was never enthralled with XML,
particularly for serialization. I felt I had to do this in order to get the
serialization library accepted. And it did provide a torture test to
double check that the archive concept was sufficiently defined to
support even the most baroque formats. I believe the "requirement"
that XML be supported was due to the misperception that an XML
serialized data structure would be editable in some general way.

Anyway, when I looked at XML, it was worse than I thought. It turns
out that there are many, many ways that data could be written which
would be compatible with XML. (I don't even think that name tags
are strictly required). So I selected the simplest version that I thought
would be acceptable and implemented it using the Spirit library. The
Spirit library included two XML grammars - I believe I used the simplest
one. I'm pleased to say that, after some initial learning curve issues and
after adding things for certain types of escapes (not quite done here),
this has worked out very well. The same grammar compiles with
the most recent version of Spirit as well as the older 1.6 version which
is compatible with older compilers. Honestly, the advantages of
maintainability, robustness, testing, quality, completeness, etc. are such
that I don't believe that any XML parser written "by hand" is going
to be anything but inferior. Using Spirit and Dan Nuffer's XML grammar
is the only way to go. If this grammar needs to be refined/improved or
made complete, so be it. At least we would have a continually improving
XML parsing solution that starts out high quality and improves from
there. This would be much better than having a number of different
XML parsers floating around.

Given the availability of Spirit, Dan Nuffer's XML grammar, and
probably some sort of tree structure - adobe, multi-index, or
roll your own from STL - I'm very surprised no one has submitted
a DOM and/or SAX XML parser. It seems to me that this would be
a straightforward composition of these three high quality components.

It is surprising to me that given all the strong views expressed on the
importance and utility of XML, and the fact that there are lots of ways
to do it, and the fact that the serialization library version doesn't support
editing and doesn't permit the input to be re-ordered, no one
that I know of has submitted a variation/improvement on the xml_archive
included. I suspect that the limitations of XML for serialization become
more apparent when one has more experience with the library and
that the "extra" features aren't worth the effort required to make
another archive.
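
To illustrate the re-ordering limitation mentioned above, here is a minimal
sketch, assuming a toy format of flat <name>value</name> elements. The names
SequentialXmlReader, Point, load_point and loads are hypothetical and made up
for this example; none of them is part of the serialization library, whose
real XML input archive is built on a Spirit grammar. The only point is that a
reader which consumes elements in the order the program asks for them will
reject input whose elements have been shuffled by hand.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical toy reader, NOT the Boost.Serialization implementation.
// It consumes flat <name>value</name> elements strictly in the order
// the caller requests them, the way a sequential archive would.
class SequentialXmlReader {
public:
    explicit SequentialXmlReader(std::string text) : text_(std::move(text)) {}

    // Read the next element; fail unless its tag is exactly `name`.
    std::string read(const std::string& name) {
        const std::string open = "<" + name + ">";
        const std::string close = "</" + name + ">";
        skip_whitespace();
        if (text_.compare(pos_, open.size(), open) != 0)
            throw std::runtime_error("expected " + open);
        const std::size_t begin = pos_ + open.size();
        const std::size_t end = text_.find(close, begin);
        if (end == std::string::npos)
            throw std::runtime_error("missing " + close);
        pos_ = end + close.size();  // advance past the closing tag
        return text_.substr(begin, end - begin);
    }

private:
    void skip_whitespace() {
        while (pos_ < text_.size() &&
               std::isspace(static_cast<unsigned char>(text_[pos_])))
            ++pos_;
    }

    std::string text_;
    std::size_t pos_ = 0;
};

// Hypothetical example type with two members, loaded in a fixed order.
struct Point {
    int x;
    int y;
};

Point load_point(const std::string& xml) {
    SequentialXmlReader in(xml);
    Point p;
    p.x = std::stoi(in.read("x"));  // "x" must come first in the input
    p.y = std::stoi(in.read("y"));  // "y" must come second
    return p;
}

// Returns true if the input loads, false if the reader rejects it.
bool loads(const std::string& xml) {
    try {
        load_point(xml);
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```

With this sketch, loads("<x>1</x><y>2</y>") succeeds, while the re-ordered
input "<y>2</y><x>1</x>" is rejected because the reader looks for <x> first.
Accepting either order would mean buffering the whole document and looking
elements up by name, which is closer to what a DOM-style parser does.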

I'm also curious to know if anyone has had good success browsing
the XML archive with any sort of tools. I think I tried the one
that came with my VC 7.1 system but I wasn't left with a positive
impression - I forget why now.

Robert Ramey

