From: Peter Simons (simons_at_[hidden])
Date: 2002-10-20 12:55:41


The Spirit Parser Framework should be ACCEPTED into Boost for the
following reasons:

1. Ease Of Use

Writing parsers with Spirit is rather simple. Any C++ programmer who
can read (E)BNF notation should be able to implement a parser for
a given grammar within a couple of hours, sometimes even minutes.

The way Spirit uses operator overloading to emulate EBNF syntax in C++
is amazing. As a result, Spirit-based parsers are very expressive,
which makes them easy to understand and easy to maintain.
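
To make the idea concrete, here is a minimal, self-contained sketch of
how overloaded operators can mimic EBNF. It is not Spirit's
implementation: the names Parser, ch, digit, parse_all, and
is_integer_list are made up for illustration, and std::function stands
in for the expression templates that Spirit uses.

```cpp
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>

// A parser takes the input and a position. On success it advances the
// position and returns true; on failure it leaves the position untouched.
struct Parser {
    std::function<bool(const std::string&, std::size_t&)> run;
};

// Terminal: match one literal character.
Parser ch(char c) {
    return Parser{[c](const std::string& s, std::size_t& pos) {
        if (pos < s.size() && s[pos] == c) { ++pos; return true; }
        return false;
    }};
}

// Terminal: match one decimal digit.
Parser digit() {
    return Parser{[](const std::string& s, std::size_t& pos) {
        if (pos < s.size() &&
            std::isdigit(static_cast<unsigned char>(s[pos]))) {
            ++pos;
            return true;
        }
        return false;
    }};
}

// Sequence: "a >> b" plays the role of EBNF "a b".
Parser operator>>(Parser a, Parser b) {
    return Parser{[a, b](const std::string& s, std::size_t& pos) {
        const std::size_t save = pos;
        if (a.run(s, pos) && b.run(s, pos)) return true;
        pos = save;  // backtrack if either half failed
        return false;
    }};
}

// Alternative: "a | b" plays the role of EBNF "a | b".
Parser operator|(Parser a, Parser b) {
    return Parser{[a, b](const std::string& s, std::size_t& pos) {
        return a.run(s, pos) || b.run(s, pos);
    }};
}

// Repetition: "*a" plays the role of EBNF "{ a }" (zero or more).
Parser operator*(Parser a) {
    return Parser{[a](const std::string& s, std::size_t& pos) {
        std::size_t before;
        do { before = pos; } while (a.run(s, pos) && pos > before);
        return true;
    }};
}

// Positive closure: "+a" means one or more.
Parser operator+(Parser a) { return a >> *a; }

// Succeeds only if the parser consumes the whole input.
bool parse_all(const Parser& p, const std::string& input) {
    std::size_t pos = 0;
    return p.run(input, pos) && pos == input.size();
}

// EBNF:  list ::= integer { "," integer }
bool is_integer_list(const std::string& input) {
    Parser integer = +digit();
    Parser list = integer >> *(ch(',') >> integer);
    return parse_all(list, input);
}
```

The grammar rule in is_integer_list reads almost like the EBNF in the
comment above it, which is the property praised here.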

Admittedly, the documentation for Spirit could (and should) be more
extensive -- I remember well that I had some interesting experiences
when I started using it. But the scope of material that would have to
be covered in the documentation in order to call it "complete" is
immense. Without a sound understanding of templates, expression
templates, iterators, and -- obviously -- parser theory, you cannot
expect to use the library to its full potential. But you cannot expect
a free-software author to cover all these topics in the user's manual
either.

2. Portability

Even though Spirit is built on very sophisticated mechanisms of the
ISO C++ language, it is highly portable. I have been able to compile
Spirit-based parsers without significant problems on any of the
following operating systems:

    Linux
    FreeBSD
    OpenBSD
    MacOS X
    Windows

Furthermore, I was able to compile Spirit with any of the following
compilers:

    GNU g++ 2.95, 3.0, 3.1
    Intel C++ 6.0
    Comeau C++ 4.3.0.1

3. Reliability

After using Spirit-based parsers in real-life production code for more
than six months, I can say with some certainty that the code is
stable. I have not experienced any crashes, incorrect results, or
resource leaks.

Also, the user API has been remarkably stable through the
_significant_ development that has taken place in the past months.
Usually all I had to do to upgrade Spirit in my applications was to
execute

    cvs update && make

and that was it -- even though Joel and the gang re-wrote virtually
all the internals. The usual "let's rename all lower-case function
names to mixed-case and vice versa" routine simply didn't occur, which
strikes me as an indication of a mature interface.

4. Performance

Spirit parsers are based on expression templates, which allows for some
serious code optimization by the compiler. The drawback is that
compile times may be testing your patience, but the idle time during
compilation is invested well, because the resulting parsers are
_fast_. It might theoretically be possible to write faster parsers by
hand, but I don't see any real-life applications where Spirit's
performance would be insufficient.

5. Flexibility

5.1 Customizing The Framework

Spirit consists mostly of class templates, which make generous use of
policy classes. Thus, you can implement extensive functionality by
customizing the deployed scanner policy, or by providing appropriate
implementations of the "ForwardIterator"s, around which Spirit's
scanners are built.
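
As an illustration of the iterator route, the following sketch shows an
iterator that decodes quoted-printable text on the fly, so that code
reading through it never sees the encoding. It is independent of
Spirit; QpDecodingIterator and decode_qp are hypothetical names, and
the decoder is deliberately lenient (a malformed "=" is passed through
unchanged). Because it returns characters by value, it is tagged as an
input iterator, although copies of it can be traversed independently.

```cpp
#include <cstddef>
#include <iterator>
#include <string>

class QpDecodingIterator {
public:
    using iterator_category = std::input_iterator_tag;
    using value_type = char;
    using difference_type = std::ptrdiff_t;
    using pointer = const char*;
    using reference = char;  // decoded values are computed on demand

    QpDecodingIterator(const char* pos, const char* end)
        : pos_(pos), end_(end) { skip_soft_breaks(); }

    // "=XX" yields the byte with hex value XX; anything else is literal.
    char operator*() const {
        if (is_escape())
            return static_cast<char>(hex(pos_[1]) * 16 + hex(pos_[2]));
        return *pos_;
    }

    QpDecodingIterator& operator++() {
        pos_ += is_escape() ? 3 : 1;
        skip_soft_breaks();
        return *this;
    }

    QpDecodingIterator operator++(int) {
        QpDecodingIterator old = *this;
        ++*this;
        return old;
    }

    friend bool operator==(const QpDecodingIterator& a,
                           const QpDecodingIterator& b) {
        return a.pos_ == b.pos_;
    }
    friend bool operator!=(const QpDecodingIterator& a,
                           const QpDecodingIterator& b) {
        return !(a == b);
    }

private:
    // Value of a hex digit, or -1 if c is not one.
    static int hex(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        return -1;
    }

    bool is_escape() const {
        return end_ - pos_ >= 3 && pos_[0] == '=' &&
               hex(pos_[1]) >= 0 && hex(pos_[2]) >= 0;
    }

    // "=\r\n" and "=\n" are soft line breaks and decode to nothing.
    void skip_soft_breaks() {
        for (;;) {
            if (end_ - pos_ >= 3 && pos_[0] == '=' &&
                pos_[1] == '\r' && pos_[2] == '\n') {
                pos_ += 3;
            } else if (end_ - pos_ >= 2 && pos_[0] == '=' &&
                       pos_[1] == '\n') {
                pos_ += 2;
            } else {
                break;
            }
        }
    }

    const char* pos_;
    const char* end_;
};

// Hypothetical helper: decode a whole string through the iterator.
std::string decode_qp(const std::string& encoded) {
    const char* b = encoded.data();
    const char* e = b + encoded.size();
    return std::string(QpDecodingIterator(b, e), QpDecodingIterator(e, e));
}
```

Any algorithm that accepts an iterator pair, a parser included, can
consume such a range without knowing that decoding takes place.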

This proved to be very valuable when I had to write a MIME parser, for
instance: Just by writing appropriate iterators, I had the whole
business of transport encoding (base64, quoted-printable, etc.) solved
for all remaining code. Similarly, just by plugging an appropriate
"skipper" function into the parser, I could tell the RFC822 parser
that the e-mail address

    (Ralph)peter(peti)
        . simons
     (Germany) @ acm.
     org

actually just means "peter.simons_at_[hidden]".
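
What such a skipper has to do can be sketched in a few lines of plain
C++. This is not Spirit's skipper interface: skip and strip_skipped are
hypothetical names, the address in the usage note is made up, and a
real parser would invoke the skipper between tokens and treat quoted
strings specially instead of filtering the whole input in one go.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Advance pos past whitespace (including line folds) and parenthesized
// comments. Comments may nest, and a backslash quotes the character
// that follows it. An unterminated comment runs to the end of input.
void skip(const std::string& s, std::size_t& pos) {
    for (;;) {
        while (pos < s.size() &&
               std::isspace(static_cast<unsigned char>(s[pos]))) {
            ++pos;
        }
        if (pos >= s.size() || s[pos] != '(') return;
        int depth = 0;
        do {
            const char c = s[pos++];
            if (c == '\\' && pos < s.size()) ++pos;  // quoted character
            else if (c == '(') ++depth;
            else if (c == ')') --depth;
        } while (depth > 0 && pos < s.size());
    }
}

// Hypothetical helper: keep every character the skipper does not drop.
std::string strip_skipped(const std::string& s) {
    std::string out;
    std::size_t pos = 0;
    for (;;) {
        skip(s, pos);
        if (pos >= s.size()) break;
        out += s[pos++];
    }
    return out;
}
```

For example, strip_skipped applied to the made-up input
"(Jane)jane(jd) . doe (Earth) @ example. org" leaves only the
characters of the address itself, with comments and whitespace gone.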

5.2 Legacy Code

Existing parsers can be elevated into the Spirit framework by wrapping
them into a so-called "functor parser". Doing this requires
hardly any knowledge of Spirit's internals. At least I was able to do
it just by looking at the provided example code ... So it can't be too
hard.

This feature is also useful in case you want to integrate hand-written
high-performance parsers into a complex grammar.
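
The wrapping idea can be sketched as follows. Everything here is an
assumption made for illustration: legacy_scan_hex stands for some
existing hand-written scanner, and FunctorParser is a simplified
stand-in adapter, not the class that Spirit provides. The sketch also
ignores overflow of the parsed value.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// "Legacy" code: a hand-written scanner that knows nothing about any
// framework. Returns the number of characters consumed, or -1.
std::ptrdiff_t legacy_scan_hex(const char* first, const char* last,
                               unsigned long& value) {
    const char* p = first;
    value = 0;
    for (; p != last && std::isxdigit(static_cast<unsigned char>(*p)); ++p) {
        const unsigned char c = static_cast<unsigned char>(*p);
        const int d = std::isdigit(c) ? c - '0' : std::tolower(c) - 'a' + 10;
        value = value * 16 + static_cast<unsigned long>(d);
    }
    return p == first ? -1 : p - first;
}

// Adapter: gives any callable with the legacy signature the interface
// that the rest of a (hypothetical) parser expects, namely a parse()
// member that advances the input position on success.
template <typename Functor>
class FunctorParser {
public:
    explicit FunctorParser(Functor f) : f_(f) {}

    bool parse(const char*& first, const char* last,
               unsigned long& attr) const {
        const std::ptrdiff_t len = f_(first, last, attr);
        if (len < 0) return false;
        first += len;
        return true;
    }

private:
    Functor f_;
};

// Usage: recognize "<hex digits>;" by combining the wrapped legacy
// scanner with a check for the terminator. Returns -1 on failure.
long parse_hex_statement(const std::string& input) {
    typedef std::ptrdiff_t (*LegacyFn)(const char*, const char*,
                                       unsigned long&);
    const FunctorParser<LegacyFn> hex(&legacy_scan_hex);
    const char* first = input.data();
    const char* last = first + input.size();
    unsigned long value = 0;
    const bool ok = hex.parse(first, last, value) &&
                    last - first == 1 && *first == ';';
    return ok ? static_cast<long>(value) : -1;
}
```

The legacy function is left untouched; only the thin adapter has to
know about both worlds.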

5.3 Dynamic Parsers

In my opinion, the most innovative property of Spirit is that you can
write dynamic parsers very easily. The possibilities that follow from
the ability to modify or to parametrize grammars at run-time are
mind-boggling, and I freely admit that I have not understood all
implications of this feature yet.

So far, I have used parametric parsers only once, when parsing a MIME
mail: A MIME mail may contain several independent body parts, which
are separated by a delimiter chosen by the mail's composer. With
Spirit, I was able to extract that delimiter from the "Content-Type"
header and re-use the string I found in other parts of the parser in
the very same pass -- no problem.
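
Stripped of any parser framework, the principle looks roughly like the
sketch below: one step extracts a value from the input, and a later
step matches against that value. The function names extract_boundary
and split_parts are hypothetical, and the code simplifies MIME
considerably (the parameter name is matched case-sensitively, and the
delimiter is not required to start at the beginning of a line).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Pull the boundary parameter out of a Content-Type header value.
// Handles boundary=token and boundary="quoted"; returns "" if absent.
std::string extract_boundary(const std::string& content_type) {
    const std::string key = "boundary=";
    std::size_t pos = content_type.find(key);
    if (pos == std::string::npos) return "";
    pos += key.size();
    if (pos < content_type.size() && content_type[pos] == '"') {
        const std::size_t end = content_type.find('"', pos + 1);
        if (end == std::string::npos) return "";
        return content_type.substr(pos + 1, end - pos - 1);
    }
    const std::size_t end = content_type.find_first_of("; \t\r\n", pos);
    return content_type.substr(
        pos, end == std::string::npos ? std::string::npos : end - pos);
}

// Split a multipart body with a delimiter that is known only at run time.
std::vector<std::string> split_parts(const std::string& body,
                                     const std::string& boundary) {
    std::vector<std::string> parts;
    if (boundary.empty()) return parts;
    const std::string delim = "--" + boundary;
    std::size_t pos = body.find(delim);
    while (pos != std::string::npos) {
        pos += delim.size();
        if (body.compare(pos, 2, "--") == 0) break;  // closing delimiter
        std::size_t start = body.find('\n', pos);    // end of delimiter line
        if (start == std::string::npos) break;
        ++start;
        const std::size_t next = body.find(delim, start);
        if (next == std::string::npos) break;
        // Drop the line break that precedes the next delimiter.
        std::size_t end = next;
        if (end > start && body[end - 1] == '\n') --end;
        if (end > start && body[end - 1] == '\r') --end;
        parts.push_back(body.substr(start, end - start));
        pos = next;
    }
    return parts;
}
```

Calling split_parts(body, extract_boundary(header)) ties the two steps
together: what the second step looks for depends on what the first
step found in the same message.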

Apparently, this is a rather basic use of dynamic parsers: Every time
I look at this, it's yelling "SGML" at me. Implementing a complete
SGML parser with Spirit is probably _fun_.

Uh, well, probably not. But for other reasons.

6. Supportive Development Team

I would have been unable to write the things I have written with
Spirit were it not for the extremely friendly and helpful development
crew. In the last few months, I have repeatedly bugged them with
questions, feature requests, and weird compiler errors, and I have
always gotten a helpful reply in a matter of hours.

Most commercial support contracts suck in comparison, and I would like
to use this opportunity to say THANK YOU to Joel, Hartmut, Dan, Juan
Carlos, and everybody else who helped me out!

7. Aesthetic Implementation

This may sound like a minor issue, but I think that Spirit's source
code is very well written, formatted, and structured. Particularly
during the early days, when the documentation was "less accurate" than
it is today, I had to look through the implementation quite often,
and found it to be comparatively easy to read and to understand.

8. Miscellaneous

Other reviewers have commented that Spirit is intertwined with
the Phoenix library that comes with it. It is true that Phoenix is
very useful when working with Spirit, and thus it is used throughout
most of the examples and test programs. But Spirit is completely
independent of Phoenix, and most of the things one does with Phoenix can
be accomplished with Boost.Lambda, Boost.Bind, or any other framework
as well. All Phoenix-related parts reside in their own namespace and
in their own directory, so Phoenix and Spirit are really not
intertwined at all.

All in all, if there is any reason why Spirit should NOT be accepted
into Boost, then this is it:

    I turn green with envy when _other_ people come up with
    stuff like this.

But I guess that doesn't count. :-)

        -peter

