Subject: [boost] What about a Spirit-powered C++ syntax analysis library in Boost?
From: Florian Goujeon (florian.goujeon_at_[hidden])
Date: 2010-09-08 11:07:24


Hi Boosters,

I've written a C++ syntax analysis library using Boost.Spirit.
(This 'library' is actually a subset of the Scalpel library, which I
presented on the Boost mailing list here:
http://article.gmane.org/gmane.comp.lib.boost.devel/208217 )
For the sake of brevity, let's call it Salsa (for Stand-ALone Syntax
Analysis).

While most C++ compilers need semantic information to perform the
syntax analysis, Salsa is a standalone syntax analyzer. Its Spirit
grammar doesn't run any semantic action.
Consequently, you can use it to parse some C++ code without having to
analyze a whole translation unit (i.e. without processing #include
directives).

At this point, you may wonder how syntax ambiguities are handled.
In most cases, one interpretation is more obvious than the other(s).
In all cases, you can reasonably ask the programmer to disambiguate
their code.
Either way, Salsa chooses one of the interpretations, and does so
predictably.
Here are some examples:

The following statement:
     a * b;
may be either a multiplication or a pointer declaration.
The default interpretation is the pointer declaration. You can
reasonably ask the programmer to disambiguate the code by adding
parentheses if they want the syntax analyzer to interpret it as a
multiplication:
     (a * b);
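
To make the ambiguity concrete, here is a minimal sketch in plain C++
(it does not use Salsa or Spirit). It shows the same token sequence
compiling with two different meanings, depending only on what 'a'
names. The namespaces, the 'number' type and the helper functions are
hypothetical names introduced for this illustration.

```cpp
#include <cassert>

// Context 1: 'a' names a type, so `a * b;` declares a pointer named 'b'.
namespace pointer_context {
    struct a {};

    bool b_is_a_pointer() {
        a * b;         // declaration: 'b' has type 'a*' (uninitialized)
        b = nullptr;   // only valid because 'b' is a pointer
        return b == nullptr;
    }
}

// Context 2: 'a' and 'b' are objects, so `a * b;` is a multiplication.
namespace multiplication_context {
    int multiplication_count = 0;

    struct number { int value; };

    number operator*(number lhs, number rhs) {
        ++multiplication_count;   // side effect that makes the call observable
        return number{lhs.value * rhs.value};
    }

    int multiplications_performed() {
        const int before = multiplication_count;
        number a{2};
        number b{3};
        a * b;     // expression statement: calls operator*
        (a * b);   // the parenthesized form is an expression as well
        return multiplication_count - before;
    }
}
```

A compiler tells the two apart by looking up 'a'. A standalone syntax
analyzer has no such information, hence the need for a default
interpretation.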

A trickier case. In the following declaration:
        bool bool_ = a< b || c> (d&& e);
the right-hand side expression may be either a boolean expression
(where 'a', 'b', 'c', 'd' and 'e' are variables of type bool) or a
function template call (whose name is 'a', which takes one bool
template parameter and where 'b', 'c', 'd' and 'e' are all variables
of type const bool).
The default interpretation is the boolean expression. Once again, you
can reasonably ask the programmer to disambiguate the code by adding
parentheses if they want Salsa to interpret it as the latter:
        bool bool_ = a< (b || c)> (d&& e);
(Actually, I wonder why the standard allows such ambiguities.)
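
As a sketch of both readings, the plain C++ below (again without Salsa
or Spirit) compiles the declaration unchanged in two contexts. The
namespaces, the 'evaluate' functions, the chosen values and the body of
the template 'a' are assumptions made for this illustration. The values
of 'b', 'c', 'd' and 'e' are the same in both contexts, so any
difference in the result comes from the parse alone.

```cpp
#include <cassert>

// Context 1: 'a' is a bool variable, so '<' and '>' are comparisons.
namespace boolean_context {
    bool evaluate() {
        bool a = false, b = false, c = true, d = false, e = true;
        // Parsed as (a < b) || (c > (d && e)).
        bool bool_ = a< b || c> (d&& e);
        return bool_;
    }
}

// Context 2: 'a' is a function template, so '<' opens a template
// argument list and the first non-nested '>' closes it.
namespace template_context {
    template <bool Flag>
    bool a(bool argument) {
        return Flag && argument;
    }

    bool evaluate() {
        // 'b' and 'c' must be usable in constant expressions here.
        const bool b = false, c = true, d = false, e = true;
        // Parsed as a<(b || c)>(d && e).
        bool bool_ = a< b || c> (d&& e);
        return bool_;
    }
}
```

With these values, the boolean reading is (false < false) || (true >
false), which is true, while the template reading is a<true>(false),
which is false.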

Note: Salsa isn't finalized yet, but it successfully parses Apache's
implementation of the C++ standard library.

I'd like to know: is there a reasonable chance that such a library
will be accepted into Boost?

