|
Boost : |
From: Bruce.Florman_at_[hidden]
Date: 2001-06-01 11:10:24
I guess it's about time I started replying in public.
--- In boost_at_y..., "joel de guzman" <isis-tech_at_m...> wrote:
> Syntax:
>
> I've been tinkering around Spirit's semantic
> action syntax. I've come up with this:
>
> a[action]
>
> The advantage is that we can add facilities into the
> semantic action syntax later (e.g. add attributes or other
> lazy functional tricks or a complete expression template
> semantic action framework) and still maintain a clear
> separation of concerns (semantics related elements and
> possibly expressions are clearly separated inside the braces.
>
> (I'm glad we did not commit the [] brackets for iteration :-)
Um ... okay. The index operator needs to be a member function
though, right? So this is going to be a member of the Parser
class template, or something else? If it's Parser, then how will
this interact with the directive classes, which are themselves
derived from Parser?
Just to bring the rest of the group up to speed: I'd told Joel
via email that I didn't particularly like the syntax he'd used
for specifying semantic actions -- i.e. action(pattern) -- and
I'd proposed (and implemented) an alternative syntax that put the
action to the right of the grammar fragment whose recognition
triggers it. At the time there were discussions here regarding
alternatives to using the right shift operator for sequencing,
and to get the "correct" precedence, I'd proposed using % for
sequencing and >> for semantic actions. Using this syntax, the
simple grammars in the test programs that are currently packaged
with Spirit could be expressed like this:
Rule expr; // forward
Rule integer = Lexeme[ !(plus | minus) % +digit ] >> doInt;
Rule group = lparen % expr % rparen;
Rule expr1 = integer | group;
Rule expr2 = expr1 % *( times % expr1 >> doMult
| divide % expr1 >> doDiv
);
expr = expr2 % *( plus % expr2 >> doAdd
| minus % expr2 >> doSubt
);
Further, I'd proposed to him that the Filter<...> class (i.e. the
parser type produced by the shift operator here) not hold a reference
to the action, but rather aggregate the action directly. This would
allow the action type's constructor to be invoked directly in the
grammar so that a single action class could be reused for multiple
distinct but related actions. For example if we had a Spirit::Action-
derived class named PushOp whose ctor took an operator and a reference
to a stack, then the expr2 rule above might be written like so:
Rule expr2 = expr1 % *( times % expr1 >> PushOp(OP_MUL, stack)
| divide % expr1 >> PushOp(OP_DIV, stack)
);
However, Joel pointed out that there are times when you really want
the action to be held by reference so that the same object can be
referenced from more than one point in the grammar and can thus serve
as a channel for information between the points in the parse where
those two grammar fragments have been recognized (I'm paraphrasing
here, so he might correct me if I'm misrepresenting his position).
I'd then suggested that maybe we need to have two different kinds of
actions to support both uses, and now that I've looked at and thought
about the situation a little longer, I'd like to propose something
further along those lines. I notice that the protocol Joel has
established between Filter objects (BTW, I don't much care for the
name "Filter" here, but I don't have any counterproposals at the
moment) and Action objects -- i.e. the Filter calls a member function
named Process -- is not used anywhere else in the framework. I'd like
to suggest the following changes:
1. The Filter should use operator() rather than a member function
named Process to invoke the semantic action, and it should NOT
require the action to be an object derived from the Spirit::Action
class. It just needs to be something to which the function call
operator may be applied (with the correct arguments, of course).
2. The syntax for embedding semantic actions, whatever ends up
being used, should be overloaded (via partial or explicit
specialization) so that the Filter holds Spirit::Action-derived
objects by reference, and everything else by value.
This way a semantic action can be a reference to a non-const Action
object, just as it is now; it can be a const functor of some
completely unrelated type; or it can be nothing more complicated
than a function pointer.
> Back tracking:
>
> As was noted by Bruce Florman and originally by Larry
> Evans. There's a problem with semantic actions and back-
> tracking. This happens when semantic actions are invoked
> before backtracking occurs. Additional code must be added
> to 'uncommit' a triggered semantic action of a failed parse.
>
> The solution, it would seem, is to invoke all actions in an
> alternative (or any expression that backtracks) after parsing;
> such that the successful branch is already known. I am currently
> in search of a viable and seamless implementation along these lines.
As I'd noted privately to Joel, the Match objects (which are returned
by all the Parse function) are already used to carry some information
up from the lower-level constructs to the higher-level ones -- i.e.
the length of the matched input. I suggested that the Match objects
might serve as a container for parse state information. The semantic
actions would not alter global state, but would instead alter the
state of the match object. When the parse of a higher-level
construct fails, the match object representing the output from the
lower-level semantic actions would simply be destroyed -- effectively
uncommiting those actions.
When parsing the input, the state would need to be "cloned" prior to
attempting to match any alternatives (or other constructs to which we
might backtrack), suggesting that the parse state needs a reasonably
lightweight representation. The fact that Spirit supports A&B as well
as A|B implies that a mechanism for merging states would be needed
too.
I haven't really thought through all the ways that "state" might be
represented. One way to represent the parse state is to build an AST
obviously, but there are just as obviously other ways to represent it,
and I don't think we want Spirit to explicitly impose any particular
one. However, providing the primitives for building an AST in a form
that is easily used within the Spirit framework, but is not required
to be used, seems like it might be a good approach.
I also haven't thought through exactly how to go about specifying the
representation type for the Match class, or whether the Match class is
really the best place to hang the state information at all. Maybe it
would be better to have a separate state object that gets passed
between all the Parse functions and semantic actions.
> I would appreciate comments and feedback very much.
Yeah, you and me both. :-)
--Bruce Florman
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk