From: Larry Evans (jcampbell3_at_[hidden])
Date: 2001-05-24 10:50:40


Douglas Gregor wrote:

> On Thursday 24 May 2001 08:32, you wrote:
> > Vladimir Prus wrote:

...skipping

>
> > I'd be interested, especially in one that allowed inheritance of grammars,
> > as in ANTLR, and allowed the lex part of the parser to be another parser.
> > I've currently got some Smalltalk around somewhere that did analysis of
> > grammars to remove left recursion and some other operations on grammars.
> > So, yes, I'm interested.
>
> One interface I had considered for automata was an "Acceptor" interface. The
> Acceptor concept refines the Output Iterator concept. You output symbols
> until you are finished, and then output an end-of-stream symbol. Then there
> is an "accepted" function that specifies whether or not the input string was
> in the language.
>
> Sample usage:
>
> DFA::acceptor a = myDFA.start();
> copy(istream_iterator<char>(cin), istream_iterator<char>(), a); // symbol type char assumed
>
> if (accepted(a)) { cout << "accepted" << endl; }
>
> Automata that transform a string of tokens into another string of tokens
> (i.e., a lexer) could be created as an output iterator adaptor:
>
> HighParser::acceptor h = myHighParser.start();
> TokenParser::acceptor t = tokenParser.start();
> Lexer::acceptor l = lexer.start();
>
> copy(istream_iterator<char>(cin), istream_iterator<char>(), l(t(h))); // symbol type char assumed
> finalize(l); // Send the end-of-stream token through
>
> if (h) { cout << "accepted" << endl; }
>
> The idea is that any of these transforming acceptors are a function object
> that takes another acceptor and returns an acceptor that emits tokens to the
> acceptor it contains. So whenever "l" matches a token it outputs that token
> to "t" and, likewise, when "t" matches a token it outputs it to "h".
>
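
The Acceptor interface described in the quoted text can be sketched in C++ roughly as follows. This is only an illustration under stated assumptions: `SlcDfa`, its states, and `matches_slc` are hypothetical names, and the automaton is hard-wired to the `slc[0-9]+` token pattern from the attached grammar rather than built from a general DFA description.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical DFA recognizing the token pattern slc[0-9]+ (TOK_SLC).
class SlcDfa {
public:
    class acceptor {
        enum State { Start, S, L, C, Digits, Dead };

    public:
        // Member types that mark this class as an output iterator.
        using iterator_category = std::output_iterator_tag;
        using value_type = void;
        using difference_type = std::ptrdiff_t;
        using pointer = void;
        using reference = void;

        // "Outputting" a symbol advances the automaton by one transition.
        acceptor& operator=(char c) { state_ = step(state_, c); return *this; }
        acceptor& operator*() { return *this; }
        acceptor& operator++() { return *this; }
        // Returns the same object so that *a++ = c does not lose state.
        acceptor& operator++(int) { return *this; }

        // True when the symbols seen so far form a string in the language.
        friend bool accepted(const acceptor& a) { return a.state_ == Digits; }

    private:
        static State step(State s, char c) {
            switch (s) {
            case Start:  return c == 's' ? S : Dead;
            case S:      return c == 'l' ? L : Dead;
            case L:      return c == 'c' ? C : Dead;
            case C:
            case Digits: return (c >= '0' && c <= '9') ? Digits : Dead;
            case Dead:   return Dead;
            }
            return Dead;
        }

        State state_ = Start;
    };

    acceptor start() const { return acceptor(); }
};

// Hypothetical helper: feed a whole string through the acceptor.
bool matches_slc(const std::string& input) {
    SlcDfa dfa;
    // std::copy takes the output iterator by value and returns the advanced copy.
    SlcDfa::acceptor a = std::copy(input.begin(), input.end(), dfa.start());
    return accepted(a);
}
```

Because `std::copy` takes its output iterator by value, this sketch reads the result from the iterator that `std::copy` returns. For the quoted usage (`copy(..., a)` followed by `accepted(a)`) to work as written, the acceptor would presumably have to share its state with its copies, for example through a pointer to the automaton.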

This may be close to what I want. To make it clearer, let me refer you to
the attached PCCTS grammar, cflowCmmPmsZomSafe.g. I would like to
replace the TOK_SLC and TOK_PRED instances in the input language
with the output of a parser for Straight Line Code
and PREDicate expressions. The reason is that I want to
test a control flow normalizer on a simple language containing only
the syntax the normalizer needs (after all, expressions and straight line
code do not affect the control flow). Once the normalizer is verified on
the simple language, I can test it on the full language by replacing
TOK_SLC and TOK_PRED with non-terminals.

I hope that's clear.
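
To make the quoted chaining idea concrete for the token set of the grammar below, here is a minimal C++ sketch in which a lexer stage forwards tokens to a downstream acceptor for the control-flow language. Everything named here (`Tok`, `CflowAcceptor`, `LexerAcceptor`, `accepts_cflow`) is hypothetical. The sketch also assumes whitespace-separated words, uses plain `put` calls instead of the output-iterator wrapping, and signals end of input with an explicit `finalize()` rather than the `"@"` TOK_EOF token.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical token kinds, mirroring the #token definitions in the grammar.
enum class Tok { If, EndIf, Slc, Pred, Eof, Bad };

// Downstream stage: accepts ( "if" pred stmtList "endif" | slc )* over tokens.
class CflowAcceptor {
public:
    void put(Tok t) {
        if (done_) { ok_ = false; return; }   // nothing may follow end-of-stream
        if (expect_pred_) {                   // "if" must be followed by a predicate
            ok_ = ok_ && (t == Tok::Pred);
            expect_pred_ = false;
            return;
        }
        switch (t) {
        case Tok::If:    ++depth_; expect_pred_ = true; break;
        case Tok::EndIf: if (depth_ == 0) ok_ = false; else --depth_; break;
        case Tok::Slc:   break;
        case Tok::Eof:   done_ = true; if (depth_ != 0) ok_ = false; break;
        default:         ok_ = false; break;  // stray predicate or unknown word
        }
    }
    bool accepted() const { return ok_ && done_; }

private:
    int depth_ = 0;            // number of currently open "if" statements
    bool expect_pred_ = false;
    bool done_ = false;
    bool ok_ = true;
};

// Upstream stage: groups characters into words and emits tokens to its sink.
class LexerAcceptor {
public:
    explicit LexerAcceptor(CflowAcceptor& sink) : sink_(&sink) {}

    void put(char c) {
        if (std::isspace(static_cast<unsigned char>(c))) flush();
        else word_ += c;
    }
    // Flush any pending word, then send the end-of-stream token through.
    void finalize() { flush(); sink_->put(Tok::Eof); }

private:
    void flush() {
        if (word_.empty()) return;
        sink_->put(classify(word_));
        word_.clear();
    }
    static bool prefix_then_digits(const std::string& w, const std::string& p) {
        if (w.size() <= p.size() || w.compare(0, p.size(), p) != 0) return false;
        for (std::size_t i = p.size(); i < w.size(); ++i)
            if (!std::isdigit(static_cast<unsigned char>(w[i]))) return false;
        return true;
    }
    static Tok classify(const std::string& w) {
        if (w == "if") return Tok::If;
        if (w == "endif") return Tok::EndIf;
        if (prefix_then_digits(w, "slc")) return Tok::Slc;
        if (prefix_then_digits(w, "pred")) return Tok::Pred;
        return Tok::Bad;
    }

    CflowAcceptor* sink_;  // held by pointer so copies of the lexer share one sink
    std::string word_;
};

// Hypothetical driver: run the two-stage pipeline over a whole input string.
bool accepts_cflow(const std::string& text) {
    CflowAcceptor parser;
    LexerAcceptor lexer(parser);
    for (char c : text) lexer.put(c);
    lexer.finalize();
    return parser.accepted();
}
```

Under these assumptions, replacing TOK_SLC or TOK_PRED with richer syntax would mean putting another transforming stage in front of `CflowAcceptor` that reduces a straight-line-code or predicate phrase to a single `Tok::Slc` or `Tok::Pred`, leaving the control-flow stage unchanged.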


//ChangeLog:
// 2000-10-22.1100
// WHAT:
// copied from ../cflowCRC/cflowCRC.g
// WHY:
// To test use cmmPms_zomSafe garbage collection
#header
<<
#include "headers_ast.hpp"
>>
<<
#include "mainBody.hpp"
  int
main(int argc,char* argv[])
  {
  ; int rc=mainBody(argc,argv)
  ; return rc
  ;}
>>
//Special End of file token:
#token TOK_EOF "@"
#token "[\n]" <<skip();newline();>> //skip newline
#token "[\ \t]+" <<skip();>> //skip whitespace
#token TOK_IF "if"
#token TOK_ENDIF "endif"
#token TOK_SLC "slc[0-9]+"
#token TOK_PRED "pred[0-9]+"
#token TOK_ENDSTMT ";"

class cflowCmmPmsZomSafeParser
  {
    <<
    public:
      static const unsigned ConstLLK=1;
    >>
    start_parse > [cflow_root a_lhs]
      : << ProxMakeBtm<cflow_ast,SubjTop_ast> make_lhs
         ; $a_lhs = make_lhs
         ;>>

        <<stmtList_root a_rhs
        ;>>
        stmtList_parse >[a_rhs]
        <<$a_lhs->put_list(a_rhs)
        ;>>
      ;

    stmtList_parse >[stmtList_root a_lhs]
      : << ProxMakeBtm<stmtList_ast, SubjTop_ast> make_lhs
         ; $a_lhs = make_lhs
         ;>>

        (
          <<absStmt_root a_rhs
          ;>>
          absStmt_parse >[a_rhs]
          <<$a_lhs->push_back(a_rhs)
          ;>>
        )*
      ;

    absStmt_parse >[absStmt_root a_lhs]
      :
        <<ifStmt_root a_rhs
        ;>>
        ifStmt_parse >[a_rhs]
        <<$a_lhs = a_rhs
        ;>>
      |
        <<slcStmt_root a_rhs
        ;>>
        slcStmt_parse >[a_rhs]
        <<$a_lhs = a_rhs
        ;>>
      ;

    ifStmt_parse >[ifStmt_root a_lhs]
      : << ProxMakeBtm<ifStmt_ast, SubjTop_ast> make_lhs
         ; $a_lhs = make_lhs
         ; stmtList_root a_stmtList
         ;>>
        TOK_IF p:TOK_PRED
        stmtList_parse >[a_stmtList]
        TOK_ENDIF
        <<$a_lhs->put_pred($p->getText())
        ; $a_lhs->put_body(a_stmtList)
        ;>>
      ;

    slcStmt_parse >[slcStmt_root a_lhs]
      : << ProxMakeBtm<slcStmt_ast, SubjTop_ast> make_lhs
         ; $a_lhs = make_lhs
         ;>>
        slc:TOK_SLC
        <<$a_lhs->put_slc($slc->getText())
        ;>>
      ;

  }//end cflowCmmPmsZomSafeParser class

