Subject: Re: [Boost-users] Spirit Newbie (balanced parentheses)
From: OvermindDL1 (overminddl1_at_[hidden])
Date: 2010-04-16 21:07:26


On Fri, Apr 16, 2010 at 11:41 AM, EricB <eric.britz_at_[hidden]> wrote:
> Hi all
>
> I'm trying to use Spirit to write a parser for a simple language.
> The last time I had to do this I was in school (more than 15 years ago).
> I managed to get something that works, but it does not handle "balanced
> parentheses".
> I know this is a common problem, but after searching for every possible
> spelling of "balanced parentheses" I did not find anything (that I
> could understand).
>
> The files I want to parse are made of ASCII characters. They can contain
> commands that should be interpreted (replacement).
> Commands always start with "^!" (command switch).
>
> An example of a file content:
>
>        text text text ^!for-each( A )( B ) text text text text
>
> I want to send the "text" strings to the std::cout and detect the
> commands "^!for-each( A )( B )" in order to process them.
> At the moment I only have one command "for-each"
> Where:
> A : is a query that can contain /'" and balanced ()
> B : is the text to be sent to output for each result of the query. This
> can be a 'script' meaning text + commands, for example:
>        aa bb ^!for-each(C)(D) tt yy uu
> therefore B can also contain balanced ().
>
> The rules I wrote do not handle balanced () very well.
>
> For example the following command should be parsed successfully:
>
> ^!for-each( a/b/c()/e[]/d )
> (
> item [name] do ^!for-each(sub/text()) ( print() )
> )
>
> I need some advice. Thank you
>
> So here are my rules
>      {
>          boost::spirit::chlit<>     LPAREN('(');
>          boost::spirit::chlit<>     RPAREN(')');
>          boost::spirit::strlit<>     CMDSWITCH("^!");
>
>          script // main rule
>            = * ( (boost::spirit::anychar_p - CMDSWITCH)
>              | command
>              );
>          command
>            = boost::spirit::discard_first_node_d[
>              CMDSWITCH
>              >> (for_each
>                | boost::spirit::eps_p // for error reporting
>              )
>            ];
>          for_each
>            = boost::spirit::discard_first_node_d[
>              boost::spirit::as_lower_d["for-each"]
>                >> *boost::spirit::space_p
>                >> query
>                >> *boost::spirit::space_p
>                >> subscript
>            ];
>          query
>            = boost::spirit::inner_node_d[
>              LPAREN >> *(boost::spirit::anychar_p - ( RPAREN )) >> RPAREN
>            ];
>          subscript
>            = boost::spirit::inner_node_d[
>              LPAREN >> *(
>                (boost::spirit::anychar_p - ( CMDSWITCH | RPAREN ))
>                |command
>              ) >> RPAREN
>            ];
>        }

Perhaps something like (untested, not currently at home, but should be valid):
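    {
        namespace qi = boost::spirit::qi;
        namespace ascii = boost::spirit::ascii; // assuming ascii encoding
        using qi::eps; using qi::raw; using qi::skip; // I am lazy
        using ascii::char_; using ascii::no_case; using ascii::space;

        script // main rule
            = *( command
               | char_
               )
            ;

        command
            = "^!"
                >> ( for_each
                   | eps // why an eps, why not just fail out?
                   )
            ;

        for_each
            = no_case["for-each"]
                >> skip(space)
                   [ query
                     >> subscript
                   ]
            ;

        query
            = '('
                >> raw[stringparen_inner]
                >> ')'
            ;

        subscript
            = '('
                >> *( command      // command eats the possible "^!" first, no need to test
                    | subscript    // a nested, balanced ( ... ) inside the text
                    | ~char_("()") // any other character
                    )
                >> ')'
            ;

        stringparen_inner
            = *( ('(' >> stringparen_inner >> ')') // nested, balanced ( ... )
               | ~char_("()")                      // anything that is not a paren
               )
            ;
    }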

Do note, the above is written for the latest version of Spirit (Qi),
whereas yours was written for the ancient, slower, and more verbose
Classic version. That should handle nested parentheses and all just
fine.
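
If it helps to see the recursion without any Spirit machinery, here is a minimal plain C++ sketch of the same balanced-parentheses idea. The helper names skip_balanced and parse_group are made up for illustration and are not part of Spirit; the sketch only covers the "( ... )" matching, not the "^!" command handling.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not a Spirit API). Starting at s[pos], consume the
// longest run of text in which every '(' has a matching ')'.
// Returns the index where scanning stopped: an unmatched ')', an unmatched
// '(', or s.size() when the input ran out.
std::size_t skip_balanced(const std::string& s, std::size_t pos)
{
    assert(pos <= s.size());
    while (pos < s.size() && s[pos] != ')') {
        if (s[pos] == '(') {
            // Recurse into the nested group and require its closing ')'.
            std::size_t close = skip_balanced(s, pos + 1);
            if (close >= s.size() || s[close] != ')')
                return pos;          // unbalanced '(': stop in front of it
            pos = close + 1;         // step over the whole "( ... )"
        } else {
            ++pos;                   // ordinary character
        }
    }
    return pos;
}

// Hypothetical helper: parse one "( ... )" group starting at s[pos].
// On success, stores the text between the outer parentheses in `inner`,
// moves `pos` past the closing ')', and returns true.
bool parse_group(const std::string& s, std::size_t& pos, std::string& inner)
{
    if (pos >= s.size() || s[pos] != '(')
        return false;
    std::size_t end = skip_balanced(s, pos + 1);
    if (end >= s.size() || s[end] != ')')
        return false;                // never found the matching ')'
    inner = s.substr(pos + 1, end - pos - 1);
    pos = end + 1;
    return true;
}
```

Calling parse_group twice in a row (skipping whitespace in between) would pick up the query and then the subscript of a for-each; the nested "()" inside either one is swallowed by the recursive call rather than ending the group early.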

