Boost Users :
Subject: Re: [Boost-users] Spirit Newbie (balanced parentheses)
From: OvermindDL1 (overminddl1_at_[hidden])
Date: 2010-04-16 21:07:26
On Fri, Apr 16, 2010 at 11:41 AM, EricB <eric.britz_at_[hidden]> wrote:
> Hi all
>
> I'm trying to use Spirit to write a parser for a simple language.
> The last time I had to do this I was in school (more than 15 years ago).
> I managed to get something working, but it does not handle "balanced
> parentheses".
> I know this is a common problem, but after searching for every possible
> spelling of "balanced parentheses" I did not find anything (that I
> could understand).
>
> The files I want to parse are made of ASCII characters. They can contain
> commands that should be interpreted (replacement).
> Commands always start with "^!" (command switch).
>
> An example of a file content:
>
>     text text text ^!for-each( A )( B ) text text text text
>
> I want to send the "text" strings to the std::cout and detect the
> commands "^!for-each( A )( B )" in order to process them.
> At the moment I only have one command "for-each"
> Where:
> A : is a query that can contain /'" and balanced ()
> B : is the text to be sent to output for each result of the query. This
> can be a 'script' meaning text + commands, for example:
>     aa bb ^!for-each(C)(D) tt yy uu
> therefore B can also contain balanced ().
>
> The rules I wrote do not handle balanced () very well.
>
> For example the following command should be parsed successfully:
>
> ^!for-each( a/b/c()/e[]/d )
> (
> item [name] do ^!for-each(sub/text()) ( print() )
> )
>
> I need some advice. Thank you
>
> So here are my rules
>    {
>      boost::spirit::chlit<>   LPAREN('(');
>      boost::spirit::chlit<>   RPAREN(')');
>      boost::spirit::strlit<>   CMDSWITCH("^!");
>
>      script // main rule
>       = * ( (boost::spirit::anychar_p - CMDSWITCH)
>        | command
>        );
>      command
>       = boost::spirit::discard_first_node_d[
>        CMDSWITCH
>        >> (for_each
>         | boost::spirit::eps_p // for error reporting
>        )
>       ];
>      for_each
>       = boost::spirit::discard_first_node_d[
>        boost::spirit::as_lower_d["for-each"]
>         >> *boost::spirit::space_p
>         >> query
>         >> *boost::spirit::space_p
>         >> subscript
>       ];
>      query
>       = boost::spirit::inner_node_d[
>        LPAREN >> *(boost::spirit::anychar_p - ( RPAREN )) >> RPAREN
>       ];
>      subscript
>       = boost::spirit::inner_node_d[
>        LPAREN >> *(
>         (boost::spirit::anychar_p - ( CMDSWITCH | RPAREN ))
>         |command
>        ) >> RPAREN
>       ];
>     }
Perhaps something like (untested, not currently at home, but should be valid):
{
    using namespace boost::spirit::qi;    // I am lazy
    using namespace boost::spirit::ascii; // assuming ascii encoding

    script // main rule: any mix of commands and plain characters
        = *( command
           | char_
           )
        ;
    command
        = "^!"
        >> ( for_each
           | eps // why an eps, why not just fail out?
           )
        ;
    for_each
        = no_case["for-each"]
        >> skip(space)
           [ query
          >> subscript
           ]
        ;
    query
        = '('
        >> raw[*stringparen_inner]
        >> ')'
        ;
    subscript
        = '('
        >> *( command // command eats the possible "^!" first, no need to test
            | stringparen_inner
            )
        >> ')'
        ;
    stringparen_inner // one balanced "( ... )" group, or one non-paren character
        = ('(' >> *stringparen_inner >> ')')
        | ~char_("()")
        ;
}
Do note, the above is written in the latest version of Spirit (Qi),
whereas yours was written in the ancient, slower, and more verbose
version (Spirit Classic). That should handle nested parentheses and all
just fine.
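To see what the recursive rule is doing without any Spirit machinery, here is a small hand-written sketch of the same idea in plain C++. It is not Spirit code and not a drop-in replacement for the grammar above. The names skip_balanced, parse_group, ForEach and parse_for_each are made up for this illustration. It assumes the command is spelled "^!for-each" (case-insensitive), followed by two parenthesised groups with optional whitespace before each.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// inner := *( '(' inner ')' | any character other than '(' and ')' )
// Returns the index one past the longest balanced run starting at pos.
std::size_t skip_balanced(const std::string& s, std::size_t pos)
{
    while (pos < s.size() && s[pos] != ')') {
        if (s[pos] == '(') {
            const std::size_t close = skip_balanced(s, pos + 1);
            if (close >= s.size() || s[close] != ')')
                return pos;  // no matching ')': stop in front of this '('
            pos = close + 1;
        } else {
            ++pos;
        }
    }
    return pos;
}

// group := '(' inner ')'
// On success stores the text between the outer parentheses in out
// and advances pos past the closing ')'.
bool parse_group(const std::string& s, std::size_t& pos, std::string& out)
{
    if (pos >= s.size() || s[pos] != '(') return false;
    const std::size_t close = skip_balanced(s, pos + 1);
    if (close >= s.size() || s[close] != ')') return false;
    out = s.substr(pos + 1, close - pos - 1);
    pos = close + 1;
    return true;
}

// Hypothetical result type: the raw text of the two groups.
struct ForEach {
    std::string query;
    std::string body;
};

void skip_spaces(const std::string& s, std::size_t& pos)
{
    while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos])))
        ++pos;
}

// Matches "^!for-each" ws* group ws* group, starting at pos.
bool parse_for_each(const std::string& s, std::size_t pos, ForEach& out)
{
    const std::string keyword = "^!for-each";
    if (pos > s.size() || s.size() - pos < keyword.size()) return false;
    for (std::size_t i = 0; i < keyword.size(); ++i) {
        const int c = std::tolower(static_cast<unsigned char>(s[pos + i]));
        if (c != keyword[i]) return false;
    }
    pos += keyword.size();

    ForEach result;
    skip_spaces(s, pos);
    if (!parse_group(s, pos, result.query)) return false;
    skip_spaces(s, pos);
    if (!parse_group(s, pos, result.body)) return false;
    out = result;
    return true;
}
```

The body is returned as raw text, so a nested "^!for-each" inside it is left for a second pass (scan the body for "^!" and call parse_for_each again at that position). The Spirit grammar instead recognises nested commands directly inside subscript.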