
Boost Users :

From: elviin (elviin_at_[hidden])
Date: 2006-01-26 12:17:29


My original problem involves the following code.
I'd like to parse a C++ function definition. I'm using file_iterator
to read a C++ file.
The parser that parses the entire C++ source is called default_parser.
The parser that matches only a C++ method is called method_parser. I
know it needs some tuning, but I am using this problem to improve my
knowledge of Spirit.

supress_code is the parser for whatever always precedes a C++ function
definition (;, }, #define, */, //).
residue_parser matches everything that is neither a C++ function
definition nor supress_code.

e.g.
I'd like to parse:
template<T,X> void function(ddd::ssss<ddd> ss, dfdf dfdf) {
}

:::::::::::::::: part of the grammar ::::::::::::::::
           method_parser =
                   !template_parser
                >> !(*space_p >> str_p("inline") >> *space_p)
                >> argument_parser
                >> identifier_name
                >> ch_p('(')
                >> !list_p(argument_parser, ch_p(','))
                >> ch_p(')')
                >> !member_initialization_p
                >> !(*space_p >> (str_p("const") | str_p("throw()"))) >> *space_p
                >> cpp_code_block_p
               ;

           residue_parser = *(anychar_p - (supress_code >> method_parser));

           default_parser =
                   token_node_d[residue_parser]
               %
                   (token_node_d[supress_code] >> token_node_d[method_parser])
               ;

I also defined the following in the grammar:
        rule<ScannerT, parser_context<>, parser_tag<supress_codeID> > supress_code;
        rule<ScannerT, parser_context<>, parser_tag<residue_parserID> > residue_parser;
        rule<ScannerT, parser_context<>, parser_tag<method_parserID> > method_parser;
        rule<ScannerT, parser_context<>, parser_tag<default_parserID> > default_parser;

        // start() returns the default (top-level) parser
        rule<ScannerT, parser_context<>, parser_tag<default_parserID> > const&
        start() const { return default_parser; }

I then call the tree_to_xml function to generate XML from the tree. The
output is the following XML:

<parsetree version="1.0">
    <parsenode rule="default_parser">
        <parsenode rule="residue_parser">
        </parsenode>
        <parsenode rule="supress_code">
        </parsenode>
        <parsenode rule="method_parser">
        </parsenode>
        <parsenode rule="residue_parser">
        </parsenode>
        <parsenode rule="supress_code">
        </parsenode>
        <parsenode rule="method_parser">
        </parsenode>
        ...........
        .........
        .....
        ...
        <parsenode rule="residue_parser">
        </parsenode>
        <parsenode rule="supress_code">
        </parsenode>
        <parsenode rule="method_parser">
        </parsenode>
        <parsenode rule="residue_parser">
        </parsenode>
    </parsenode>
</parsetree>
Parse succeeded!

The final message confirms that the file was parsed successfully.
The structure of the tree is correct and exactly what I want, but
the text that should appear between <value> and </value> is missing.

I'm expecting something like this:

        <parsenode rule="method_parser">
            <value> "template<T,X> void function(ddd::ssss<ddd> ss,
dfdf dfdf) {
}"
            </value>
        </parsenode>

On 26/01/06, Hartmut Kaiser <hartmut.kaiser_at_[hidden]> wrote:
>
> Elviin wrote:
>
> > I'd like to use a non-AST tree. I'm also using the tree_to_xml
> > function to generate XML, plus my own grammar. The problem is
> > with the token_node_d directive. I assumed that this directive
> > merges the parsed text into a single node. At the moment my parser
> > works fine for me except for the tree, because the value elements
> > are missing from the XML structure. So it seems that the whole
> > tree is empty :/
> >
> > I can reproduce that state with the example file parse_tree_calc1.cpp:
>
> IIRC, the token_node_d or leaf_node_d cannot be used with a rule inside the
> []. But I'm not sure anymore about the rationale behind this. I'm CC'ing to
> Dan Nuffer, he's the original author of this code, perhaps he has some
> additional information.
>
> Regards Hartmut
>
> >
> > The correct output from the tree_to_xml function when parsing "1"
> > is the following:
> >
> > <?xml version="1.0" encoding="ISO-8859-1"?>
> > <!DOCTYPE parsetree SYSTEM "parsetree.dtd">
> > <!-- 1 -->
> > <parsetree version="1.0">
> > <parsenode rule="expression">
> > <parsenode rule="term">
> > <parsenode rule="factor">
> > <parsenode rule="integer">
> > <parsenode rule="integer">
> > <value>1</value> <<<<======== it's OK here
> > </parsenode>
> > </parsenode>
> > </parsenode>
> > </parsenode>
> > </parsenode>
> > </parsetree>
> > parsing succeeded
> >
> > ... and the corresponding code/rules:
> >
> > integer = lexeme_d[ token_node_d[ (!ch_p('-') >> +digit_p) ] ];
> > factor = integer
> > | '(' >> expression >> ')'
> > | ('-' >> factor);
> > term = factor >>
> > *( ('*' >> factor)
> > | ('/' >> factor)
> > );
> > expression = term >>
> > *( ('+' >> term)
> > | ('-' >> term)
> > );
> >
> >
> >
> >
> > But if I move the token_node_d directive to another position,
> > so that it resembles the problem in my grammar:
> >
> > integer = lexeme_d[ (!ch_p('-') >> +digit_p) ];
> > factor = token_node_d[integer]
> > | '(' >> expression >> ')'
> > | ('-' >> factor);
> > term = factor >>
> > *( ('*' >> factor)
> > | ('/' >> factor)
> > );
> > expression = term >>
> > *( ('+' >> term)
> > | ('-' >> term)
> > );
> >
> >
> > ... then the XML structure looks like this. The "value"
> > element is missing:
> >
> > <?xml version="1.0" encoding="ISO-8859-1"?>
> > <!DOCTYPE parsetree SYSTEM "parsetree.dtd">
> > <!-- 1 -->
> > <parsetree version="1.0">
> > <parsenode rule="expression">
> > <parsenode rule="term">
> > <parsenode rule="factor">
> > <parsenode rule="integer">
> >
> > <<<<======== it is not OK here
> > </parsenode>
> > </parsenode>
> > </parsenode>
> > </parsenode>
> > </parsetree>
> > parsing succeeded
> >
> >
> > Can anyone explain to me why it is not possible to use the
> > token_node_d directive like this: token_node_d[integer]? I
> > just want to group all the parsed text (integer) in one node.
> > Maybe I have to change the policy or create the proper typedefs?
> >
> > The issue is one of hierarchy: I want to group the parsed
> > text only at the higher levels of the parsing process.
> >
> > Thank you.
> >
> > Elviin
> >
> > _______________________________________________
> > Boost-users mailing list
> > Boost-users_at_[hidden]
> > http://lists.boost.org/mailman/listinfo.cgi/boost-users
>

