From: Hartmut Kaiser (hartmutkaiser_at_[hidden])
Date: 2004-12-27 16:21:45
Dave Handley wrote:
> The grammar for Spirit was (in a slightly cut down form):
> keyword =
>     str_p( "Group" ) |
>     str_p( "Separator" ) |
>     ... ;
> comment = lexeme_d[
>     ch_p( '#' ) >> *( ~chset_p( "\n\r" ) ) >> chset_p( "\n\r" ) ];
> stringLiteral = lexeme_d[
>     ch_p( '\"' ) >> *( ~chset_p( "\"\n\r" ) ) >> chset_p( "\"\n\r" ) ];
> word = lexeme_d[
>     ( alpha_p | ch_p( '_' ) ) >> *( alnum_p | ch_p( '_' ) ) ];
> floatNum = ... ;
> vrml = *( keyword | comment | stringLiteral | word | floatNum );
> I've cut down the keywords because there are over 60 of them.
> I would be interested to know if there was any obvious ways
> to optimise the Spirit parser.
At least you could have used the symbol parser (look here:
http://www.boost.org/libs/spirit/doc/symbols.html), which is a deterministic
parser designed especially for keyword matching. I'm pretty sure that this
alone would speed up your test case a lot, because your keyword rule above
(if it contains 60 alternatives) is a _very_ inefficient way to recognise
keywords that are known in advance.
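To illustrate the point (Spirit's symbols<> builds a ternary search tree internally, so the input is matched in a single deterministic pass instead of being retried against each alternative), here is a plain-C++ sketch of the difference; the function names and the std::set stand-in are illustrative, not from the original post:

```cpp
#include <cassert>
#include <set>
#include <string>

// What the quoted rule does: try each alternative in turn.
// keyword = str_p( "Group" ) | str_p( "Separator" ) | ... ;
bool is_keyword_linear(const std::string& s) {
    static const char* kws[] = { "Group", "Separator" /* , ...60 more */ };
    for (const char* k : kws)
        if (s == k) return true;   // up to 60 failed comparisons before a hit
    return false;
}

// What symbols<> gives you: one lookup in a prebuilt structure
// (symbols<> uses a ternary search tree; std::set stands in here).
bool is_keyword_symbols(const std::string& s) {
    static const std::set<std::string> kws = { "Group", "Separator" /* , ... */ };
    return kws.count(s) != 0;
}
```

With 60+ keywords the alternative chain re-scans the input for every failed branch, while the symbol table decides in a single pass over the word, which is why swapping the chain for symbols<> usually pays off.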
> 1) An extensible object oriented approach to writing the
> entire system.
> This can be very useful if you want to handle something like,
> say, the parsing of a filename in the lexer. You can simply
> write a new token type that will split the incoming filename
> into path, extension, name, etc. This can massively simplify
> the production of a final parser - allowing you to deal with
> grammar issues at that stage.
> 2) DFAs can be created at run-time, or eventually compile-time.
> 3) The code is considerably less obfuscated than the code
> produced by
> flex. Don't get me wrong, I like flex a lot, but the
> pre-processor directives and look-up tables in the generated
> code are pretty unreadable IMHO.
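As a sketch of point 1 above, a token type that decomposes a filename at lex time might look like this (FileToken and its layout are invented here for illustration; the actual lexer's token classes are not shown in the post):

```cpp
#include <cassert>
#include <string>

// Hypothetical token type: splits the matched lexeme into path, name
// and extension on construction, so the parser never has to.
struct FileToken {
    std::string path, name, ext;

    explicit FileToken(const std::string& full) {
        // Split off the directory part at the last path separator.
        std::string::size_type slash = full.find_last_of("/\\");
        path = (slash == std::string::npos) ? "" : full.substr(0, slash);
        std::string base =
            (slash == std::string::npos) ? full : full.substr(slash + 1);
        // Split the base name at the last dot.
        std::string::size_type dot = base.find_last_of('.');
        name = (dot == std::string::npos) ? base : base.substr(0, dot);
        ext  = (dot == std::string::npos) ? ""   : base.substr(dot + 1);
    }
};
```

The grammar then deals with a ready-made structured token rather than raw characters, which is the "massively simplify the final parser" effect described above.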
Is there any documentation available?
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk