From: Dave Handley (dave_at_[hidden])
Date: 2004-12-27 20:05:54


Dick Hadsell wrote:

<snip>
>disappointed, because I was hoping to use Spirit or something like it,
>to give me some independence from Lex/Yacc's dictatorial control of the
>input source.

>Your project sounds like it would solve the worst of the problems I have
>in trying to move to Spirit.

The current API we are working with templatises the input, so in theory it
will work with any character-like input, in much the same way as
std::basic_string<>. We are still working on getting the DFA to work
generically, rather than just explicitly with char and wchar_t, but I think
we should have some success. At present the lexer is strongly typed on this
character type, in the same way as std::basic_string<>, but I don't
necessarily see that as a problem.
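
As a rough illustration of what "strongly typed on the character type" means,
here is a minimal sketch. It is not the API under development: token_kind,
basic_token and tokenize are hypothetical names, and a hand-written scanner
stands in for the generated DFA. The point is only that the token type and the
scanner both follow CharT, as std::basic_string<CharT> does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds, just enough for the illustration.
enum class token_kind { number, word };

// Hypothetical token type, parameterised on the character type in the same
// way as std::basic_string<CharT>.
template <typename CharT>
struct basic_token {
    token_kind kind;
    std::basic_string<CharT> text;
};

// Character classes used by the stand-in scanner.
enum class char_class { space, digit, other };

template <typename CharT>
char_class classify(CharT c) {
    if (c == CharT(' ') || c == CharT('\t') || c == CharT('\n'))
        return char_class::space;
    if (c >= CharT('0') && c <= CharT('9'))
        return char_class::digit;
    return char_class::other;
}

// Splits the input into maximal runs of digits (number) and of other
// non-space characters (word). Whitespace separates tokens and is dropped.
template <typename CharT>
std::vector<basic_token<CharT>> tokenize(const std::basic_string<CharT>& input) {
    std::vector<basic_token<CharT>> out;
    std::size_t i = 0;
    while (i < input.size()) {
        const char_class cls = classify(input[i]);
        if (cls == char_class::space) { ++i; continue; }
        const std::size_t start = i;
        while (i < input.size() && classify(input[i]) == cls) ++i;
        assert(i > start);  // every token consumes at least one character
        basic_token<CharT> t;
        t.kind = (cls == char_class::digit) ? token_kind::number : token_kind::word;
        t.text = input.substr(start, i - start);
        out.push_back(t);
    }
    return out;
}
```

Calling tokenize(std::string("abc 123")) yields tokens holding std::string,
and tokenize(std::wstring(L"abc 123")) yields tokens holding std::wstring. The
two results are unrelated types, which is the strong typing referred to above.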

>I broke up the problem into 3 steps. In the first phase the program
>uses a Spirit grammar to generate a list of tokens with info similar to
<snip>

Depending on the type of grammar, I think you should easily achieve a 6x or
greater performance boost. If the input has long repetitive sections, for
example long lists of numbers or similar, you could probably optimise that
stage further so that the lexer does most of the work. I'm not sure how well
this would work with Spirit until I try it, but it should be possible to
switch control part way through a parse to a very quick and efficient parser
that just throws the tokens at a visitor until a particular section is
finished. I expect this could be done by writing a new parser type in Spirit.
The idea would be to process long lists of numbers or strings or similar,
where those lists have a clearly defined end token.
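
To make the "throw the tokens at a visitor until the end token" idea concrete,
here is a minimal sketch in plain C++. It does not use Spirit and is not a
Spirit parser type; list_token_kind, list_token, consume_list and sum_visitor
are all hypothetical names, and the token stream is assumed to be already
lexed into a vector.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds for a list such as "1, 20 ]".
enum class list_token_kind { number, comma, end_list, other };

struct list_token {
    list_token_kind kind;
    std::string text;
};

// Hypothetical fast path: starting at pos, hands every token to the visitor
// until a token of kind end_kind is seen, with no grammar rules involved.
// Returns the index just past the end token (or toks.size() if none found),
// so that the ordinary parser can resume from there.
template <typename Visitor>
std::size_t consume_list(const std::vector<list_token>& toks, std::size_t pos,
                         list_token_kind end_kind, Visitor& visit) {
    assert(pos <= toks.size());  // precondition: pos is a valid position
    while (pos < toks.size() && toks[pos].kind != end_kind) {
        visit(toks[pos]);
        ++pos;
    }
    if (pos < toks.size()) ++pos;  // step over the end token itself
    return pos;
}

// Example visitor: sums the numbers and ignores separators.
struct sum_visitor {
    long total = 0;
    void operator()(const list_token& t) {
        if (t.kind == list_token_kind::number) total += std::stol(t.text);
    }
};
```

The surrounding parser would call consume_list when it recognises the start of
a long list, then carry on with normal parsing from the returned position.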

Dave Handley

