From: Richard Hadsell (hadsell_at_[hidden])
Date: 2004-12-27 12:24:03


Dave Handley wrote:

> A colleague and I are working on a fast lexical analyser designed to
> work alongside Spirit. We currently have some working prototypes, and
> as such I thought that I would gauge interest in this library. We
> plan to produce a DFA based lexical analyser that provides output as a
> set of iterable polymorphic flyweighted tokens. These could then be
> provided as input to Spirit instead of character iterators (albeit
> with the addition of a token_p parser in Spirit). The objectives of
> our project are as follows:
>
> 1) As fast as lex/flex.
> 2) Simple to use
> 3) Rules to generate tokens can be provided both statically and
> dynamically. Static definition would be through an offline
> pre-process stage much like lex.
> 4) Easy to interface with Spirit.

I am looking for exactly what you propose. I have been trying to
replace my Lex/Yacc parser with Spirit, but performance has turned out
to be the major problem.

I broke up the problem into 3 phases. In the first phase the program
uses a Spirit grammar to generate a list of tokens with info similar to
the info generated by Lex. In phase 2 another Spirit grammar generates
a parse tree from the tokens with rules equivalent to the Yacc rules.
Phase 3 evaluates the parse tree with results equivalent to the Yacc
actions.
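
A minimal sketch of the three-phase structure described above, assuming a
toy grammar of integers with + and *. It uses hand-written C++ in place of
the Spirit grammars, so it shows only the shape of the pipeline, not its
performance. All names (tokenize, parse, evaluate, run) are hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Phase 1: scan the text into a list of tokens (the role Lex plays).
enum class TokenKind { Number, Plus, Star, End };

struct Token {
    TokenKind kind;
    long value;  // meaningful only when kind == Number
};

std::vector<Token> tokenize(const std::string& text) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < text.size()) {
        unsigned char c = static_cast<unsigned char>(text[i]);
        if (std::isspace(c)) {
            ++i;
        } else if (std::isdigit(c)) {
            long v = 0;
            while (i < text.size() &&
                   std::isdigit(static_cast<unsigned char>(text[i]))) {
                v = v * 10 + (text[i] - '0');
                ++i;
            }
            tokens.push_back({TokenKind::Number, v});
        } else if (c == '+') {
            tokens.push_back({TokenKind::Plus, 0});
            ++i;
        } else if (c == '*') {
            tokens.push_back({TokenKind::Star, 0});
            ++i;
        } else {
            throw std::runtime_error("unexpected character");
        }
    }
    tokens.push_back({TokenKind::End, 0});  // sentinel, always last
    return tokens;
}

// Phase 2: build a parse tree from the tokens (the role of the Yacc rules).
struct Node {
    TokenKind kind;  // Number, Plus or Star
    long value;
    std::unique_ptr<Node> lhs, rhs;
};
using NodePtr = std::unique_ptr<Node>;

NodePtr make_node(TokenKind k, long v, NodePtr l, NodePtr r) {
    NodePtr n(new Node);
    n->kind = k;
    n->value = v;
    n->lhs = std::move(l);
    n->rhs = std::move(r);
    return n;
}

// primary := Number
NodePtr parse_primary(const std::vector<Token>& t, std::size_t& pos) {
    if (t[pos].kind != TokenKind::Number)
        throw std::runtime_error("expected a number");
    return make_node(TokenKind::Number, t[pos++].value, nullptr, nullptr);
}

// product := primary ('*' primary)*
NodePtr parse_product(const std::vector<Token>& t, std::size_t& pos) {
    NodePtr left = parse_primary(t, pos);
    while (t[pos].kind == TokenKind::Star) {
        ++pos;
        NodePtr right = parse_primary(t, pos);
        left = make_node(TokenKind::Star, 0, std::move(left), std::move(right));
    }
    return left;
}

// sum := product ('+' product)*
NodePtr parse(const std::vector<Token>& t) {
    std::size_t pos = 0;
    NodePtr left = parse_product(t, pos);
    while (t[pos].kind == TokenKind::Plus) {
        ++pos;
        NodePtr right = parse_product(t, pos);
        left = make_node(TokenKind::Plus, 0, std::move(left), std::move(right));
    }
    if (t[pos].kind != TokenKind::End)
        throw std::runtime_error("trailing input");
    return left;
}

// Phase 3: walk the tree and compute a result (the role of the Yacc actions).
long evaluate(const Node& n) {
    switch (n.kind) {
        case TokenKind::Number: return n.value;
        case TokenKind::Plus:   return evaluate(*n.lhs) + evaluate(*n.rhs);
        case TokenKind::Star:   return evaluate(*n.lhs) * evaluate(*n.rhs);
        default: throw std::logic_error("not an expression node");
    }
}

// Convenience wrapper that chains the three phases.
long run(const std::string& text) {
    std::vector<Token> tokens = tokenize(text);
    NodePtr tree = parse(tokens);
    return evaluate(*tree);
}
```

Calling run("2 + 3 * 4") passes the text through all three phases in order.
Keeping the phases separate like this is what makes it possible to time
Phase 1 on its own.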

After getting the bugs out, I compared the performance and found that
processing a test case of input scripts took 6 times as long in total
using this scheme. Since some of that time is spent running the virtual
machine that follows the parsing steps, and that part is unchanged, the
slowdown of the Spirit parsing itself is much worse than 6 times.

I found that much of the slowdown was due to Phase 1, the grammar that
scans the text input a la Lex. After a lot of tweaking I was able to
bring the total time down to a bit over 3 times that of the Lex/Yacc
parser I am trying to replace. However, this is still unacceptable. I am
disappointed, because I was hoping to use Spirit, or something like it,
to give me some independence from Lex/Yacc's dictatorial control of the
input source.

Your project sounds like it would solve the worst of the problems I have
in trying to move to Spirit.

-- 
Dick Hadsell			914-259-6320  Fax: 914-259-6499
Reply-to:			hadsell_at_[hidden]
Blue Sky Studios                http://www.blueskystudios.com
44 South Broadway, White Plains, NY 10601
