From: Hartmut Kaiser (hartmut.kaiser_at_[hidden])
Date: 2006-11-21 21:05:54


Sorry, the first message got sent out too early...

> David Abrahams wrote:
>
> > > Yes, and Slex is the other one
> >
> > Not to mention XPressive?
>
> Xpressive is not really usable as a lexer, and Eric is aware of that.
> I have a Wave lexer implemented with Xpressive here on my hard disk,
> and it functions well; it is only 3 orders of magnitude slower than, for
> instance, the re2c based one. The main reasons are:
>
> - no optimization between different regex's used for token
> representation (no internal NFA/DFA generation)
> - no way to tell which alternative matched if using regex's containing
> alternatives
>
> The first rules out using separate regex's, one for each token, the
> second one inhibits us from using one giant regex with alternatives...
>
> Both are probably just natural restrictions stemming from the fact that
> Xpressive is a regex library, not a lexer generator.
> The same issues would probably occur if we were trying to use
> Boost.Regex for this task.
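
To make the first point concrete, here is a minimal sketch of the "one regex
per token" approach. It uses std::regex instead of Xpressive purely for
illustration, and the token ids, the rule set and the tokenize() function are
hypothetical, not taken from Wave. The point is only that each rule is an
independent regex, so at every input position all of them may have to be
tried in turn, whereas a generated lexer walks one combined automaton.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token ids, for illustration only.
enum class TokenId { Identifier, Integer, Plus, Whitespace, Unknown };

struct Token {
    TokenId id;
    std::string text;
};

// One independent regex per token type. Nothing is shared between the
// rules, so each one is tried separately at every input position.
inline std::vector<Token> tokenize(const std::string& input) {
    static const std::vector<std::pair<TokenId, std::regex>> rules = {
        {TokenId::Identifier, std::regex("[A-Za-z_][A-Za-z0-9_]*")},
        {TokenId::Integer,    std::regex("[0-9]+")},
        {TokenId::Plus,       std::regex("\\+")},
        {TokenId::Whitespace, std::regex("[ \\t\\n]+")},
    };

    std::vector<Token> tokens;
    std::smatch m;
    auto pos = input.cbegin();
    while (pos != input.cend()) {
        TokenId best_id = TokenId::Unknown;
        std::ptrdiff_t best_len = 0;
        for (const auto& rule : rules) {
            // match_continuous anchors the match at the current position.
            if (std::regex_search(pos, input.cend(), m, rule.second,
                                  std::regex_constants::match_continuous)) {
                if (m.length(0) > best_len) {   // keep the longest match
                    best_len = m.length(0);
                    best_id = rule.first;
                }
            }
        }
        if (best_len == 0) best_len = 1;  // no rule matched: consume one char
        assert(best_len > 0);             // the loop always makes progress
        tokens.push_back({best_id, std::string(pos, pos + best_len)});
        pos += best_len;
    }
    return tokens;
}
```

Folding all rules into one regex with alternatives would avoid the inner loop,
but then the lexer would need to find out which alternative matched, which is
the second restriction mentioned above.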

FYI, I found my old timings of the different lexer types:

Timing results for the different lexer types included with Wave:

               Re2C             Slex              Xlex
===============================================================================
All C++ tokens, lexer gets instantiated for every C++ token
-------------------------------------------------------------------------------
1000 times     1.63[s]          2.08[s]           1047.60[s]
                                                   751.57[s] (hoisted regex_match struct)
===============================================================================
Regards Hartmut
