Subject: Re: [boost] [regex, xpressive] interesting(?) perf benchmark
From: John Maddock (boost.regex_at_[hidden])
Date: 2010-06-07 11:17:51


>> Just an idea: for static xpressive, couldn't you detect at compile-time
>> that the expression is truly regular, and use a DFA in that case?
>
> Oh, sure! Why don't you submit a DFA and I'll use it in xpressive? ;-)
>
> After nosing about on the internet a bit more, I found this interesting
> comparison:
>
> http://shootout.alioth.debian.org/u32/benchmark.php?test=regexdna&lang=all
>
> Here we see every language compared on how well it can perform on a
> particular regex task. The top-performer is <drumroll> Google's
> JavaScript V8 engine! Wow. C++ is in 5th place. The fastest C++ program
> submitted to the competition uses static xpressive <pats own back>. I'm
> not so upset about being beaten by V8. It adaptively improves its native
> codegen *at runtime*. What really bugs me is that we're skunked by a C
> library: Tcl. Grrrr. I've read a bit about Tcl's regex library; it does
> what Mathias is suggesting: implements both a DFA and an NFA, analyzes
> the pattern and chooses which to use. I've known for a while that this
> is the way forward, but I just don't have the time for that. (Wasn't
> there a GSoC project to do that for Boost.Regex?)

My memory fails me... In any case, the regex GSoC project never got off the
ground.

Nosing around the entries to the competition, I wonder how much of the
performance difference is down to the regex engine, and how much to other
tricks the entries use: for example, I notice the top C program uses a thread
pool to conduct everything in parallel. Cheating, I say! ;-)
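
To make the parallelism point concrete, here is a minimal sketch of how
independent pattern counts can be run concurrently without touching the regex
engine at all. It is not taken from any benchmark entry: std::regex and
std::async are assumed as stand-ins for the entry's regex library and thread
pool, and the names count_matches and count_all_parallel are hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <future>
#include <iterator>
#include <regex>
#include <string>
#include <vector>

// Count the non-overlapping matches of one pattern in the text.
std::size_t count_matches(const std::string& text, const std::string& pattern) {
    const std::regex re(pattern);
    return static_cast<std::size_t>(std::distance(
        std::sregex_iterator(text.begin(), text.end(), re),
        std::sregex_iterator()));
}

// Launch one asynchronous task per pattern, then collect the counts in
// the same order as the patterns. The text is shared read-only.
std::vector<std::size_t> count_all_parallel(
        const std::string& text, const std::vector<std::string>& patterns) {
    std::vector<std::future<std::size_t>> jobs;
    for (const auto& p : patterns) {
        jobs.push_back(std::async(std::launch::async, count_matches,
                                  std::cref(text), std::cref(p)));
    }
    std::vector<std::size_t> counts;
    for (auto& job : jobs) {
        counts.push_back(job.get());  // blocks until that task finishes
    }
    return counts;
}
```

Since each pattern is searched independently, wall-clock time on a multi-core
machine can drop even though the per-pattern matching speed is unchanged,
which is why such entries say little about the regex engine itself.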

John.
