From: Eric Niebler (eric_at_[hidden])
Date: 2005-09-16 18:46:17
Darren Cook wrote:
>>>In short, xpressive comes out consistently ahead of Boost.Regex on
>>>short matches, and roughly on par for longer matches (with wide
> Interesting. This left me with two questions:
> 1. Why is dynamic quicker than static xpressive on some expressions?
It's only that way for gcc. On VC7.1, static xpressive is always faster.
I can only guess that gcc's optimizer is at fault here.
> 2. Why is boost::regex quicker on longer strings? Something to do with
> buffering or dynamic memory usage?
I haven't fully investigated this, but I suspect that for some of those
patterns, Boost.Regex is finding a clever optimization. I have noticed
that if you change the pattern:
then xpressive is considerably faster than Boost.Regex at finding all
matches. Clearly, I need to be testing more patterns to make sure the
results are representative.
> I thought "Huck[[:alpha:]]+" (xpressive twice as quick) vs.
> "[[:alpha:]]+ing" (boost::regex twice as quick) was very curious. Is
> this due to some design decision, or just something waiting to be optimized?
This is a case where xpressive is finding a clever optimization that
Boost.Regex is missing. When a pattern begins with a string literal,
xpressive uses Boyer-Moore. It's a huge win.
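
To illustrate why a leading literal helps so much, here is a minimal sketch of the Horspool simplification of Boyer-Moore, written against the standard library only. It is not xpressive's implementation, and the name horspool_find is made up for this example. The point it shows is that after a failed comparison the search window can slide forward by up to the full literal length instead of one character at a time.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of xpressive or Boost.Regex.
// Returns the index of the first occurrence of `needle` in `haystack`,
// or std::string::npos if there is none.
std::size_t horspool_find(const std::string& haystack, const std::string& needle)
{
    const std::size_t n = haystack.size();
    const std::size_t m = needle.size();
    if (m == 0) return 0;
    if (m > n) return std::string::npos;

    // shift[c] = how far the window may slide when the haystack byte
    // aligned with the last needle position is c. Bytes that do not occur
    // in the needle (ignoring its last character) allow a full jump of m.
    std::array<std::size_t, 256> shift;
    shift.fill(m);
    for (std::size_t i = 0; i + 1 < m; ++i)
        shift[static_cast<unsigned char>(needle[i])] = m - 1 - i;

    std::size_t pos = 0;
    while (pos + m <= n) {
        // Compare the window against the needle from right to left.
        std::size_t j = m;
        while (j > 0 && haystack[pos + j - 1] == needle[j - 1])
            --j;
        if (j == 0) return pos;  // every character matched

        // Slide by the precomputed distance for the window's last byte.
        pos += shift[static_cast<unsigned char>(haystack[pos + m - 1])];
    }
    return std::string::npos;
}
```

A regex engine could use a search like this to locate candidate positions for the literal "Huck" and only then run the rest of the pattern ("[[:alpha:]]+") from each candidate.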
I have no idea why Boost.Regex is faster at matching "[[:alpha:]]+ing".
It's worth looking into.
>>>Agreed. FYI, "_" matches any one character. ~_n matches any character
>>>that is not '\n'. I also need to describe _ln which matches a logical
>>>newline (e.g., "\n" or "\r" or "\r\n" or other line separators) and
>>>~_ln which matches any one character that is not a line separator.
> _ln sounds useful. Is that in Perl/PCRE?
I don't recall where I got that idea. Perhaps from Perl 6.
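
For readers unfamiliar with the idea, a "logical newline" match can be sketched in plain C++ as below. This is only an illustration of the concept, not how xpressive implements _ln. The function names are invented for the example, and it handles only "\n", "\r" and "\r\n", leaving out the other line separators mentioned above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: length of the logical newline starting at s[pos],
// or 0 if s[pos] does not begin one. "\r\n" counts as a single newline.
std::size_t logical_newline_length(const std::string& s, std::size_t pos)
{
    if (pos >= s.size()) return 0;
    if (s[pos] == '\r')
        return (pos + 1 < s.size() && s[pos + 1] == '\n') ? 2 : 1;
    if (s[pos] == '\n') return 1;
    return 0;
}

// Hypothetical helper: split text into lines on logical newlines.
// A trailing newline does not produce an extra empty line.
std::vector<std::string> split_lines(const std::string& text)
{
    std::vector<std::string> lines;
    std::string current;
    std::size_t pos = 0;
    while (pos < text.size()) {
        const std::size_t len = logical_newline_length(text, pos);
        if (len > 0) {
            // Consume the whole separator so "\r\n" ends exactly one line.
            lines.push_back(current);
            current.clear();
            pos += len;
        } else {
            // Any character that is not part of a line separator.
            current += text[pos];
            ++pos;
        }
    }
    if (!current.empty()) lines.push_back(current);
    return lines;
}
```

The else branch corresponds roughly to what ~_ln would accept: one character that does not start a line separator.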
--
Eric Niebler
Boost Consulting
www.boost-consulting.com