
From: Eric Niebler (eric_at_[hidden])
Date: 2005-09-16 18:46:17

Darren Cook wrote:
>>>In short, xpressive comes out consistently ahead of Boost.Regex on
>>>short matches, and roughly on par for longer matches (with wide
> Interesting. This left me with two questions:
> 1. Why is dynamic quicker than static xpressive on some expressions?

It's only that way for gcc. On VC7.1, static xpressive is always faster.
I can only guess that gcc's optimizer is at fault here.

> 2. Why is boost::regex quicker on longer strings? Something to do with
> buffering or dynamic memory usage?

I haven't fully investigated this, but I suspect that for some of those
patterns, Boost.Regex is finding a clever optimization. I have noticed
that if you change the pattern:




then xpressive is considerably faster than Boost.Regex at finding all
matches. Clearly, I need to be testing more patterns to make sure the
results are representative.

> I thought "Huck[[:alpha:]]+" (xpressive twice as quick) vs.
> "[[:alpha:]]+ing" (boost::regex twice as quick) was very curious. Is
> this due to some design decision, or just something waiting to be optimized?

This is a case where xpressive is finding a clever optimization that
Boost.Regex is missing. When a pattern begins with a string literal,
xpressive uses Boyer-Moore. It's a huge win.
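To illustrate why a leading string literal helps, here is a minimal sketch of the general idea, not xpressive's actual implementation: locate the literal with a Boyer-Moore search, then run the rest of the pattern only at those positions. It assumes C++17 and uses std::boyer_moore_searcher from the standard library; the helper name find_literal_then_alpha is hypothetical and hard-codes the "[[:alpha:]]+" tail of the pattern discussed above.

```cpp
#include <algorithm>
#include <cctype>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: returns every non-overlapping match of the pattern
//   literal[[:alpha:]]+
// in `text`. The literal prefix is located with Boyer-Moore, so most of the
// input is skipped without ever running the character-class loop.
std::vector<std::string> find_literal_then_alpha(const std::string& text,
                                                 const std::string& literal)
{
    std::vector<std::string> matches;
    if (literal.empty())
        return matches; // nothing to anchor the search on

    // Preprocess the literal once; the searcher is reused for every search.
    const std::boyer_moore_searcher searcher(literal.begin(), literal.end());

    auto pos = text.begin();
    while (pos != text.end()) {
        // Jump straight to the next occurrence of the literal.
        auto hit = std::search(pos, text.end(), searcher);
        if (hit == text.end())
            break;

        // Match the remainder of the pattern: one or more alphabetic chars.
        auto tail = hit + literal.size();
        auto end = std::find_if_not(tail, text.end(), [](unsigned char c) {
            return std::isalpha(c) != 0;
        });

        if (end != tail) {
            matches.emplace_back(hit, end); // literal plus its alphabetic tail
            pos = end;                      // continue after this match
        } else {
            pos = hit + 1;                  // literal found, tail failed; move on
        }
    }
    return matches;
}
```

Called with the text "Huck Huckleberry Finn and Huckster" and the literal "Huck", this skips the bare "Huck" (no alphabetic character follows it) and returns "Huckleberry" and "Huckster".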

I have no idea why Boost.Regex is faster at matching "[[:alpha:]]+ing".
It's worth looking into.

>>>Agreed. FYI, "_" matches any one character. ~_n matches any character
>>>that is not '\n'. I also need to describe _ln which matches a logical
>>>newline (e.g., "\n" or "\r" or "\r\n" or other line separators) and
>>>~_ln which matches any one character that is not a line separator.
> _ln sounds useful. Is that in perl/PCRE ?

I don't recall where I got that idea. Perhaps from Perl 6.
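
For readers unfamiliar with the idea of a logical newline, the sketch below shows the semantics described above in plain C++, independent of xpressive. It is an illustration under assumptions: only the three separators named in the quoted text ("\n", "\r", "\r\n") are handled, the "other line separators" are not, and both function names are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: if a logical newline starts at text[pos], return its
// length in characters (2 for "\r\n", 1 for a lone "\r" or "\n"); otherwise
// return 0. Other line separators are deliberately not handled here.
std::size_t logical_newline_length(const std::string& text, std::size_t pos)
{
    if (pos >= text.size())
        return 0;
    if (text[pos] == '\r') {
        // Treat "\r\n" as one logical newline rather than two.
        const bool followed_by_lf =
            pos + 1 < text.size() && text[pos + 1] == '\n';
        return followed_by_lf ? 2 : 1;
    }
    if (text[pos] == '\n')
        return 1;
    return 0;
}

// Hypothetical helper: count the logical newlines in `text`.
std::size_t count_logical_newlines(const std::string& text)
{
    std::size_t count = 0;
    std::size_t pos = 0;
    while (pos < text.size()) {
        const std::size_t len = logical_newline_length(text, pos);
        if (len != 0) {
            ++count;
            pos += len; // consume the whole separator
        } else {
            ++pos;      // ordinary character, i.e. what ~_ln would match
        }
    }
    return count;
}
```

Any position where logical_newline_length returns 0 holds a character that is not part of a line separator, which corresponds to what ~_ln is described as matching.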

Eric Niebler
Boost Consulting
