From: Eric Niebler (eric_at_[hidden])
Date: 2005-09-18 12:47:29

John Maddock wrote:
>>I wonder why your results are so different from mine. Please post the
>>code for your modified test so I can run it locally. Thanks.
> It's in cvs now under the usual libs/regex/performance path: there are still
> some problems with the html output that I don't understand yet, but I
> haven't had a chance to look at those.
> As for the results being different from yours, I have noticed that the
> results can differ quite a bit from run to run, particularly the ftp
> response expression "^([0-9]+)(\-| |$)(.*)$" I've seen either Boost.Regex or
> xpressive win out by some margin depending on the machine setup.

Right off, I spotted a couple of problems with your performance test:

1) The xpressive functions all have try/catch blocks in them, but the
boost functions do not.

2) You are not passing the "optimize" syntax_option_type flag to
xpressive's regex constructor.

3) People who care about performance *will* take the time to rewrite
their patterns as static regexes, so a perf test that excludes static
xpressive is less interesting.
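
To make points 1 and 2 concrete, here is a minimal sketch of a harness that treats every engine the same way. It is not the code under libs/regex/performance. It uses std::regex as a stand-in so that it builds without Boost, and the names seconds_per_call and time_match are hypothetical. The pattern is compiled once with the optimize flag, and the only try/catch sits around compilation, outside the timed loop, so no engine pays for exception handling that another one skips.

```cpp
#include <chrono>
#include <regex>
#include <string>

// Hypothetical helper: run fn() `iterations` times and return the mean
// wall-clock seconds per call.
template <class Fn>
double seconds_per_call(Fn fn, int iterations) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < iterations; ++i) {
        fn();
    }
    const std::chrono::duration<double> elapsed = clock::now() - start;
    return elapsed.count() / iterations;
}

// Hypothetical helper: time a full match of `text` against `pattern`.
// Returns -1.0 if the pattern does not compile, mirroring the "-1s"
// convention in the test output. `matched` reports whether the text matched.
double time_match(const std::string& pattern, const std::string& text,
                  int iterations, bool& matched) {
    matched = false;
    std::regex re;
    try {
        // Point 2: ask for an optimized regex, since it is compiled once
        // and then matched many times.
        re.assign(pattern, std::regex::ECMAScript | std::regex::optimize);
    } catch (const std::regex_error&) {
        // Point 1: exception handling happens here, once per pattern,
        // and never inside the timed loop below.
        return -1.0;
    }
    std::smatch what;
    matched = std::regex_match(text, what, re);
    return seconds_per_call([&] { std::regex_match(text, what, re); },
                            iterations);
}
```

With xpressive the same shape would apply, passing its own optimize flag when the regex is compiled. Calling time_match with one of the postcode patterns and "EH10 2QQ" gives a per-call time comparable in kind (not in value) to the figures in the test output.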

I fixed the first two problems and took the liberty of committing my
changes. (In retrospect, a patch would have been the polite thing to do.
Sorry.) I'll work on adding a test for static xpressive, too.

After fixing these problems, the numbers for the short matches come out
as I expected: dynamic xpressive is ahead by as much as 2x or more. The
HTML search surprises me a bit -- xpressive does poorly. It could be
related to the fact that this is a case-insensitive search. It's
possible I have a bug, or it's possible that the silly things I had to
do to make Boyer-Moore work with the regex traits interface make
Boyer-Moore more trouble than it's worth for case-insensitive matches. I
haven't yet run the other tests.

Testing: "abc" against "abc"
         Boost regex (C++ locale): 3.6478e-007s
         xpressive regex: 1.49012e-007s
Testing: "^([0-9]+)(\-| |$)(.*)$" against "100- this is a line of ftp
response which contains a message string"
         Boost regex (C++ locale): 7.59125e-007s
         xpressive regex: 4.46796e-007s
Testing: "([[:digit:]]{4}[- ]){3}[[:digit:]]{3,4}" against
         Boost regex (C++ locale): 1.13106e-006s
         xpressive regex: 7.00951e-007s
against "john_at_[hidden]"
         Boost regex (C++ locale): 1.75667e-006s
         xpressive regex: 1.34087e-006s
against "foo12_at_[hidden]"
         Boost regex (C++ locale): 1.48964e-006s
         xpressive regex: 1.19209e-006s
against "bob.smith_at_[hidden]"
         Boost regex (C++ locale): 1.52016e-006s
         xpressive regex: 1.16158e-006s
Testing: "^[a-zA-Z]{1,2}[0-9][0-9A-Za-z]{0,1} {0,1}[0-9][A-Za-z]{2}$"
against "EH10 2QQ"
         Boost regex (C++ locale): 5.96046e-007s
         xpressive regex: 3.20435e-007s
Testing: "^[a-zA-Z]{1,2}[0-9][0-9A-Za-z]{0,1} {0,1}[0-9][A-Za-z]{2}$"
against "G1 1AA"
         Boost regex (C++ locale): 5.80788e-007s
         xpressive regex: 3.20435e-007s
Testing: "^[a-zA-Z]{1,2}[0-9][0-9A-Za-z]{0,1} {0,1}[0-9][A-Za-z]{2}$"
against "SW1 1ZZ"
         Boost regex (C++ locale): 5.96046e-007s
         xpressive regex: 3.35217e-007s
Testing: "^[[:digit:]]{1,2}/[[:digit:]]{1,2}/[[:digit:]]{4}$" against
         Boost regex (C++ locale): 5.36919e-007s
         xpressive regex: 3.12805e-007s
Testing: "^[[:digit:]]{1,2}/[[:digit:]]{1,2}/[[:digit:]]{4}$" against
         Boost regex (C++ locale): 5.51224e-007s
         xpressive regex: 3.12805e-007s
Testing: "^[-+]?[[:digit:]]*\.?[[:digit:]]*$" against "123"
         Boost regex (C++ locale): 5.65529e-007s
         xpressive regex: 2.98023e-007s
Testing: "^[-+]?[[:digit:]]*\.?[[:digit:]]*$" against "+3.14159"
         Boost regex (C++ locale): 5.96046e-007s
         xpressive regex: 3.49998e-007s
Testing: "^[-+]?[[:digit:]]*\.?[[:digit:]]*$" against "-3.14159"
         Boost regex (C++ locale): 5.96046e-007s
         xpressive regex: 3.42369e-007s
         Boost regex (C++ locale): 0.00012207s
         xpressive regex: 0.000217529s
Testing: (^[
         Boost regex (C++ locale): 0.00426563s
Exception: mismatched parenthesis
         xpressive regex: -1s
Testing: ^[ ]*#[ ]*include[ ]+("[^"]+"|<[^>]+>)
         Boost regex (C++ locale): 0.000183105s
         xpressive regex: 0.000213623s
Testing: ^[ ]*#[ ]*include[ ]+("boost/[^"]+"|<boost/[^>]+>)
         Boost regex (C++ locale): 0.000183105s
         xpressive regex: 0.000213623s
Testing: beman|john|dave
         Boost regex (C++ locale): 0.000251465s
         xpressive regex: 0.0003125s
Testing: <p>.*?</p>
         Boost regex (C++ locale): 0.00019458s
         xpressive regex: 0.00074707s
Testing: <a[^>]+href=("[^"]*"|[^[:space:]]+)[^>]*>
         Boost regex (C++ locale): 0.000716797s
         xpressive regex: 0.00167773s
Testing: <h[12345678][^>]*>.*?</h[12345678]>
         Boost regex (C++ locale): 0.000202148s
         xpressive regex: 0.00103711s
Testing: <img[^>]+src=("[^"]*"|[^[:space:]]+)[^>]*>
         Boost regex (C++ locale): 0.000206055s
         xpressive regex: 0.000533203s
Testing: <font[^>]+face=("[^"]*"|[^[:space:]]+)[^>]*>.*?</font>
         Boost regex (C++ locale): 0.000213623s
         xpressive regex: 0.000465332s
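
To put a few of these figures in relative terms, the sketch below divides the Boost.Regex time by the xpressive time for three of the tests. The timings are copied from the output above. The Timing struct and the function names are hypothetical and exist only for this illustration.

```cpp
#include <cstdio>

// One row of the output above: seconds per call for each engine.
struct Timing {
    const char* test;
    double boost_regex;
    double xpressive;
};

// Values copied from the test output; nothing here is newly measured.
const Timing timings[] = {
    {"\"abc\" against \"abc\"", 3.6478e-7, 1.49012e-7},
    {"<p>.*?</p>", 0.00019458, 0.00074707},
    {"<h[12345678][^>]*>.*?</h[12345678]>", 0.000202148, 0.00103711},
};

// Greater than 1 means xpressive was faster; less than 1 means slower.
double xpressive_speedup(const Timing& t) {
    return t.boost_regex / t.xpressive;
}

// Print one line per test with the ratio to two decimal places.
void print_speedups() {
    for (const Timing& t : timings) {
        std::printf("%-40s %.2fx\n", t.test, xpressive_speedup(t));
    }
}
```

For the short "abc" match the ratio is about 2.4 in xpressive's favor. For the heading search it is below 0.2, that is, xpressive takes roughly five times as long as Boost.Regex.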

Eric Niebler
Boost Consulting
