From: Eric Niebler (eric_at_[hidden])
Date: 2005-09-18 12:47:29


John Maddock wrote:
>>I wonder why your results are so different from mine. Please post the
>>code for your modified test so I can run it locally. Thanks.
>
>
> It's in cvs now under the usual libs/regex/performance path: there are still
> some problems with the html output that I don't understand yet, but I
> haven't had a chance to look at those.
>
> As for the results being different from yours, I have noticed that the
> results can differ quite a bit from run to run, particularly the ftp
> response expression "^([0-9]+)(\-| |$)(.*)$" I've seen either Boost.Regex or
> xpressive win out by some margin depending on the machine setup.

Right off, I spotted a couple of problems with your performance test:

1) The xpressive functions all have try/catch blocks in them, but the
boost functions do not.

2) You are not passing the "optimize" syntax_option_type flag to
xpressive's regex constructor.

3) People who care about performance *will* take the time to rewrite
their patterns as static regexes, so a perf test that excludes static
xpressive is less interesting.
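
To illustrate points 1 and 2, here is a minimal sketch of a symmetric
timing harness. It is not the code in libs/regex/performance. It uses
std::regex as a stand-in (which has a flag of the same name, "optimize"),
and the names seconds_per_match and time_std_regex are hypothetical. The
idea is that every engine is timed through the same function, so the
try/catch overhead is identical for all of them, and the optimize flag is
an explicit choice at construction. A failure is reported as -1, like the
"-1s" entry in the results further down.

```cpp
#include <cassert>
#include <chrono>
#include <exception>
#include <iostream>
#include <regex>
#include <string>

// Average seconds per call of `match`, or -1 if it throws or fails to match.
// Every engine under test is timed through this one function, so the cost of
// the try/catch block is paid equally by all of them.
template <typename MatchFn>
double seconds_per_match(MatchFn match, int iterations)
{
    assert(iterations > 0);
    using clock = std::chrono::steady_clock;
    const clock::time_point start = clock::now();
    try {
        for (int i = 0; i < iterations; ++i) {
            if (!match()) {
                return -1.0;  // the pattern was expected to match
            }
        }
    } catch (const std::exception& e) {
        std::cerr << "Exception: " << e.what() << '\n';
        return -1.0;
    }
    const std::chrono::duration<double> elapsed = clock::now() - start;
    return elapsed.count() / iterations;
}

// Hypothetical helper: time std::regex_match for one pattern/text pair,
// optionally requesting std::regex::optimize at construction.
double time_std_regex(const std::string& pattern, const std::string& text,
                      bool optimize, int iterations)
{
    std::regex::flag_type flags = std::regex::ECMAScript;
    if (optimize) {
        flags |= std::regex::optimize;
    }
    try {
        const std::regex re(pattern, flags);  // compile outside the timed loop
        std::smatch what;
        return seconds_per_match(
            [&] { return std::regex_match(text, what, re); }, iterations);
    } catch (const std::regex_error& e) {
        std::cerr << "Exception: " << e.what() << '\n';
        return -1.0;
    }
}
```

Calling time_std_regex twice with the same pattern and text, once with
optimize set to true and once to false, gives two numbers measured under
the same conditions.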

I fixed the first two problems and took the liberty of committing my
changes. (In retrospect, a patch would have been the polite thing to do.
Sorry.) I'll work on adding a test for static xpressive, too.

After fixing these problems, the numbers for the short matches come out
as I expected: dynamic xpressive is ahead, by 2x or more in the best
case. The HTML search surprises me a bit -- xpressive does poorly. It
could be related to the fact that this is a case-insensitive search.
It's possible I have a bug, or it's possible that the silly things I had
to do to make Boyer-Moore work with the regex traits interface make
Boyer-Moore more trouble than it's worth for case-insensitive matches. I
haven't yet run the other tests.
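
For readers wondering where the case-insensitive cost could come from,
here is a minimal sketch of a Boyer-Moore-Horspool search with optional
case folding. This is not xpressive's implementation, which goes through
the regex traits interface. The names fold and horspool_find are
hypothetical, and std::tolower stands in for a traits-based translation.
It shows only that, in the case-insensitive mode, every character
comparison and every shift-table lookup pays for an extra translation
step.

```cpp
#include <array>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Fold one character to lower case. Assumption: this stands in for whatever
// per-character translation a regex traits class would perform.
inline unsigned char fold(char c)
{
    return static_cast<unsigned char>(
        std::tolower(static_cast<unsigned char>(c)));
}

// Boyer-Moore-Horspool search for `pat` in `text`. Returns the offset of the
// first match or std::string::npos. When `icase` is true, every character is
// folded before it is compared or used as a shift-table index.
std::size_t horspool_find(const std::string& text, const std::string& pat,
                          bool icase)
{
    const std::size_t n = text.size();
    const std::size_t m = pat.size();
    if (m == 0) return 0;
    if (m > n) return std::string::npos;

    auto xlat = [icase](char c) -> unsigned char {
        return icase ? fold(c) : static_cast<unsigned char>(c);
    };

    // Shift table: distance from the last occurrence of each character in
    // pat[0 .. m-2] to the end of the pattern; m for characters not present.
    std::array<std::size_t, 256> shift;
    shift.fill(m);
    for (std::size_t i = 0; i + 1 < m; ++i) {
        shift[xlat(pat[i])] = m - 1 - i;
    }

    std::size_t pos = 0;
    while (pos + m <= n) {
        // Compare right to left, translating both sides on every step.
        std::size_t j = m;
        while (j > 0 && xlat(text[pos + j - 1]) == xlat(pat[j - 1])) {
            --j;
        }
        if (j == 0) return pos;
        const std::size_t step = shift[xlat(text[pos + m - 1])];
        assert(step > 0);  // Horspool always advances by at least one
        pos += step;
    }
    return std::string::npos;
}
```

With icase set to false the xlat call is a plain cast. With icase set to
true it is a function call per character, which is the kind of overhead
that can eat into the advantage of skipping ahead.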

Testing: "abc" against "abc"
         Boost regex (C++ locale): 3.6478e-007s
         xpressive regex: 1.49012e-007s
Testing: "^([0-9]+)(\-| |$)(.*)$" against "100- this is a line of ftp
response which contains a message string"
         Boost regex (C++ locale): 7.59125e-007s
         xpressive regex: 4.46796e-007s
Testing: "([[:digit:]]{4}[- ]){3}[[:digit:]]{3,4}" against
"1234-5678-1234-456"
         Boost regex (C++ locale): 1.13106e-006s
         xpressive regex: 7.00951e-007s
Testing:
"^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$"
against "john_at_[hidden]"
         Boost regex (C++ locale): 1.75667e-006s
         xpressive regex: 1.34087e-006s
Testing:
"^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$"
against "foo12_at_[hidden]"
         Boost regex (C++ locale): 1.48964e-006s
         xpressive regex: 1.19209e-006s
Testing:
"^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$"
against "bob.smith_at_[hidden]"
         Boost regex (C++ locale): 1.52016e-006s
         xpressive regex: 1.16158e-006s
Testing: "^[a-zA-Z]{1,2}[0-9][0-9A-Za-z]{0,1} {0,1}[0-9][A-Za-z]{2}$"
against "EH10 2QQ"
         Boost regex (C++ locale): 5.96046e-007s
         xpressive regex: 3.20435e-007s
Testing: "^[a-zA-Z]{1,2}[0-9][0-9A-Za-z]{0,1} {0,1}[0-9][A-Za-z]{2}$"
against "G1 1AA"
         Boost regex (C++ locale): 5.80788e-007s
         xpressive regex: 3.20435e-007s
Testing: "^[a-zA-Z]{1,2}[0-9][0-9A-Za-z]{0,1} {0,1}[0-9][A-Za-z]{2}$"
against "SW1 1ZZ"
         Boost regex (C++ locale): 5.96046e-007s
         xpressive regex: 3.35217e-007s
Testing: "^[[:digit:]]{1,2}/[[:digit:]]{1,2}/[[:digit:]]{4}$" against
"4/1/2001"
         Boost regex (C++ locale): 5.36919e-007s
         xpressive regex: 3.12805e-007s
Testing: "^[[:digit:]]{1,2}/[[:digit:]]{1,2}/[[:digit:]]{4}$" against
"12/12/2001"
         Boost regex (C++ locale): 5.51224e-007s
         xpressive regex: 3.12805e-007s
Testing: "^[-+]?[[:digit:]]*\.?[[:digit:]]*$" against "123"
         Boost regex (C++ locale): 5.65529e-007s
         xpressive regex: 2.98023e-007s
Testing: "^[-+]?[[:digit:]]*\.?[[:digit:]]*$" against "+3.14159"
         Boost regex (C++ locale): 5.96046e-007s
         xpressive regex: 3.49998e-007s
Testing: "^[-+]?[[:digit:]]*\.?[[:digit:]]*$" against "-3.14159"
         Boost regex (C++ locale): 5.96046e-007s
         xpressive regex: 3.42369e-007s
Testing:
^(template[[:space:]]*<[^;:{]+>[[:space:]]*)?(class|struct)[[:space:]]*(\<\w+\>([
      ]*\(
[^)]*\))?[[:space:]]*)*(\<\w*\>)[[:space:]]*(<[^;:{]+>[[:space:]]*)?(\{|:[^;\{()]*\{)
         Boost regex (C++ locale): 0.00012207s
         xpressive regex: 0.000217529s
Testing: (^[
]*#(?:[^\\\n]|\\[^\n_[:punct:][:alnum:]]*[\n[:punct:][:word:]])*)|(//[^\n]*|/\*.*?\*/)|\<([+-]?(?:(?:0x[[:xdigit:]]+)|(?:(?:[[:digit:]]*\.)?[[:digit:]]+(?:[eE][+-]?[[:digit:]]+)?))u?(?:(?:int(?:8|16|32|64))|L)?)\>|('(?:[^\\']|\\.)*'|"(?:[^\\"]|\\.)*")|\<(__asm|__cdecl|__declspec|__export|__far16|__fastcall|__fortran|__import|__pascal|__rtti|__stdcall|_asm|_cdecl|__except|_export|_far16|_fastcall|__finally|_fortran|_import|_pascal|_stdcall|__thread|__try|asm|auto|bool|break|case|catch|cdecl|char|class|const|const_cast|continue|default|delete|do|double|dynamic_cast|else|enum|explicit|extern|false|float|for|friend|goto|if|inline|int|long|mutable|namespace|new|operator|pascal|private|protected|public|register|reinterpret_cast|return|short|signed|sizeof|static|static_cast|struct|switch|template|this|throw|true|try|typedef|typeid|typename|union|unsigned|using|virtual|void|volatile|wchar_t|while)\>
         Boost regex (C++ locale): 0.00426563s
Exception: mismatched parenthesis
         xpressive regex: -1s
Testing: ^[ ]*#[ ]*include[ ]+("[^"]+"|<[^>]+>)
         Boost regex (C++ locale): 0.000183105s
         xpressive regex: 0.000213623s
Testing: ^[ ]*#[ ]*include[ ]+("boost/[^"]+"|<boost/[^>]+>)
         Boost regex (C++ locale): 0.000183105s
         xpressive regex: 0.000213623s
Testing: beman|john|dave
         Boost regex (C++ locale): 0.000251465s
         xpressive regex: 0.0003125s
Testing: <p>.*?</p>
         Boost regex (C++ locale): 0.00019458s
         xpressive regex: 0.00074707s
Testing: <a[^>]+href=("[^"]*"|[^[:space:]]+)[^>]*>
         Boost regex (C++ locale): 0.000716797s
         xpressive regex: 0.00167773s
Testing: <h[12345678][^>]*>.*?</h[12345678]>
         Boost regex (C++ locale): 0.000202148s
         xpressive regex: 0.00103711s
Testing: <img[^>]+src=("[^"]*"|[^[:space:]]+)[^>]*>
         Boost regex (C++ locale): 0.000206055s
         xpressive regex: 0.000533203s
Testing: <font[^>]+face=("[^"]*"|[^[:space:]]+)[^>]*>.*?</font>
         Boost regex (C++ locale): 0.000213623s
         xpressive regex: 0.000465332s

-- 
Eric Niebler
Boost Consulting
www.boost-consulting.com
