Re: [Boost-bugs] [Boost C++ Libraries] #8392: Too complex regex for boost::regex (perl, extended) isn't so complex

Subject: Re: [Boost-bugs] [Boost C++ Libraries] #8392: Too complex regex for boost::regex (perl, extended) isn't so complex
From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2013-04-20 16:48:45


#8392: Too complex regex for boost::regex (perl, extended) isn't so complex
-------------------------------+--------------------------------------------
  Reporter: bkalinczuk@… | Owner: johnmaddock
      Type: Bugs | Status: closed
 Milestone: To Be Determined | Component: regex
   Version: Boost 1.52.0 | Severity: Problem
Resolution: wontfix | Keywords:
-------------------------------+--------------------------------------------
Changes (by johnmaddock):

  * status: new => closed
  * resolution: => wontfix

Comment:

 This is not nearly as clear cut as you think. Both Boost.Regex and Perl
 are backtracking NFAs, and if I make the string being matched slightly
 longer, Perl takes several minutes to figure out that the string can't
 be matched:

 {{{


 if
 ("bbbbbbbccccccccccccccccccccccccbbbbbbbbbbbbbbbbcccccccccccccccccccccbbbbbbbbaaaaa"
 =~ m/[a-e]+[b-f]+[ac-f]+[abd-f]+[a-cef]+[a-df]+$/) {

     print "MATCHED\n";

 } else {

     print "NOT MATCHED\n";

 }

 if
 ("bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbcccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbaaaaae"
 =~ m/[a-e]+[b-f]+[ac-f]+[abd-f]+[a-cef]+[a-df]+$/) {

     print "MATCHED\n";

 } else {

     print "NOT MATCHED\n";

 }

 }}}

 Make the string being matched longer still and Perl will go into an
 effectively infinite loop.
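
 To make the growth concrete, here is a rough sketch of a naive
 backtracking matcher for the same six character classes, written in
 Python purely for illustration. It is not how Perl or Boost.Regex are
 implemented; the names `CLASSES`, `TAIL` and `backtrack_match` are made
 up for this example, and the subject is a scaled-down string in the
 spirit of the ones above. It only counts how many states are visited
 before the match is rejected.

```python
# Character classes of [a-e]+[b-f]+[ac-f]+[abd-f]+[a-cef]+[a-df]+$
CLASSES = [set("abcde"), set("bcdef"), set("acdef"),
           set("abdef"), set("abcef"), set("abcdf")]

# Hypothetical, scaled-down subject tail; the final 'e' cannot be matched
# by the last class [a-df], so every candidate split must be rejected.
TAIL = "cccbbbcccbbbaaaaae"


def backtrack_match(classes, subject):
    """Naive backtracking search for class1+ class2+ ... anchored at the end.

    Returns (matched, steps), where steps counts visited (class, position)
    states across all start positions.
    """
    steps = 0
    n = len(subject)

    def match_from(k, pos):
        nonlocal steps
        steps += 1
        if k == len(classes):
            return pos == n          # the trailing '$'
        # Greedy: consume the longest run allowed by class k ...
        end = pos
        while end < n and subject[end] in classes[k]:
            end += 1
        # ... then give characters back one at a time (at least one is kept).
        for stop in range(end, pos, -1):
            if match_from(k + 1, stop):
                return True
        return False

    for start in range(n):           # unanchored at the start
        if match_from(0, start):
            return True, steps
    return False, steps


if __name__ == "__main__":
    for n in (2, 4, 8):
        matched, steps = backtrack_match(CLASSES, "b" * n + TAIL)
        print(f"prefix of {n} b's: matched={matched}, states visited={steps}")
```

 Each extra character in front adds another start position whose whole
 search tree has to be exhausted, which is why a failing match gets
 expensive so quickly as the subject grows.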

 Boost.Regex takes the view that these pathological cases should be caught
 as soon as possible, and that's what you're seeing here. It's true that
 this behavior might not suit everyone all of the time, but it's also
 safer, and it helps prevent deliberately "bad" regexes from being used in
 denial-of-service (DoS) attacks and the like.
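
 As an illustration of the "catch it early" policy, the sketch below adds
 a step budget to a naive backtracking matcher and raises an error once
 the budget is exhausted. This is only an assumption-laden model in
 Python: the real complexity check inside Boost.Regex is internal and is
 not reproduced here, and `TooComplex`, `bounded_match` and `max_steps`
 are hypothetical names chosen for the example.

```python
# Character classes of [a-e]+[b-f]+[ac-f]+[abd-f]+[a-cef]+[a-df]+$
CLASSES = [set("abcde"), set("bcdef"), set("acdef"),
           set("abdef"), set("abcef"), set("abcdf")]


class TooComplex(RuntimeError):
    """Raised when the search exceeds its step budget (hypothetical name)."""


def bounded_match(classes, subject, max_steps):
    """Backtracking match that gives up after max_steps visited states."""
    steps = 0
    n = len(subject)

    def match_from(k, pos):
        nonlocal steps
        steps += 1
        if steps > max_steps:
            # Bail out instead of grinding through the whole search space.
            raise TooComplex(f"gave up after {max_steps} states")
        if k == len(classes):
            return pos == n          # the trailing '$'
        end = pos
        while end < n and subject[end] in classes[k]:
            end += 1
        for stop in range(end, pos, -1):
            if match_from(k + 1, stop):
                return True
        return False

    return any(match_from(0, start) for start in range(n))


if __name__ == "__main__":
    # A short subject that matches stays well inside the budget.
    print(bounded_match(CLASSES, "bcbcbaaaaa", 10_000))

    # A longer subject that cannot match trips the budget early.
    hostile = "b" * 30 + "c" * 30 + "b" * 30 + "aaaaae"
    try:
        bounded_match(CLASSES, hostile, 1_000)
    except TooComplex as exc:
        print("rejected:", exc)
```

 The trade-off is the one described above: a budget like this will
 occasionally reject an expression that would have finished given enough
 time, in exchange for a bounded worst case on hostile input.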

-- 
Ticket URL: <https://svn.boost.org/trac/boost/ticket/8392#comment:2>
Boost C++ Libraries <http://www.boost.org/>
Boost provides free peer-reviewed portable C++ source libraries.
