Subject: [Boost-users] Boost Regex Back Reference Issue
From: Nick (nospam_at_[hidden])
Date: 2017-07-13 14:03:45


I have a question about the Boost documentation for back-references
(http://www.boost.org/doc/libs/1_64_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html#boost_regex.syntax.perl_syntax.back_references),
which seems to have been the same from at least 1.37 through 1.64.

Quoting a piece of the documentation:

-----

For example the expression:

   ^(a*).*\1$

Will match the string:

   aaabbaaa

But not the string:

   aaabba

-----

I'm finding two issues with the example cited.

1. It seems to me that the string given as a non-match should actually
match. Shouldn't the engine backtrack until the marked sub-expression
captures only the first 'a', so that the back-reference can be
satisfied by the final 'a'?
I tested this example with Oniguruma and with PHP's PCRE, and both
matched the string that the documentation says does not match.

But also, since the marked sub-expression is (a*), I wonder what
happens when it cannot match any 'a' at all, since '*' allows zero
repetitions. In fact, everything in the pattern seems to be effectively
"optional" because of the '*' operators.

I'm a novice with Perl but, unless I made a mistake, the pattern
matches every string I tried:

print "It matches\n" if "aaabba" =~ /^(a*).*\1$/;
It matches
print "It matches\n" if "aaabbac" =~ /^(a*).*\1$/;
It matches
print "It matches\n" if "" =~ /^(a*).*\1$/;
It matches
print "It matches\n" if "x" =~ /^(a*).*\1$/;
It matches
print "It matches\n" if "xyz" =~ /^(a*).*\1$/;
It matches
print "It matches\n" if "123" =~ /^(a*).*\1$/;
It matches
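
To see why every string matches, it helps to print what the marked
sub-expression actually captures. The following is only a minimal sketch
in Python, whose `re` module is a backtracking engine that I am assuming
behaves like Perl/PCRE for this pattern; it says nothing about what
Boost.Regex itself does. The helper `captured` and the second pattern
`STRICT` are my own hypothetical additions, not taken from the
documentation. `STRICT` adds a negative lookahead so that the group
cannot give back leading 'a' characters, which is one possible way to
get the behaviour the documentation describes.

```python
import re

# The pattern from the documentation example, unchanged.
PATTERN = re.compile(r"^(a*).*\1$")

# Hypothetical variant: (?!a) forbids an 'a' right after the group, so the
# group must take every leading 'a' and cannot backtrack to a shorter capture.
STRICT = re.compile(r"^(a*)(?!a).*\1$")


def captured(subject, pattern=PATTERN):
    """Return what group 1 captured, or None if the pattern does not match."""
    m = pattern.search(subject)
    return m.group(1) if m else None


if __name__ == "__main__":
    for subject in ["aaabbaaa", "aaabba", "aaabbac", "", "x", "xyz", "123"]:
        print(repr(subject),
              "original:", repr(captured(subject)),
              "strict:", repr(captured(subject, STRICT)))
```

With the original pattern, "aaabbaaa" captures 'aaa', "aaabba" captures
only 'a' after backtracking, and the remaining strings match with an
empty capture, so the back-reference is trivially satisfied. With the
lookahead variant, "aaabba" and "aaabbac" no longer match.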

2. The other issue is that when I try this example with the string
that is documented not to match, the Boost regex engine runs for a
while and ultimately crashes with a memory error (it looks like it
might be an endless loop of some sort). Is that a bug?

Nick

