Subject: Re: [Boost-users] Boost-regex: Weird behaviour with non-greedy matching operator in regex_replace in boost 1.40?
From: Florian Schwarz (florian.schwarz_at_[hidden])
Date: 2009-09-24 04:02:14


Somehow I just don't get it.
When I match "hallo" with "(.*?)o?" and "xhallo" with "x(.*?)o?", I
expect $1 to be the same in both cases. But it is not: in the former
the result is "hall", while in the latter it's "hallo", which seems
weird to me...

Best regards
Florian

John Maddock wrote:
>> I have the following questions:
>> - why does test 1 match the expected "hall" while test 2 matches "hallo"
>> - why does test 1 match the whole string while test 4 matches only a
>> part of it.
>
> Because that's the way that Perl regexes work: if you have the
> expression (.*?)o? then for preference the .*? part will match *no
> characters at all*, so basically your expression either matches no
> characters, or one character if the next character is an "o". So
> since you're doing a search and replace, the effect is:
>
> * If the next character is not an "o", match a zero length string and
> output a null string (the contents of $1).
> * Since the last match was against a zero length string, then skip
> to the next character.
> * Otherwise if the next character is an "o", match it and output $1 -
> again this is an empty string.
> * Move to the end of the string matched.
> * Find the next match and output all unmatched text (everything from
> the end of the last match to the start of this one).
> * Repeat.
>
> So in effect we end up deleting all the letter "o"s.
>
> Or at least I think that's what's going on here after a very brief
> look ;-)
>
> HTH, John.

