Subject: Re: [Boost-users] Boost-regex: Weird behaviour with non-greedy matching operator in regex_replace in boost 1.40?
From: Florian Schwarz (florian.schwarz_at_[hidden])
Date: 2009-09-24 05:29:37


Thanks a lot for the detailed response. I was so sure that regex_replace
would by default replace only the first occurrence, and not all of them,
that I didn't look into the documentation; therefore I didn't understand
your first explanation :-(
So now I see my mistake...

Best regards
Florian

John Maddock wrote:
>> Somehow I just don't get it.
>> When I match "hallo" with "(.*?)o?" and "xhallo" with "x(.*?)o?", I
>> expect that $1 will in both cases be the same. But this is not the case.
>> In the former the result is "hall" while in the latter it's "hallo", which
>> seems weird to me...
>
> No, that's not what's happening. Remember, the .*? part is non-greedy
> and will match as few characters as possible (zero if possible) that
> still result in an overall match. Consider the program below that
> enumerates all the possible matches in the string - this is what
> regex_replace basically does internally - but in this case you get to
> see all the individual matches, output is as follows:
>
> Enumerating all the matches of "(.*?)o?" in the text "Hallo"
> $0 = "" $1 = "" Position = 0
> $0 = "H" $1 = "H" Position = 0
> $0 = "" $1 = "" Position = 1
> $0 = "a" $1 = "a" Position = 1
> $0 = "" $1 = "" Position = 2
> $0 = "l" $1 = "l" Position = 2
> $0 = "" $1 = "" Position = 3
> $0 = "lo" $1 = "l" Position = 3
> $0 = "" $1 = "" Position = 5
>
> Enumerating all the matches of "x(.*?)o?" in the text "xHallo"
> $0 = "x" $1 = "" Position = 0
>
> So in this latter case there is only one match found, and in the case
> of regex_replace the unmatched part (all of "Hallo") gets output
> unchanged.
>
> Here's the example program:
>
> #include <boost/regex.hpp>
> #include <iostream>
> #include <string>
>
> int main ( int argc, char** argv )
> {
>    std::string input = "xHallo";
>    boost::regex test ( "x(.*?)o?" );
>    // Walk over every match in the input, one after another.
>    boost::sregex_iterator it ( input.begin(), input.end (), test);
>    boost::sregex_iterator none;   // default-constructed: end of sequence
>
>    std::cout << "Enumerating all the matches of \"" << test.str()
>              << "\" in the text \"" << input << "\"" << std::endl;
>
>    while ( it != none )
>    {
>       // $0 is the whole match, $1 the first capture group.
>       std::cout << "$0 = \"" << it->str(0) << "\" $1 = \"" << it->str(1)
>                 << "\" Position = " << it->position() << std::endl;
>       ++it;
>    }
>    return 0;
> }
>
> HTH, John.
