Boost Users :

Subject: Re: [Boost-users] [regex] why partial match / early break ?
From: John Maddock (boost.regex_at_[hidden])
Date: 2011-11-01 14:15:04


> why is this regex
> const string sRe =
> "((([a-zA-Z]|([a-zA-Z][a-zA-Z0-9\\-]))+[a-zA-Z0-9])\\.)+"
> "((([a-zA-Z]|([a-zA-Z][a-zA-Z0-9\\-]))+[a-zA-Z0-9]))";
>
> not matching this string wholly?
> "a1a.a2a.a3a.a4aaaa"
>
> It rather matches only this part:
> "a1a.a2a.a3a.a4"
>
> What's the problem here?
> (I know I can append a delimiter to solve the problem,
> but I think the regex should've matched it wholly, shouldn't it?)

No, in Perl mode, early alternatives are preferred to later ones, so matching:

a|aa

against:

aaaa

will only match the first "a".

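This is what's happening in your expression: in the last label, the
single-character alternative [a-zA-Z] matches the "a", the repeat cannot
continue at "4", and the trailing [a-zA-Z0-9] then consumes the "4". At that
point the whole expression has matched, so nothing forces the matcher to go
back and try the two-character alternative.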
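
A small sketch of the same effect, in case it helps to see it run. Note the assumptions: it uses std::regex with its default ECMAScript grammar as a stand-in for Boost.Regex in Perl mode (both try alternatives left to right and keep the first one that leads to a match), and the helper names first_match and matches_whole are made up for this illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the text of the first match of `pattern`
// found anywhere in `input`, or an empty string if nothing matches.
// Assumption: std::regex (ECMAScript grammar) stands in for Boost.Regex
// in Perl mode; both prefer earlier alternatives over later ones.
std::string first_match(const std::string& pattern, const std::string& input) {
    const std::regex re(pattern);
    std::smatch m;
    return std::regex_search(input, m, re) ? m.str(0) : std::string();
}

// Hypothetical helper: true only if `pattern` matches all of `input`.
// Requiring the match to reach the end of the input forces the matcher
// to backtrack into later alternatives when an earlier one stops short.
bool matches_whole(const std::string& pattern, const std::string& input) {
    return std::regex_match(input, std::regex(pattern));
}
```

With these, first_match("a|aa", "aaaa") returns "a", while first_match("aa|a", "aaaa") returns "aa", so the order of the alternatives decides how much is consumed. matches_whole("a|aa", "aa") is true, because the first alternative alone cannot reach the end of the input and the matcher falls back to the second one. That is also why appending a delimiter makes the original expression consume the whole string.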

HTH, John.

