
From: John Maddock (John_Maddock_at_[hidden])
Date: 2001-04-05 06:15:53


>In the regular expression ([A-Za-z]*)(0*)([0-9]*), the
>parenthesis are used only to mark what generated the
>match and in this case, the result of the match should
>be the same one as that obtained with the regular
>expression ([A-Za-z]*)0*([0-9]*).
>Why does using of parenthesis into a regular
>expression change result of the match ?

No, parentheses don't only mark; they help determine what the best match is
as well. regex++ tries to follow the POSIX standard leftmost-longest rule
for determining what matched. So if there is more than one possible match
after considering the whole expression, it looks next at the first
sub-expression, then the second sub-expression, and so on, applying the
same rule to each in turn.

So...

(0*)([0-9]*) against 00123 would produce
$1 = 00
$2 = 123

but 0*([0-9]*) against 00123 would produce
$1 = 00123

Think about it: had $1 matched only the "123", that would be "less good"
than the match "00123", which is both further to the left and longer.

If you want $1 to match only the "123" part, then you need to use something
like:

0*([1-9][0-9]*)

as the expression.
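
For anyone who wants to try this out, here is a minimal sketch using
std::regex rather than regex++ itself. The helper names first_group and
check_leading_zero_patterns are made up for illustration. Note that
std::regex defaults to the ECMAScript grammar, not the POSIX
leftmost-longest rule, so the sketch sticks to the two patterns above where
both approaches should agree; it does not try to reproduce the
0*([0-9]*) case.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper (not part of regex++): returns the text captured by
// group 1 when the whole of `input` matches `pattern`, or an empty string
// when there is no match.
std::string first_group(const std::string& pattern, const std::string& input) {
    const std::regex re(pattern);  // assumption: default ECMAScript grammar
    std::smatch m;
    if (std::regex_match(input, m, re) && m.size() > 1) {
        return m[1].str();
    }
    return std::string();
}

// Hypothetical self-check for the two patterns discussed above.
void check_leading_zero_patterns() {
    // With both groups marked, the first group takes the leading zeros.
    assert(first_group("(0*)([0-9]*)", "00123") == "00");
    // Requiring a non-zero first digit keeps the zeros out of $1.
    assert(first_group("0*([1-9][0-9]*)", "00123") == "123");
}
```

Calling check_leading_zero_patterns() from main should pass silently. One
consequence of requiring [1-9] is that an input made up only of zeros no
longer matches at all.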

- John Maddock
http://ourworld.compuserve.com/homepages/john_maddock/

