Boost Users :
From: Dean (dean_at_[hidden])
Date: 2003-04-24 18:37:34
Hi all,
I'm using the regex library from boost-1.28 and am seeing the following
behavior. Consider the following (somewhat artificial) regexp
pattern:
a{1}b
As I expected, that pattern is found in "ab" but not "aab".
To my surprise, however, the same pattern *is* found
in "aaab", "aaaaab", and any other string consisting of an odd number
of "a"s followed by a "b". It is not found in strings consisting of
an even number of "a"s followed by a "b". This seems odd (no pun
intended).
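For comparison, here is a minimal sketch of the same test written against C++11 `std::regex`. That is a stand-in, not the boost-1.28 engine, and `first_match_pos` is a hypothetical helper name. It only shows what an ordinary substring search reports for this pattern; it does not reproduce the behavior described above.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the offset of the first match of `pattern`
// in `text`, or -1 if there is no match. Uses std::regex (ECMAScript
// grammar) as a stand-in; this is NOT the boost-1.28 engine.
long first_match_pos(const std::string& pattern, const std::string& text) {
    const std::regex re(pattern);
    std::smatch m;
    if (!std::regex_search(text, m, re)) {
        return -1;
    }
    return static_cast<long>(m.position(0));
}
```

With this stand-in, `first_match_pos("a{1}b", "ab")` is 0, `first_match_pos("a{1}b", "aab")` is 1, and `first_match_pos("a{1}b", "aaab")` is 2, so the match always starts at the last "a", regardless of whether the number of "a"s is odd or even.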
I see the same sort of behavior with quantifiers other than "{1}" and
where the quantified expression matches other single characters.
(Oddly enough, the behavior changes when using a quantified
expression that matches multiple characters. "(ab){1}c" is found
in "abc", "ababc", "abababc", and any other string containing "abc".)
FWIW, I first observed the behavior when trying to find social
security numbers with the following pattern:
\d{3}-\d{2}-\d{4}
As expected, that pattern was found in "123-12-1234" but not in
"1234-12-1234". However, it *was* found in "1234567-12-1234".
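A sketch of the same search using C++11 `std::regex` as a stand-in (again an assumption, not boost-1.28; `find_ssn_like` is a hypothetical helper). It returns the matched substring so it is easy to see where in the input the match starts.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the first substring of `text` that matches
// the SSN-shaped pattern, or an empty string if nothing matches.
// std::regex is used as a stand-in for the boost-1.28 engine.
std::string find_ssn_like(const std::string& text) {
    static const std::regex re(R"(\d{3}-\d{2}-\d{4})");
    std::smatch m;
    return std::regex_search(text, m, re) ? m.str(0) : std::string();
}
```

With this stand-in, an unanchored search finds a match in all three inputs: "123-12-1234" in the first, "234-12-1234" in "1234-12-1234", and "567-12-1234" in "1234567-12-1234". Rejecting the longer digit runs would need an explicit guard in the pattern.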
Is this behavior by design or is it a bug?
If it's a bug, has it been fixed in a subsequent Boost release? Also,
what is the correct behavior? Should "a{1}b" be found in "aab"
(albeit starting at the second character)?
FWIW, it's easy enough for me to work around the current behavior with
a pattern like this:
(^|[^a])a{1}b
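A minimal sketch of how that workaround pattern behaves, using C++11 `std::regex` as a stand-in (an assumption; not boost-1.28) and a hypothetical helper `guarded_match_pos`. Because the guard group consumes the character before the "a", the helper skips the length of group 1 to report where "a{1}b" itself begins.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: searches `text` with the workaround pattern and
// returns the offset at which the "a{1}b" part begins, or -1 if there is
// no match. std::regex is a stand-in for the boost-1.28 engine.
long guarded_match_pos(const std::string& text) {
    static const std::regex re("(^|[^a])a{1}b");
    std::smatch m;
    if (!std::regex_search(text, m, re)) {
        return -1;
    }
    // Group 1 is either empty (start of input) or the one non-'a'
    // character preceding the match; skip it.
    return static_cast<long>(m.position(0) + m.length(1));
}
```

With this stand-in, `guarded_match_pos("ab")` is 0 and `guarded_match_pos("xab")` is 1, while "aab" and "aaab" give -1. In other words, the guard makes the pattern require exactly one "a" before the "b", which is a stricter condition than a plain substring search for "a{1}b".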
Thanks,
--Dean