Boost Users :

From: Dean (dean_at_[hidden])
Date: 2003-04-25 12:35:06


--- In Boost-Users_at_[hidden], "Joshua B. Smith" <josh_at_n...> wrote:
> On Thu, Apr 24, 2003 at 11:37:34PM -0000, Dean wrote:
> > Hi all,
> >
> > ... snip ...
> >
> > \d{3}-\d{2}-\d{4}
> >
> > As expected, that pattern was found in "123-12-1234" but not in
> > "1234-12-1234". However it *was* found in "1234567-12-1234".
> >
> > Is this behavior by design or is it a bug?
>
> It is (probably) design. Intervals can specify a min and a max,
> for example:
>
> \d{3,3}-\d{2,2}-\d{4,4} will match "123-12-1234" but NOT
> "1234567-12-1234".
> \d{3}-\d{2}-\d{4} will match "123-12-1234" but NOT
> "1234567-12-1234" also.

I'm not sure what you were trying to say above, but my understanding
is that the two patterns you just mentioned are equivalent. The docs
say "{3}" is equivalent to "{3,3}", not "{3,}".
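
To make the equivalence concrete, here is a minimal sketch. It uses std::regex from the C++11 standard library as a stand-in for Boost.Regex, which is an assumption and not the code from this thread; whole_match and check_interval_equivalence are illustrative names only.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if the whole of `text` matches `pattern`.
// Uses std::regex (default ECMAScript grammar), not Boost.Regex.
bool whole_match(const std::string& pattern, const std::string& text) {
    return std::regex_match(text, std::regex(pattern));
}

// "{3}" and "{3,3}" both mean exactly three repeats;
// "{3,}" means three or more.
void check_interval_equivalence() {
    assert(whole_match(R"(\d{3})", "123"));
    assert(whole_match(R"(\d{3,3})", "123"));
    assert(!whole_match(R"(\d{3})", "1234"));
    assert(!whole_match(R"(\d{3,3})", "1234"));
    assert(whole_match(R"(\d{3,})", "1234"));
}
```

Under these assumptions the first two patterns accept and reject the same inputs, and only the open-ended "{3,}" accepts a fourth digit.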

>
> It will, however, yield a correct search (there is a difference
> between search and match). You didn't mention if you were doing a
> regex_search or regex_match?

I'm doing a search because I don't want to know whether the whole
string matches but whether the regex is found in the string.
Specifically, I'm doing:

m_regex.Search( sampleBody, boost::match_default | boost::match_any)
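
For comparison, the search/match distinction can be sketched with std::regex. This is an assumption standing in for the RegEx::Search call above, and it does not reproduce the match_any flag; kSsnPattern, matches_whole and found_anywhere are made-up names for illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

// The pattern from this thread, written as a raw string literal.
const char* const kSsnPattern = R"(\d{3}-\d{2}-\d{4})";

// True only if the entire string is one match (like regex_match).
bool matches_whole(const std::string& text) {
    return std::regex_match(text, std::regex(kSsnPattern));
}

// True if the pattern occurs anywhere in the string (like regex_search).
bool found_anywhere(const std::string& text) {
    return std::regex_search(text, std::regex(kSsnPattern));
}
```

With std::regex, at least, matches_whole rejects "1234567-12-1234" while found_anywhere accepts it, and found_anywhere also accepts "1234-12-1234", because a plain search is free to start in the middle of a run of digits.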

While I can believe that the design intention was that "\d{3}-"
should be found in "1234567-" (at the fifth character), it seems
inconsistent that it is *not* also found in "123456-" and
"12345678-". I'm seeing that inconsistent behavior.

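As a point of reference for what a consistent search would report, here is a small sketch using std::regex (an assumption, not the Boost.Regex build discussed here); find_offset is a hypothetical helper that returns the zero-based position of the first hit.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: zero-based offset of the first place `pattern`
// occurs in `text`, or -1 if it does not occur at all.
long find_offset(const std::string& pattern, const std::string& text) {
    std::smatch m;
    if (std::regex_search(text, m, std::regex(pattern))) {
        return static_cast<long>(m.position(0));
    }
    return -1;
}
```

With this helper, R"(\d{3}-)" is located in "123456-", "1234567-" and "12345678-" alike, each time at the last three digits before the hyphen, so the length of the leading digit run should not change whether the pattern is found.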
> > FWIW, it's easy enough for me to work around the current behavior
> > with a pattern like this:
> >
> > (^|[^a])a{1}b
>
> You could use this, but I wouldn't recommend it (but that's just
> me...regex construction is deeply personal :) ). HTH.

I realize there is more than one way to do it, and I'd be interested
in what you'd recommend.

FWIW, in our SSN-matching case, we'll probably just use
"\b\d{3}-\d{2}-\d{4}\b".
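
A minimal sketch of that word-boundary version, again assuming std::regex in place of Boost.Regex; contains_ssn is an illustrative name, not something from our code base.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if `text` contains ddd-dd-dddd that is not
// glued to extra word characters on either side.
bool contains_ssn(const std::string& text) {
    // \b anchors each end of the pattern at a word boundary.
    static const std::regex ssn(R"(\b\d{3}-\d{2}-\d{4}\b)");
    return std::regex_search(text, ssn);
}
```

Here a search accepts "123-12-1234" and "SSN: 123-12-1234." but rejects "1234-12-1234", "1234567-12-1234" and "123-12-12345". Note that letters and underscores are word characters too, so an SSN directly preceded or followed by one of those would also be rejected.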

Thanks for the reply!

--Dean

