Boost Users :
From: John Maddock (john_at_[hidden])
Date: 2004-07-21 05:44:22
> using the latest regex patch and vc7.1 i've accidentally encountered the
> following:
>
> ...
> regex re("([^\n]*\\n+\\s+)+NEEDEDSUBITEM2:[^\\s]");
> bool matched = regex_search(text, re); // bad_expression
I think the problem is that the first repeated section:
([^\n]*\\n+\\s+)+
starts and ends with repeats, either of which can match a run of
whitespace. That ambiguity is what causes the matcher to thrash trying to
find a match; eventually it gives up and throws an exception. I think you
could make your expression much more precise by using:
regex re("([^\n]*\\n+)+\\s+NEEDEDSUBITEM2:[^\\s]");
By moving the \s+ outside of the repeat like this, the expression is now
much more deterministic - it can only do one thing for any given input
character.
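
As a rough illustration of the rewritten expression, here is a minimal
sketch. It assumes std::regex with its default ECMAScript grammar as a
stand-in for boost::regex, so it is not the exact setup from the report.
The helper name has_subitem and the sample inputs in the note below are
hypothetical.

```cpp
#include <regex>
#include <string>

// Assumption: std::regex (ECMAScript grammar) stands in for boost::regex.
// has_subitem is a hypothetical helper name, not part of either library.
bool has_subitem(const std::string& text)
{
    // Same expression as suggested above: the \s+ follows the repeated
    // group instead of sitting inside it.
    static const std::regex re("([^\n]*\\n+)+\\s+NEEDEDSUBITEM2:[^\\s]");
    return std::regex_search(text, re);
}
```

Under those assumptions, has_subitem("DATA\n  ITEM1:x\n  NEEDEDSUBITEM2:y\n")
should report a match, while an input with no NEEDEDSUBITEM2 line should not.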
> and one question:
> having "DATA.*?ITEM1(ITEM2)?" and an input like "DATA ITEM1 ITEM1ITEM2"
> should ITEM2 be extracted?
> i think it would be good to make a note on this case in the doc.
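No for Perl regexes: the non-greedy .*? stops at the first ITEM1, ITEM2
does not follow at that point, so the optional group matches nothing. I'm
not sure for POSIX regexes (non-greedy repeats don't sit well with POSIX
semantics in cases like this, so I'd advise using non-greedy repeats with
Perl regexes only).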
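
The Perl-style behaviour can be checked with a small sketch. It assumes
std::regex with its default ECMAScript grammar, whose non-greedy repeats
follow the same leftmost, first-alternative-wins rule, as a stand-in for
boost::regex in Perl mode. The names Item2Probe and probe_item2 are
hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical result type: the whole match plus whether (ITEM2) took part.
struct Item2Probe {
    std::string whole;   // text of the overall match, empty if no match
    bool item2_matched;  // true only if the optional group captured ITEM2
};

// Assumption: std::regex (ECMAScript grammar) stands in for boost::regex.
Item2Probe probe_item2(const std::string& input)
{
    static const std::regex re("DATA.*?ITEM1(ITEM2)?");
    std::smatch m;
    if (!std::regex_search(input, m, re))
        return Item2Probe{"", false};
    // m[1] is the optional (ITEM2) group; .matched is false when it was skipped.
    return Item2Probe{m[0].str(), m[1].matched};
}
```

With the input from the question, "DATA ITEM1 ITEM1ITEM2", the overall match
is "DATA ITEM1" and the group is left unmatched. With "DATA ITEM1ITEM2",
where ITEM2 directly follows the first ITEM1, the group does capture.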
John.