From: John Maddock (john_at_[hidden])
Date: 2004-05-14 05:12:17
> I was just bitten by an extremely nasty bug, caused in part by my own
> foolishness, but also by a subtlety of Boost Regex that probably should be
> documented explicitly.
>
> http://boost.org/libs/regex/doc/format_syntax.html gives a list of "The
> following Perl like expressions" and says that $N "Expands to the text
> that matched sub-expression N."
>
> HOWEVER, Boost Regex is sufficiently general that $10 will match the tenth
> subexpression, $25 the twenty-fifth, and so forth. Perl is not that general
> - it recognizes only $1 through $9 here.
>
> I believed that Boost behaved like Perl. I concatenated "$1" with a random
> hexadecimal string, occasionally producing "$10", "$15", etc. before the
> letter hexits appeared. This led to many hours of misery.
>
> The documentation should note Boost Regex's generality. The generality is a
> good thing - $1 can be separated from later digits with empty () parentheses
> - but it needs to be made clear.
Will do.
> Does this also affect the regex standardization proposal? It'd be bad if
> some implementers thought they only had to support $1 - $9.
Yep, the proposal uses the ECMA standard by reference - and that allows for
$n or $nn to be recognised as referring to sub-expressions, so you can access
$99, but $100 is really ${10}0.
John.
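
For anyone hitting the same problem, here is a minimal sketch of the $n / $nn rule described above. It uses std::regex rather than Boost.Regex so that it has no external dependencies; that substitution is an assumption made for illustration, and Boost's own format syntax (including the empty () trick) is not exercised here. The helper names expand and group_then_literal are hypothetical and are not part of any library.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not a Boost or standard API): replaces every match of
// `pattern` in `input`, expanding `fmt` with std::regex's default ECMAScript
// format rules.
std::string expand(const std::string& input, const std::string& pattern,
                   const std::string& fmt) {
    return std::regex_replace(input, std::regex(pattern), fmt);
}

// Hypothetical helper: builds a format string that expands sub-expression `n`
// (assumed to be in 1..99) followed by `suffix` as literal text.
std::string group_then_literal(unsigned n, const std::string& suffix) {
    std::string fmt = "$";
    // Always write two digits ($01..$99) so a digit at the start of `suffix`
    // cannot be read as part of the sub-expression number.
    fmt += static_cast<char>('0' + (n / 10) % 10);
    fmt += static_cast<char>('0' + n % 10);
    for (char c : suffix) {
        if (c == '$') fmt += '$';  // "$$" stands for a literal '$'
        fmt += c;
    }
    return fmt;
}
```

With a pattern that has ten sub-expressions, such as "(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)" applied to "abcdefghij", the format "$10" should expand to the tenth sub-expression, while group_then_literal(1, "0") produces "$010", which should expand to the first sub-expression followed by a literal 0. Writing the index with two digits is one way to keep a generated suffix (a hex string, say) from being absorbed into the sub-expression number.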