Subject: Re: [Boost-bugs] [Boost C++ Libraries] #360: Regex
From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2007-08-13 18:25:38
#360: Regex
-------------------------------+--------------------------------------------
Reporter: nobody | Owner: johnmaddock
Type: Support Requests | Status: closed
Milestone: | Component: regex
Version: None | Severity: Problem
Resolution: invalid | Keywords:
-------------------------------+--------------------------------------------
Changes (by johnmaddock):
* status: assigned => closed
* resolution: None => invalid
Old description:
> {{{
> PROBLEM:
> ========
>
> If I use the following code:
>
> lv_rcode = regcomp(&lr_re, ms_RegExp,
> REG_EXTENDED);
>
> lv_rcode = regexec(&lr_re, &lv_string[0], 1, &pm, 0);
>
> to find the following text:
>
> cm582172
>
> Using the following regular expression
>
> [A-Z]{2}[[:space:]]*[0-9]{6}[[:space:]]*
>
> The text is found.
>
> However if I use instead the following code
>
> boost::regex regx(config.vRegx.at(iCnt).c_str());
> flags = boost::match_default;
> boost::regex_search(start, end, what, regx, flags)
>
> With the same regular expression it does not find the
> string unless I use the following regular expression:
>
> [a-z]{2}[[:space:]]*[0-9]{6}[[:space:]]*
>
> The difference is the lower case of the alpha range [a-z]
>
> My question:
>
> Is there a way I can use regexec to be case sensitive for
> the alpha range like it is with regex_search function?
>
> Thanks,
>
> Matias
> }}}
New description:
{{{
PROBLEM:
========
If I use the following code:
lv_rcode = regcomp(&lr_re, ms_RegExp,
REG_EXTENDED);
lv_rcode = regexec(&lr_re, &lv_string[0], 1, &pm, 0);
to find the following text:
cm582172
Using the following regular expression
[A-Z]{2}[[:space:]]*[0-9]{6}[[:space:]]*
The text is found.
However if I use instead the following code
boost::regex regx(config.vRegx.at(iCnt).c_str());
flags = boost::match_default;
boost::regex_search(start, end, what, regx, flags)
With the same regular expression it does not find the
string unless I use the following regular expression:
[a-z]{2}[[:space:]]*[0-9]{6}[[:space:]]*
The difference is the lower case of the alpha range [a-z]
My question:
Is there a way I can use regexec to be case sensitive for
the alpha range like it is with regex_search function?
Thanks,
Matias
}}}
Comment:
This occurs because, by default, POSIX regular expressions treat character
ranges such as [A-Z] as locale-sensitive: a range matches any character
that collates within it. In most locales the characters "c" and "m"
collate between "A" and "Z", and hence they match. You can get more
Perl-like behaviour by setting REG_NOCOLLATE as well as REG_EXTENDED when
compiling the expression.
--
Ticket URL: <http://svn.boost.org/trac/boost/ticket/360#comment:2>
Boost C++ Libraries <http://www.boost.org/>
Boost provides free peer-reviewed portable C++ source libraries.