Re: [Boost-bugs] [Boost C++ Libraries] #360: Regex

Subject: Re: [Boost-bugs] [Boost C++ Libraries] #360: Regex
From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2007-08-13 18:25:38


#360: Regex
-------------------------------+--------------------------------------------
  Reporter: nobody | Owner: johnmaddock
      Type: Support Requests | Status: closed
 Milestone: | Component: regex
   Version: None | Severity: Problem
Resolution: invalid | Keywords:
-------------------------------+--------------------------------------------
Changes (by johnmaddock):

  * status: assigned => closed
  * resolution: None => invalid

New description:

 {{{
 PROBLEM:
 ========

 If I use the following code:

 lv_rcode = regcomp(&lr_re, ms_RegExp,
 REG_EXTENDED);

 lv_rcode = regexec(&lr_re, &lv_string[0], 1, &pm, 0);

 to find the following text:

 cm582172

 Using the following regular expression

 [A-Z]{2}[[:space:]]*[0-9]{6}[[:space:]]*

 The text is found.

 However if I use instead the following code

 boost::regex regx(config.vRegx.at(iCnt).c_str());
 flags = boost::match_default;
 boost::regex_search(start, end, what, regx, flags);

 With the same regular expression it does not find the
 string unless I use the following regular expression:

 [a-z]{2}[[:space:]]*[0-9]{6}[[:space:]]*

 The difference is the lower case of the alpha range [a-z]

 My question:

 Is there a way I can make regexec case sensitive for
 the alpha range, like the regex_search function is?

 Thanks,

 Matias
 }}}

Comment:

 This occurs because, by default, POSIX regular expressions treat character
 ranges such as [A-Z] as locale sensitive: a range matches any character
 that collates within it. In most locales the characters "c" and "m" do
 collate within [A-Z], and hence they match. You can get more Perl-like
 behaviour by setting REG_NOCOLLATE as well as REG_EXTENDED when compiling
 the expression.
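
 The code-point (non-collating) range behaviour described above can be
 sketched with the C++ standard library. std::regex is used here only as a
 dependency-free stand-in for the Boost calls in the ticket; it is not the
 Boost POSIX API. std::regex compares range endpoints by character code
 unless the std::regex::collate flag is passed, which is roughly the
 behaviour REG_NOCOLLATE asks for. The helper name found is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true if `pattern` (POSIX extended syntax)
// matches anywhere in `text`.
// std::regex::collate is deliberately NOT set, so a range such as [A-Z]
// is compared by character code rather than by locale collation order.
bool found(const std::string& pattern, const std::string& text) {
    const std::regex re(pattern, std::regex::extended);
    return std::regex_search(text, re);
}
```

 With this helper, the [A-Z] expression from the ticket does not match
 "cm582172", while the [a-z] variant does.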

--
Ticket URL: <http://svn.boost.org/trac/boost/ticket/360#comment:2>
Boost C++ Libraries <http://www.boost.org/>
Boost provides free peer-reviewed portable C++ source libraries.

