From: Christelle Piedsnoirs (chris_sidesprion_at_[hidden])
Date: 2001-04-04 04:28:05


Hello,

I have just installed version 3.03 of the Regex++
library (everything went fine). I have written some
simple programs to test the library, and I have run
into a problem with the use of parentheses in a
regular expression.

The purpose of the following program is to remove
the needless zeros from an alphanumeric string.

#include <iostream>
#include <string>
#include <boost/regex.hpp>

boost::regex pattern ;

extern const char* expression_pattern ;
extern const std::string format("\\1\\3") ;

const char* expression_pattern =
"([[:alpha:]]*)(0*)(\\d*)" ;

int main(int argc, const char** argv)
{
  pattern.set_expression(expression_pattern) ;

  std::cout << "----- String parser -----" << std::endl;
  for(int i = 1; i < argc; ++i)
  {
    boost::cmatch what ;

    if (boost::regex_match(argv[i], pattern))
    {
      std::string myString(argv[i]) ;
      std::cout << "String " << argv[i] << " is Ok !" << std::endl ;

      // To format input string
      std::string myFormatedString =
        boost::regex_merge(myString, pattern, format, boost::format_sed) ;
      std::cout << "Formated string : ";
      std::cout.write(myFormatedString.c_str(), myFormatedString.length());
      std::cout << std::endl ;
    }
    else
      std::cout << "String " << argv[i] << " is refused ! " << std::endl;
  }
  return 0 ;
}

Running the program with the string
"ABCD000124567890" as its argument gives the following
result:
p:\>D:\Dev\TestRegExp\Debug\TestRegExp.exe ABCD000124567890
----- String parser -----
String ABCD000124567890 is Ok !
Formated string : ABCD124567890

Everything is OK !

Now, if I replace the following two lines in my
program:
extern const std::string format("\\1\\3") ;
const char* expression_pattern =
"([[:alpha:]]*)(0*)(\\d*)" ;
with these:
extern const std::string format("\\1\\2") ;
const char* expression_pattern =
"([[:alpha:]]*)0*(\\d*)" ;

I get the following result:
p:\>D:\Dev\TestRegExp\Debug\TestRegExp.exe ABCD000124567890
----- String parser -----
String ABCD000124567890 is Ok !
Formated string : ABCD000124567890

The two results differ, whereas they should be the
same (Formated string : ABCD124567890). Why?
I have searched the online documentation and the
archive of this mailing list, but I have not found
anything about this problem.

In the regular expression ([A-Za-z]*)(0*)([0-9]*), the
parentheses are used only to mark which parts of the
input produced the match, so the result of the match
should be the same as the one obtained with the
regular expression ([A-Za-z]*)0*([0-9]*).
Why does the use of parentheses in a regular
expression change the result of the match?
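
For comparison, the expectation above can be checked against a different regex engine. The sketch below is an assumption-laden cross-check, not a reproduction: it uses std::regex and std::regex_replace from the C++11 standard library instead of Regex++ 3.03 and boost::regex_merge, and the helper names strip_zeros, with_three_groups and with_two_groups are hypothetical. It says nothing about what Regex++ 3.03 itself does.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of the original program): rewrites the
// input with a sed-style format string, where \1, \2, ... refer to the
// marked sub-expressions of the pattern.
std::string strip_zeros(const std::string& input,
                        const std::string& expression,
                        const std::string& format)
{
    const std::regex pattern(expression);
    return std::regex_replace(input, pattern, format,
                              std::regex_constants::format_sed);
}

// Same pattern and format as the first program: the zeros are captured
// in group 2, which the format string simply leaves out.
std::string with_three_groups(const std::string& input)
{
    return strip_zeros(input, "([[:alpha:]]*)(0*)(\\d*)", "\\1\\3");
}

// Same pattern and format as the modified program: the zeros are
// matched but not captured, so the digits become group 2.
std::string with_two_groups(const std::string& input)
{
    return strip_zeros(input, "([[:alpha:]]*)0*(\\d*)", "\\1\\2");
}
```

Calling with_three_groups("ABCD000124567890") and with_two_groups("ABCD000124567890") should return the same string under std::regex, which is the behaviour expected in the question.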

Thanks for your responses.
Sorry for my English.

Chris

