Boost Users :

From: Derrick Schommer (schommer_at_[hidden])
Date: 2007-02-15 17:50:27


Hi,

I'm trying to work out whether a regular expression I've written is
behaving correctly. In certain circumstances the match seems to be
"off by one" character. For instance:

The pattern: ".{1,3}" finds 1 to 3 characters, however

The pattern: ".[^b]{1,3}" finds 1 to 4 characters...?

For some reason, when I add a "do not allow the character b" class or
any other bracket expression, the {min,max} range seems to allow one
additional character. I cannot imagine this being done on purpose.

The workaround is to allow one less than you really want (i.e. {1,2}),
but I figured I'd post something and see if anyone has seen it or
could explain what is happening. Maybe it's just a bug.

The code example:

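  try {
    // TEXT() expands to a wide-character literal in a UNICODE build.
    boost::wregex pattern( TEXT( ".{1,3}" ) );
    boost::wcmatch matchResults;

    // regex_match succeeds only if the entire input matches the pattern.
    if( boost::regex_match( TEXT( "asd" ), matchResults, pattern ) ) {
      // Match
    } else {
      // No match
    }

  } catch( const std::runtime_error &e ) {
    printf( "\n%%FAILURE: Regular expression error: %s.\n", e.what() );
  }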
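
To compare the two patterns side by side, the same whole-string check can be run over a range of input lengths. The following is only a minimal sketch: it uses std::wregex from the standard library instead of boost::wregex, on the assumption that both handle these simple patterns the same way, and the helper name accepted_lengths is made up for illustration.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of Boost or the standard library): returns
// every length n in [1, max_len] for which a string of n copies of L'a'
// is matched in full by the given pattern.
std::vector<std::size_t> accepted_lengths(const std::wstring& pattern_text,
                                          std::size_t max_len) {
    const std::wregex pattern(pattern_text);
    std::vector<std::size_t> lengths;
    for (std::size_t n = 1; n <= max_len; ++n) {
        const std::wstring input(n, L'a');
        // regex_match requires the whole input to match, not just a prefix.
        if (std::regex_match(input, pattern)) {
            lengths.push_back(n);
        }
    }
    return lengths;
}
```

Calling accepted_lengths( L".{1,3}", 6 ) and accepted_lengths( L".[^b]{1,3}", 6 ) lists the input lengths each pattern accepts in full, which makes the two ranges easy to compare.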

Yes, I'm running in a UTF-16 build environment in Microsoft Windows
.NET using C++.

If this isn't the correct venue for this message, sorry in advance.

Thanks,

Derrick


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net