Boost Users :

From: Simon Steele (s.steele_at_[hidden])
Date: 2008-07-30 17:12:53


Hi,

I think there's a bug in Xpressive in Boost 1.35 with newline
handling. Here is a very simple example:

Subject: abc\ndef\nghi
Regex: ^.+$

I'm looking for "abc". The following xpressive-based code has odd results:

string text(argv[1]);
sregex rex = sregex::compile(text, regex_constants::not_dot_newline);

smatch match;
string subject("abc\ndef\nghi");
if (regex_search(subject.begin(), subject.end(), match, rex,
                 regex_constants::match_default))
{
    std::cout << "Match at: " << match.position(0)
              << " length: " << match.length(0);
}

^.+$: Match at: 8 length: 3
$: Match at: 4 length: 0
.$: Match at: 10 length: 1

These definitely aren't right. It appears that the only place where
end-of-line matches is at end-of-string. Things get even stranger
if I try to use Windows- or Mac-style end-of-line characters, which I
need to support too. If I use boost::regex instead, then I get these
results:

^.+$: Match at: 0 length: 3
$: Match at: 3 length: 0
.$: Match at: 2 length: 1

These are the results I'd expect. It appears that in Xpressive the
check in assert_line_base::is_line_break is broken. The following
line:

if(traits_cast<Traits>(state).isctype(ch, this->newline_))

checks the previous character to see if it's a line break. I believe
this should check the current character:

if(traits_cast<Traits>(state).isctype(*state.cur_, this->newline_))

This then gets me sane results:

^.+$: Match at: 0 length: 3
$: Match at: 3 length: 0
.$: Match at: 2 length: 1
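
To illustrate the difference between the two checks outside of
Xpressive, here is a minimal standalone sketch. It is a toy model,
not the library's code: is_newline, eol_checks_previous,
eol_checks_current and first_eol are hypothetical names, and treating
both '\n' and '\r' as line breaks is an assumption made for the
example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Xpressive): which characters count
// as a line break. Treating '\r' as one is an assumption of this sketch.
bool is_newline(char ch) { return ch == '\n' || ch == '\r'; }

// End-of-line test that inspects the character BEFORE pos,
// modelling the check described as broken above.
bool eol_checks_previous(const std::string& s, std::size_t pos)
{
    assert(pos <= s.size());
    if (pos == s.size()) return true;          // end-of-string always counts
    return pos > 0 && is_newline(s[pos - 1]);
}

// End-of-line test that inspects the character AT pos,
// modelling the proposed fix.
bool eol_checks_current(const std::string& s, std::size_t pos)
{
    assert(pos <= s.size());
    if (pos == s.size()) return true;          // end-of-string always counts
    return is_newline(s[pos]);
}

// Returns the first position at which the given check succeeds,
// i.e. where a bare "$" would first match under that check.
template <typename Check>
std::size_t first_eol(const std::string& s, Check check)
{
    for (std::size_t pos = 0; pos <= s.size(); ++pos)
        if (check(s, pos)) return pos;
    return std::string::npos;
}
```

For the subject "abc\ndef\nghi", first_eol with eol_checks_previous
returns 4 and with eol_checks_current returns 3, which lines up with
the "$" offsets reported above before and after the change.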

Is this a known bug?

Thanks,

Simon.

