Subject: Re: [Boost-users] [Regex] Is it possible to match any "linebreak"
From: Christoph Duelli (duelli_at_[hidden])
Date: 2012-03-22 06:44:35
John Maddock wrote:
>> Is there a way to have a regex like
>> [[:digit:]]{3}([^\n]+)\n?
>> have \n match any line breaking character?
>> (The processed files might have Unix or DOS line endings.)
>
> Use \R, see:
>
> http://www.boost.org/doc/libs/1_49_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html#boost_regex.syntax.perl_syntax.matching_line_endings
>
> HTH, John.
Thank you John.
Somehow I must have overlooked \R in the docs.
However, I still have one issue with \R.
In the following code everything works fine if I use r2.
With r1, however, the captures do contain the newlines.
In short: ([^\\R]+) does not seem to stop at line-break characters.
Should it? Or is this just a misunderstanding on my part?
#include <boost/test/auto_unit_test.hpp>
#include <boost/test/test_tools.hpp>
#include <boost/regex.hpp>
#include <string>
BOOST_AUTO_TEST_CASE(test_boost_regexp)
{
    // r1: fails, the capture contains the line breaks
    boost::regex r1("\\d\\d\\d([^\\R]+)\\R*");
    // r2: works fine (swap it in for r1 below to see the checks pass)
    // boost::regex r2("\\d\\d\\d([^\\r\\n]+)\\R*");
    boost::smatch what;
    std::string input("123hallo welt\n\r");
    BOOST_CHECK(boost::regex_match(input, what, r1));
    // ok with r2, but fails with r1: the capture does contain the newline
    BOOST_CHECK_EQUAL(what[1], "hallo welt");
    input = "123hallo welt\n";
    BOOST_CHECK(boost::regex_match(input, what, r1));
    BOOST_CHECK_EQUAL(what[1], "hallo welt");
    input = "123hallo welt\r";
    BOOST_CHECK(boost::regex_match(input, what, r1));
    BOOST_CHECK_EQUAL(what[1], "hallo welt");
}
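For comparison, here is a minimal sketch of the r2 idea that does not depend on Boost. It assumes std::regex with the default ECMAScript grammar, which has no \R escape, so the trailing line breaks are spelled out as an alternation. The helper name capture_payload is hypothetical and only serves the illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of the test case above): returns the text
// that follows the three leading digits, without any trailing line breaks,
// or an empty string if the input does not match at all.
std::string capture_payload(const std::string& input)
{
    // [^\r\n]+ excludes both line-break characters explicitly, as in r2.
    // (?:\r\n|\n|\r)* is a stand-in for \R*, which std::regex lacks.
    static const std::regex re("\\d\\d\\d([^\\r\\n]+)(?:\\r\\n|\\n|\\r)*");
    std::smatch what;
    if (!std::regex_match(input, what, re))
        return std::string();
    return what[1].str();
}
```

Called with any of the three inputs from the test case, this should return "hallo welt", because the character class itself names the characters to exclude rather than relying on \R inside the brackets.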
Best regards
Christoph