Subject: [boost] Named subexpressions using \k<name>
From: Chris Dragon (cdragon_at_[hidden])
Date: 2010-11-04 00:51:43


I have a user who says they have a bunch of expressions written for the
Oniguruma regex library that use \k<name> syntax. According to
http://www.boost.org/doc/libs/1_44_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html
Boost.Regex should also support that syntax, but it does not seem to work.

For example:
boost::regex_replace(
    std::string("xyyz"),
    boost::regex(".*(?<first>x)(?<second>y).*",
                 boost::regex_constants::perl),
    std::string("$+{second}\\k<first>"),
    boost::match_default | boost::format_all | boost::match_extra);

results in "yk<first>"

This result shows that $+{second} evaluated to "y" but \k<first>
evaluated to "k<first>".

I thought maybe the \k<name> syntax was only supported as a
backreference within the regex itself rather than in the replacement
string, but this doesn't work either:
boost::regex_replace(
    std::string("xyyz"),
    boost::regex(".*(?<first>x)(?<second>y)\\k<second>.*",
                 boost::regex_constants::perl),
    std::string("$+{second}"),
    boost::match_default | boost::format_all | boost::match_extra);

The result is "xyyz", meaning nothing was replaced (the regex did not match).

However, if I modify the regex to use \g{name} syntax, it works:
boost::regex_replace(
    std::string("xyyz"),
    boost::regex(".*(?<first>x)(?<second>y)\\g{second}.*",
                 boost::regex_constants::perl),
    std::string("$+{second}"),
    boost::match_default | boost::format_all | boost::match_extra);

The result is "y".

Am I doing something wrong or have I found a bug? I'm using the latest
stable 1_44_0 release compiled from source in VC9 and I also tried the
pre-compiled library
(http://sourceforge.net/projects/boost/files/boost-binaries/1.44.0/libboost_regex-vc90-mt-sgd-1_44.zip/download).
The OS is Windows 7.

Thanks for the help!


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk