From: Eric Niebler (eric_at_[hidden])
Date: 2007-06-01 13:36:50
It's a poorly kept secret that I've been adding new features to
Xpressive on CVS HEAD for a while now. So here are the features already
implemented (documentation forthcoming) for Xpressive 2.0:
<< Semantic Actions >>
Specify code to execute when parts of a regex match, à la Spirit's
semantic actions. E.g., if you want to parse a string of name/value pairs
into a std::map, you might write:
    std::map<std::string, int> result;
    std::string str("aaa=>1 bbb=>23 ccc=>456");
    // Like "(\\w+)=>(\\d+)":
    sregex pair = ( (s1= +_w) >> "=>" >> (s2= +_d) )
        [ ref(result)[s1] = as<int>(s2) ];
    sregex rx = pair >> *(+_s >> pair);
    if(regex_match(str, rx))
    {
        assert(result["aaa"] == 1);
        assert(result["bbb"] == 23);
        assert(result["ccc"] == 456);
    }
The actions are placed on a queue and executed in order only when the
regex match succeeds.
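To make the "queue now, run on success" behavior concrete, here is a rough sketch that uses only the standard library. It is not Xpressive code and not how Xpressive is implemented: parse_pairs is a hypothetical helper, std::regex stands in for the static regex above, and the queue is a plain vector of std::function objects.

```cpp
#include <cassert>
#include <cctype>
#include <functional>
#include <map>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (illustration only, not part of Xpressive).
// Matches pair >> *(+_s >> pair) by hand, queuing one action per pair.
// The queue is run only if the entire input matches.
bool parse_pairs(const std::string& input, std::map<std::string, int>& result)
{
    static const std::regex pair_rx("(\\w+)=>(\\d+)");
    std::vector<std::function<void()>> actions;   // deferred-action queue

    std::string::const_iterator pos = input.begin();
    const std::string::const_iterator end = input.end();
    bool first = true;

    while (pos != end)
    {
        if (!first)
        {
            // Require at least one whitespace character between pairs.
            std::string::const_iterator ws = pos;
            while (ws != end && std::isspace(static_cast<unsigned char>(*ws)))
                ++ws;
            if (ws == pos)
                return false;   // queue discarded, result untouched
            pos = ws;
        }

        std::smatch what;
        if (!std::regex_search(pos, end, what, pair_rx,
                               std::regex_constants::match_continuous))
            return false;       // queue discarded, result untouched
        assert(what.size() == 3);   // whole match plus two captures

        const std::string key = what[1].str();
        const int value = std::stoi(what[2].str());
        actions.push_back([&result, key, value] { result[key] = value; });

        pos = what[0].second;
        first = false;
    }

    if (actions.empty())
        return false;           // at least one pair is required

    // The match succeeded: run the queued actions in order.
    for (const auto& action : actions)
        action();
    return true;
}
```

Calling parse_pairs("aaa=>1 bbb=>23 ccc=>456", result) fills the map as in the snippet above, while an input that fails partway through (say "aaa=>1 bbb=>") leaves the map empty, because no queued action ever runs.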
<< Custom Assertions >>
Use the check() function to create a boolean predicate that can
participate in the match. Here's a regex that recognizes two integers
only if the first is less than the second:
    sregex rx = ( (s1= +_d) >> ' ' >> (s2= +_d) )
        [ check( as<int>(s1) < as<int>(s2) ) ];
Unlike actions, predicates execute immediately. You can also define the
predicate out-of-line as a function object.
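For readers who want to picture the out-of-line form, here is a small standard-library sketch of the same idea. It is only an approximation: first_less_than_second and match_with_check are hypothetical names, std::regex stands in for the static regex, and the predicate runs after the match instead of taking part in it the way check() does.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical predicate (illustration only): true when the first captured
// integer is less than the second.
struct first_less_than_second
{
    bool operator()(const std::smatch& what) const
    {
        assert(what.size() >= 3);   // whole match plus two captures
        return std::stoi(what[1].str()) < std::stoi(what[2].str());
    }
};

// Hypothetical helper: the input is accepted only if the regex matches the
// whole string AND the predicate holds for the resulting captures.
template<typename Predicate>
bool match_with_check(const std::string& input, const std::regex& rx,
                      Predicate pred)
{
    std::smatch what;
    return std::regex_match(input, what, rx) && pred(what);
}
```

With the pattern "(\\d+) (\\d+)", match_with_check accepts "3 10" and rejects "10 3". The key difference from check() is that a failed predicate here simply rejects the input, whereas a predicate that participates in the match can make the matcher backtrack and try another alternative.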
<< Dynamic Regex Grammars with Named Regexes >>
Using regex_compiler, you can map a name to a regex object, and then
refer to that regex from another by name. In this way, you can build
grammars from regexes at runtime.
    sregex_compiler comp;
    sregex rx = comp.compile("^bar(?$RE)baz$");
    comp.compile("(?$RE=)\\d+ \\d+");
There's an alternate syntax for associating a name with a regex that you
can use to nest a static regex in a dynamic one. E.g., the last line
above could be:
    comp["RE"] = +_d >> ' ' >> +_d;
With these changes, you can now nest static and dynamic regexes within
each other freely, giving you lots of flexibility to build grammars and
modify them on the fly.
<< Named Captures >>
For dynamic regular expressions, you can create a named capture with
(?P<name> ...). You can refer back to the named capture with (?P=name).
In substitution strings (for use with regex_replace()), you can refer
back to a named capture with \\g<name> when using the format_perl or
format_all flags.
And more... I'd like to give props to Dave Jenkins, who has been doing
m4d things with this stuff already, and who has given me a laundry list
of other features he'd like.
The new features are only available in CVS HEAD, for now. Feedback is
welcome.
--
Eric Niebler
Boost Consulting
www.boost-consulting.com