From: Eric Niebler (eric_at_[hidden])
Date: 2007-06-01 13:36:50

It's a poorly kept secret that I've been adding new features to
Xpressive on CVS HEAD for a while now. So here are the features already
implemented (documentation forthcoming) for Xpressive 2.0:

<< Semantic Actions >>

Specify code to execute when parts of a regex match, à la Spirit's
semantic actions. E.g., if you want to parse a string of name/value pairs
into a std::map, you might write:

     std::map<std::string, int> result;
     std::string str("aaa=>1 bbb=>23 ccc=>456");

     // Like "(\\w+)=>(\\d+)":
     sregex pair = ( (s1= +_w) >> "=>" >> (s2= +_d) )
                   [ ref(result)[s1] = as<int>(s2) ];
     sregex rx = pair >> *(+_s >> pair);

     if(regex_match(str, rx))
     {
         // The actions ran, so the map is populated.
         assert(result["aaa"] == 1);
         assert(result["bbb"] == 23);
         assert(result["ccc"] == 456);
     }

The actions are placed on a queue and executed in order only when the
regex match succeeds.

<< Custom Assertions >>

Use the check() function to create a boolean predicate that can
participate in the match. Here's a regex that recognizes two integers
only if the first is less than the second:

   sregex rx = ( (s1= +_d) >> ' ' >> (s2= +_d) )
               [ check( as<int>(s1) < as<int>(s2) ) ];

Unlike actions, predicates execute immediately. You can also define the
predicate out-of-line as a function object.

<< Dynamic Regex Grammars with Named Regexes >>

Using regex_compiler, you can map a name to a regex object, and then
refer to that regex from another by name. In this way, you can build
grammars from regexes at runtime.

     sregex_compiler comp;
     sregex rx = comp.compile("^bar(?$RE)baz$");
     comp.compile("(?$RE=)\\d+ \\d+");

There's an alternate syntax for associating a name with a regex that you
can use to nest a static regex in a dynamic one. E.g., the last line
above could be:

     comp["RE"] = +_d >> ' ' >> +_d;

With these changes, you can now nest static and dynamic regexes within
each other freely, giving you lots of flexibility to build grammars and
modify them on the fly.

<< Named Captures >>

For dynamic regular expressions, you can create a named capture with
(?P<name> ...). You can refer back to the named capture with (?P=name).
In substitution strings (for use with regex_replace()), you can refer
back to a named capture with \\g<name> when using the format_perl or
format_all flags.

And more .... I'd like to give props to Dave Jenkins who has been doing
m4d things with this stuff already, and who has given me a laundry list
of other features he'd like.

The new features are only available in CVS HEAD, for now. Feedback is
welcome.

Eric Niebler
Boost Consulting
