Subject: [boost] [xpressive] Support for multi-capture and balancing groups
From: Erik Rydgren (erik_at_[hidden])
Date: 2010-10-02 20:52:23
Hi!
My company has been using pcre for a long while, but there have been gripes
about it only returning the last matched value from a capture group. Because
of this I have been searching for a C++ regex engine that can handle the same
features as the .NET implementation, to no avail. During my searches I've
stumbled on several forum threads where others were looking for the same
thing, but there doesn't seem to be a regular expression library in C/C++
that handles both named captures and multicapture.
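
To illustrate the limitation, here is a minimal sketch using std::regex rather than pcre or xpressive (an assumption made only so the example compiles with the standard library). The helper names last_capture and all_captures are hypothetical. A repeated capture group keeps only the value from its final iteration, so collecting every value needs a second pass over the input:

```cpp
#include <regex>
#include <string>
#include <vector>

// Match the whole input with a repeated capture group and return group 1.
// Only the value from the last iteration of the group survives.
std::string last_capture(const std::string& input) {
    static const std::regex re("(?:(\\w+),?)+");
    std::smatch m;
    if (!std::regex_match(input, m, re)) return "";
    return m[1].str();
}

// Workaround: match one item at a time and collect each capture by hand.
std::vector<std::string> all_captures(const std::string& input) {
    static const std::regex item("(\\w+),?");
    std::vector<std::string> out;
    for (std::sregex_iterator it(input.begin(), input.end(), item), end;
         it != end; ++it) {
        out.push_back((*it)[1].str());
    }
    return out;
}
```

For the input "a,b,c", last_capture returns "c", while all_captures returns "a", "b" and "c". The workaround loses the guarantee that the whole input matched the outer pattern, which is part of why per-group capture lists are convenient.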
I found boost.xpressive and it had almost everything we need. It's open
source, fast, and has a flexible API and named captures. But alas, just like
all the other C and C++ based implementations I have found, it lacked
multiple captures.
So, I added it. On top of that I added support for balancing groups
(http://blog.stevenlevithan.com/archives/balancing-groups).
The syntax for the pop capture and the capture conditional is slightly
different from the .NET version, to better fit xpressive.
Syntax for pop capture:
dynamic: (?P<-name>stuff)
static: (name -= stuff)
Syntax for capture conditional:
dynamic: (?P(name)stuff)
static: (name &= stuff)
There is no support for the (?<name-othername>stuff) construct.
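
As a rough model of what these constructs do, the sketch below hand-codes the push/pop/conditional logic for balanced parentheses. It is not the patch's implementation and does not use xpressive. It assumes semantics like those of .NET balancing groups, and the function name is_balanced is hypothetical:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hand-written model of a balancing-group pattern for parentheses:
//   '(' pushes a capture onto the group's stack,
//   ')' pops one capture, and the match fails if there is none to pop,
//   the final conditional requires that no capture is left on the stack.
bool is_balanced(const std::string& text) {
    std::vector<std::size_t> open;  // positions of unmatched '(' (the capture stack)
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '(') {
            open.push_back(i);                // ordinary capture: push
        } else if (text[i] == ')') {
            if (open.empty()) return false;   // pop capture on empty stack fails
            open.pop_back();                  // pop capture
        }
    }
    return open.empty();  // capture conditional: fail if a capture remains
}
```

With this model, "(a(b)c)" is accepted, while "(()" and ")(" are rejected.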
All captures made by a group are stored in sub_match::captures, which is a
vector of sub_match_capture objects.
A sub_match_capture behaves like a stripped-down sub_match. It can be written
to an ostream and has a length and a helper function for returning a string.
The changes are in the vault and can be found here:
http://tinyurl.com/3aak7mp
It can be unpacked against trunk from 2010-10-02 or the 1.44.0 release. I've
run the dynamic
regression tests without errors and I have added some tests for the new
functionality.
The code has only been tested on Visual Studio 2010 since I don't have access
to any other compiler.
Please give feedback on my changes since I would love to see them in an
official release.
Thanks in advance.
Regards,
Erik
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk