Boost Users :

From: Eric Niebler (eric_at_[hidden])
Date: 2006-06-30 16:30:42


Olenhouse, Jason wrote:
> I've read the boost regex docs and I don't see anything about named captures. I see information about
> captures using the number notation, $1, $2, etc. I'm assuming that the regex lib doesn't support
> named captures, correct?
>
> I looked at the spirit docs and didn't see any mention of it in the regex parser portions. I'm
> guessing that spirit doesn't support it either, correct? And by not supporting it I mean an
> out-of-the-box solution with the regex parser. I'm sure something more complicated than I could
> understand is possible with spirit. I can post to that mailing list if no one here knows.
>
> I'm trying to regex syslog messages for specific pieces of data. Depending on what systems send me
> syslog messages the order of the pieces of data may be different. I'd like a solution where I can ask
> for the specific piece of data explicitly, instead of having to remember that on system A dataItem1 is
> $1, but on system B it's $2. I will always have the same fields that I look for and that won't
> change. I'm trying to not complicate my database schema or make the clients' use any more
> difficult since they're the ones that write the regex statement. If named captures with the regex lib
> or with spirit aren't supported, any ideas on how this might be done would be appreciated. I'm stuck
> on boost 1.32, but have been looking for a good excuse to force our development dept. to a newer
> release.

When the next version of Boost comes out (it's in RC now), there will be
a new Boost library called 'xpressive' that can do that. You can think
of it as a hybrid between regex and Spirit. You can write your regex as
an xpression template:

     // define some custom mark_tags with meaningful names
     mark_tag day(1), month(2), year(3), delim(4);

     // this regex finds a date; the Perl equivalent would be:
     // (\d{1,2})([/-])(\d{1,2})\2((?:\d\d){1,2})
     sregex date = (month= repeat<1,2>(_d))           // find the month
                >> (delim= (set= '/','-'))            // followed by a delimiter
                >> (day=   repeat<1,2>(_d))           // and a day
                >> delim                              // and the delimiter again
                >> (year=  repeat<1,2>(_d >> _d));    // and the year.

     smatch what;

     if( regex_search( str, what, date ) )
     {
         std::cout << what[0] << '\n'; // whole match
         std::cout << what[day] << '\n'; // the day
         std::cout << what[month] << '\n'; // the month
         std::cout << what[year] << '\n'; // the year
         std::cout << what[delim] << '\n'; // the delimiter
     }

It doesn't matter what order the mark tags show up in your pattern, or
what integer values they are initialized with, so long as they are unique.

xpressive is in Boost CVS now, and also available as a separate download
from the boost sandbox (http://tinyurl.com/8fean). It works with Boost
1.32, so you won't need to upgrade. Docs are available here:
http://tinyurl.com/48kv5.

HTH,

-- 
Eric Niebler
Boost Consulting
www.boost-consulting.com
