From: John Maddock (john_maddock_at_[hidden])
Date: 2002-06-08 06:23:44


> I asked this back on May 3, 2002 but got no
> answer :( -
> http://lists.boost.org/MailArchives/boost/msg30181.php
>
> My question is simple - how do I use unicode
> characters in regex expressions? Do I just input them
> as is, or how?

Sorry, I missed that message somehow:

>So far I am confused about:
>1. Syntax of unicode characters in regular expression
> a. do I put the character as is, retaining regular
>syntax

Yes, wide character strings are handled using the same syntax as narrow
character strings.

> b. can I refer to it using a special syntax? \xfeff? something
> like that?

Yes, use \x{feff} (written "\\x{feff}" inside a C++ string literal); see the
regular expression syntax section of the docs.
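
A minimal sketch of both options, not part of the original reply. It assumes
std::wregex from the later C++ standard library (which was modelled on
Boost.Regex) so that it builds without Boost. In that library's default
ECMAScript grammar the escape is spelled \uFEFF; with boost::wregex the
\x{feff} form given above would be used instead. The function names are
hypothetical.

```cpp
#include <regex>
#include <string>

// Option (a): the character itself appears in the pattern.
// The C++ compiler turns \uFEFF into the single wide character U+FEFF,
// so the regex engine only ever sees an ordinary literal character.
bool match_literal(const std::wstring& text)
{
    const std::wregex re(L"x\uFEFFy");
    return std::regex_match(text, re);
}

// Option (b): the regex engine sees the escape and decodes it itself.
// The doubled backslash keeps the escape intact for the regex parser.
// std::wregex (ECMAScript grammar) spells it \uFEFF; with boost::wregex
// the equivalent pattern would be L"x\\x{feff}y".
bool match_escaped(const std::wstring& text)
{
    const std::wregex re(L"x\\uFEFFy");
    return std::regex_match(text, re);
}
```

Both functions should accept the wide string L"x\uFEFFy" and reject L"xy";
the only difference is whether the compiler or the regex parser decodes the
character.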

>2. cmatch .. when you have something like:
>cmatch what;
>str = "some string"
>regex express("(\\S\\S)([Ayw])(\\S)");
>if (regex_match(str, what, express))
> cout << what[0].first << endl;
>else
> cout << "No match!\n";
>
>Even after reading the documentation on the second
>argument to regex_match, I didn't quite understand it.

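It's basically an array of pairs of iterators denoting what matched: what[0]
is the whole match and what[n] is the n-th marked sub-expression, each with
.first and .second delimiting the matched range. See the match_results
documentation. For wide character matches you'll be using wcmatch or wsmatch
instead of cmatch.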
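
A rough sketch of reading those iterator pairs, using the pattern from the
question as a wide-character literal. As an assumption for portability it
uses the std:: names from the later C++ standard library; with Boost the
corresponding names are boost::wregex, boost::wcmatch and boost::regex_match.
The helper captured_parts is hypothetical.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the whole match followed by each marked
// sub-expression, or an empty vector if `text` does not match in full.
std::vector<std::wstring> captured_parts(const wchar_t* text)
{
    // Same pattern as in the question, written as a wide literal.
    static const std::wregex express(L"(\\S\\S)([Ayw])(\\S)");

    std::wcmatch what;  // match_results over const wchar_t*
    std::vector<std::wstring> parts;
    if (std::regex_match(text, what, express)) {
        for (std::size_t i = 0; i < what.size(); ++i) {
            // what[i].first and what[i].second are iterators (pointers
            // here) delimiting the matched range [first, second).
            parts.emplace_back(what[i].first, what[i].second);
        }
    }
    return parts;
}
```

For the input L"abAc" this should give four entries: the whole match
"abAc", then "ab", "A" and "c" for the three bracketed sub-expressions.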

John Maddock
http://ourworld.compuserve.com/homepages/john_maddock/index.htm


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk