From: John Maddock (john_maddock_at_[hidden])
Date: 2002-06-08 06:23:44

> I have asked this back in May 3, 2002 but got no
> answer :( -
> My question is simple - how do I use unicode
> characters in regex expressions? Do I just input them
> as is, or how?

Sorry, I missed that message somehow:

>So far I am confused about:
>1. Syntax of unicode characters in regular expression
> a. do I put the character as is, retaining regular

Yes, wide character strings are handled using the same syntax as narrow
character strings.
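
As a rough illustration of the point above, here is a minimal sketch in which a
wide pattern is written exactly as a narrow one would be, only with an L prefix
on the literal. It assumes the standard library's std::wregex (which is modeled
on Boost.Regex) so that it builds without Boost; with Boost the type would be
boost::wregex. The helper name matches_wide is made up for the example.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when the whole of `text` matches `pattern`.
// Both arguments are wide strings; the pattern syntax is the same as it
// would be for a narrow std::regex.
bool matches_wide(const std::wstring& text, const std::wstring& pattern)
{
    const std::wregex expression(pattern);
    return std::regex_match(text, expression);
}
```

For example, matches_wide(L"\u20ac42", L"\u20ac\\d+") places a non-ASCII
character (the euro sign) in the pattern as is, next to an ordinary \d+.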

> b. can I refer to it using a special syntax? \xfeff? something
> like that?

Yes, use \x{feff}; see the regular expression syntax section of the docs.
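
A small sketch of matching a character by its code instead of typing it
literally. Note the assumption: to stay buildable without Boost this uses
std::wregex with its default ECMAScript grammar, where the escape is spelled
\uFEFF (four hex digits, no braces). With Boost.Regex the pattern would instead
use the \x{feff} form given above. The function name starts_with_bom is
hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when `text` begins with U+FEFF (the byte order
// mark), named in the pattern by an escape instead of a literal character.
// Assumption: std::wregex, default ECMAScript grammar, escape written \uFEFF.
// With Boost.Regex the equivalent pattern would be L"^\\x{feff}".
bool starts_with_bom(const std::wstring& text)
{
    static const std::wregex bom(L"^\\uFEFF");
    return std::regex_search(text, bom);
}
```

The doubled backslash is needed only because the pattern sits inside a C++
string literal; the regex engine itself sees a single backslash.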

>2. cmatch .. when you have something like:
>cmatch what;
>str = "some string";
>regex express("(\\S\\S)([Ayw])(\\S)");
>if (regex_match(str, what, express))
> cout << what[0].first << endl;
>else
> cout << "No match!\n";
>Even after reading the documentation on the second
>argument to regex_match, I didn't quite understand it.

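It's basically an array of pairs of iterators denoting what matched: what[0]
is the whole match and what[n] is the nth marked sub-expression, each with
.first and .second pointing at the start and one past the end of the matched
text. See the match_results documentation. For wide character matches you'll
be using wcmatch (for const wchar_t*) or wsmatch (for std::wstring).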
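
To make the "array of pairs of iterators" concrete, here is a minimal sketch
that walks a wcmatch and copies out the text each pair delimits. It reuses the
expression from the question as a wide literal. As an assumption it is written
against the standard library (std::wregex, std::wcmatch) so it builds without
Boost; the Boost names are the same under the boost:: namespace. The function
name matched_parts is made up for the example.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns what[0] (the whole match) followed by each
// marked sub-expression as a string, or an empty vector when `text` does
// not match the expression.
std::vector<std::wstring> matched_parts(const wchar_t* text)
{
    // Same expression as in the question, written as a wide literal.
    static const std::wregex express(L"(\\S\\S)([Ayw])(\\S)");

    std::wcmatch what;  // match_results over const wchar_t* iterators
    std::vector<std::wstring> parts;
    if (std::regex_match(text, what, express)) {
        for (std::size_t i = 0; i < what.size(); ++i) {
            // Each entry is a pair of iterators; [first, second) is the text.
            parts.emplace_back(what[i].first, what[i].second);
        }
    }
    return parts;
}
```

With the input L"abyz" this yields four entries: the whole match, then "ab",
"y" and "z". The string from the question, "some string", contains a space and
is longer than four characters, so regex_match (which must match the entire
input) fails on it. Note also that what[0].first on its own is only the start
iterator, so streaming it prints from the start of the match to the end of the
string; use both .first and .second, or what[0].str(), to get just the match.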

John Maddock
