From: John Maddock (john_maddock_at_[hidden])
Date: 2002-06-08 06:23:44

Next message: David Abrahams: "[boost] New archive search"
Previous message: John Maddock: "Re: [boost] regex partial match bug"
In reply to: Tha Project: "[boost] Unicode"

> I have asked this back in May 3, 2002 but got no
> answer :( -
> http://lists.boost.org/MailArchives/boost/msg30181.php
>
> My question is simple - how do I use unicode
> characters in regex expressions? Do I just input them
> as is, or how?

Sorry I missed that message somehow:

>So far I am confused about:
>1. Syntax of unicode characters in regular expression
> a. do I put the character as is, retaining regular
>syntax

Yes, wide character strings are handled using the same syntax as narrow
character strings.
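
To make that concrete, here is a minimal sketch. It assumes std::wregex from the C++ standard library as a stand-in for boost::wregex (the interfaces are close enough for this case that swapping the std:: prefix for boost:: and including <boost/regex.hpp> should be all that changes). The helper name matches_wide and the added \u00e9 in the character set are made up for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: reports whether the whole of `text` matches the
// pattern from the question, with one non-ASCII character added to the set.
bool matches_wide(const std::wstring& text) {
    // The pattern syntax is the same as for a narrow regex; only the
    // character type (wchar_t, hence the L prefix) is different.
    static const std::wregex pattern(L"(\\S\\S)([Ayw\u00e9])(\\S)");
    return std::regex_match(text, pattern);
}
```

The wide character goes into the expression as is, either typed directly or written as a universal character name such as \u00e9 in the C++ literal.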

> b. can I refer to it using a special syntax? \xfeff? something
> like that?

Yes, use \x{feff}; see the regular expression syntax section of the docs.

>2. cmatch .. when you have something like:
>cmatch what;
>str = "some string"
>regex express("(\\S\\S)([Ayw])(\\S)");
>if (regex_match(str, what, express))
> cout << what[0].first << endl;
>else
> cout << "No match!\n";
>
>Even after reading the documentation on the second
>argument to regex_match, I didn't quite understand it.

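It's basically an array of pairs of iterators denoting what matched: what[0]
is the whole match, and what[1], what[2], ... are the marked sub-expressions
(see the match_results documentation). For wide character matches you'll be
using wcmatch or wsmatch instead of cmatch.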
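
As a rough sketch of what "array of pairs of iterators" means in practice, the following walks over the results of a wide character match. It again assumes the standard library types (std::wregex, std::wsmatch) as stand-ins for the Boost ones of the same name, and captured_groups is a hypothetical helper, not part of either library.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the text of each sub-match (index 0 is the
// whole match), or an empty vector when `text` does not match.
std::vector<std::wstring> captured_groups(const std::wstring& text) {
    static const std::wregex pattern(L"(\\S\\S)([Ayw])(\\S)");
    std::wsmatch what;  // match results over std::wstring iterators
    std::vector<std::wstring> groups;
    if (std::regex_match(text, what, pattern)) {
        for (std::size_t i = 0; i < what.size(); ++i) {
            // Each entry is a pair of iterators [first, second) into `text`.
            groups.emplace_back(what[i].first, what[i].second);
        }
    }
    return groups;
}
```

For the input L"abyz" this yields four entries: the whole match, then "ab", "y" and "z". Note that in the quoted code what[0].first is only the start iterator (a const char* for cmatch), so streaming it prints everything from the start of the match to the end of the string; what[0].str() gives just the matched text.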

John Maddock
http://ourworld.compuserve.com/homepages/john_maddock/index.htm
