From: John Maddock (john_at_[hidden])
Date: 2003-12-10 07:07:19


> Hello, I am trying to design an application which uses
> Regexp++ to match strings that are in different languages.
> I have started with some basic non-Unicode character
> sets, but am not able to properly match.
>
> This is on win32, but should I be using the setlocale() C
> interface if I want to support multiple languages during
> runtime?

By default on Win32, a single locale is in effect: the user's default
Win32 locale.

If you want to support multiple locales then you can either:

1) Use boost::reg_expression<char, boost::cpp_regex_traits<char> > rather
than boost::regex; you can then imbue your expressions with a C++
std::locale object:

typedef boost::reg_expression<char, boost::cpp_regex_traits<char> > regex_type;

regex_type r1;
regex_type r2;
r1.imbue(std::locale("en_GB"));
r1.assign("\\w+"); // UK English
r2.imbue(std::locale("fr_FR"));
r2.assign("\\w+"); // French

2) Define BOOST_REGEX_USE_CPP_LOCALE in boost/regex/user.hpp and rebuild
everything. cpp_regex_traits then becomes the default traits class, so:

boost::regex r1;
boost::regex r2;
r1.imbue(std::locale("en_GB"));
r1.assign("\\w+"); // UK English
r2.imbue(std::locale("fr_FR"));
r2.assign("\\w+"); // French

will work.
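
In both options the order matters: imbue first, then assign, so the
expression is compiled against the locale you want. A minimal sketch of
that pattern follows. As an assumption, it uses std::regex from the C++11
standard library in place of boost::reg_expression so that it compiles
without Boost, and the helper names make_regex and matches_word are
hypothetical, not part of either library.

```cpp
#include <locale>
#include <regex>
#include <string>

// Hypothetical helper: builds a regex bound to the given locale.
// The locale is imbued before the pattern is assigned, so the pattern
// is compiled with that locale's character classification.
std::regex make_regex(const std::string& pattern, const std::locale& loc)
{
    std::regex re;
    re.imbue(loc);      // set the locale first
    re.assign(pattern); // then compile the pattern under it
    return re;
}

// Hypothetical helper: true if the whole of `text` is one run of
// "word" characters according to the given locale.
bool matches_word(const std::string& text, const std::locale& loc)
{
    return std::regex_match(text, make_regex("\\w+", loc));
}
```

One expression object per locale, as with r1 and r2 above, keeps the
locales independent of each other and of the global locale.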

BTW all of the above assumes that you have a decent implementation of
std::locale that actually supports something other than the "C" locale!
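
One way to check this at runtime is to try constructing the named locale:
std::locale throws std::runtime_error when the name is not supported. The
sketch below relies only on that standard behaviour. The helper names
locale_available and locale_or_classic are hypothetical, and which locale
names are accepted depends entirely on your platform and standard library.

```cpp
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: true if the runtime can construct the named locale.
bool locale_available(const std::string& name)
{
    try {
        std::locale probe(name.c_str()); // throws if the name is unsupported
        (void)probe;
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}

// Hypothetical helper: the named locale if it is available,
// otherwise the classic "C" locale as a fallback.
std::locale locale_or_classic(const std::string& name)
{
    try {
        return std::locale(name.c_str());
    } catch (const std::runtime_error&) {
        return std::locale::classic();
    }
}
```

Passing the result of locale_or_classic("fr_FR") to imbue() avoids an
uncaught exception on systems where that locale is missing, at the cost
of silently matching with "C" locale rules instead.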

John.

