Boost Users :

Subject: Re: [Boost-users] something about UTF8
From: Rune Lund Olesen (rune.olesen_at_[hidden])
Date: 2008-12-18 05:27:21

In reply to: John Maddock: "Re: [Boost-users] something about UTF8"

Using wstrings with the regex library should work without problems,
except that you cannot rely on Unicode-specific character classes. Just make
sure you convert correctly between UTF-8 and wide-character strings.

Rune
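
The conversion step above can be sketched portably as follows. This is a
minimal illustration, not the poster's code: utf8_to_wide,
append_code_point and wide_regex_search are hypothetical helper names,
std::wregex stands in for boost::wregex (which is used in a similar way), and
on Windows the conversion itself would normally be done by
MultiByteToWideChar with CP_UTF8 instead of a hand-written decoder. The
pattern uses literal characters only, since Unicode-specific character
classes cannot be relied on.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <stdexcept>
#include <string>

// Append one code point to a wide string. Where wchar_t is 16 bits wide
// (as on Windows), code points above U+FFFF become a UTF-16 surrogate pair.
inline void append_code_point(std::wstring& out, char32_t cp) {
    assert(cp <= 0x10FFFF);  // callers must pass a valid code point
    if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
        cp -= 0x10000;
        out.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));
        out.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));
    } else {
        out.push_back(static_cast<wchar_t>(cp));
    }
}

// Hypothetical helper: decode UTF-8 into a wide string.
// Throws std::invalid_argument on malformed input.
inline std::wstring utf8_to_wide(const std::string& in) {
    // Smallest code point allowed for each sequence length (rejects overlongs).
    static const char32_t min_cp[] = {0x0, 0x80, 0x800, 0x10000};
    std::wstring out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80) { cp = lead; extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (extra > in.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min_cp[extra] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid UTF-8 code point");

        append_code_point(out, cp);
        i += extra + 1;
    }
    return out;
}

// Convert both the text and the pattern to wide strings, then search.
// std::wregex is used here as a stand-in for boost::wregex.
inline bool wide_regex_search(const std::string& utf8_text,
                              const std::string& utf8_pattern) {
    const std::wstring text = utf8_to_wide(utf8_text);
    const std::wregex re(utf8_to_wide(utf8_pattern));
    return std::regex_search(text, re);
}
```

For example, wide_regex_search("\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E",
"\xE6\x97\xA5\xE6\x9C\xAC") searches the UTF-8 bytes of 日本語 for the
literal 日本. Writing the bytes as escapes keeps the example independent of
the source file's encoding.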

On Thu, Dec 18, 2008 at 10:39 AM, John Maddock <john_at_[hidden]> wrote:

> wind world wrote:
>
>> Hi guys,
>> I want to use boost::regex on Windows XP to match Japanese kanji.
>> The kanji are encoded as UTF-8. After I use the function
>> MultiByteToWideChar to convert the UTF-8 kanji string to a wstring,
>> can I directly use boost::wregex (constructed from the wstring) to
>> match Japanese?
>>
> You would need to check the Windows API docs to make sure you're using the
> API correctly (does it work with UTF-8 as source? No idea on that), but
> yes, once you have the text encoded as UTF-16 then wregex will behave as you
> expect.
>
> Otherwise you could build regex with ICU support and then match UTF-8
> directly: the downside is that you then have a dependency on ICU, which is
> not a small library.
>
> HTH, John.

Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net