Subject: Re: [Boost-users] UTF16
From: Tan, Tom (Shanghai) (TTan_at_[hidden])
Date: 2009-07-17 05:40:28
I have some code that converts between UTF-16 and MBCS strings. It is
Windows-only:
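#include <string>
#include <windows.h>
#include <boost/shared_array.hpp>

using namespace std;
using boost::shared_array;

// Primary template: pass-through. It compiles only when FROM and TO are
// the same character type; real conversions are in the specializations.
template<typename FROM, typename TO>
struct convert
{
    basic_string<TO> operator()(const basic_string<FROM>& from)
    {
        return from;
    }
};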
template<>
struct convert<wchar_t, char>
{
    string operator()(const wstring& from)
    {
        return utf16_to_mbcs(from);
    }
private:
    string utf16_to_mbcs(const wstring& ws)
    {
        if (ws.empty()) return string();
        // Assumes at most two bytes per UTF-16 code unit in the target code page.
        const size_t BUFFER_SIZE = (ws.size() << 1) + 1;
        shared_array<char> p_mcb(new char[BUFFER_SIZE]);
        // Skip a leading byte-order mark (U+FEFF) if present.
        bool has_utf16le_bom = (0xFEFF == ws[0]);
        int count = ::WideCharToMultiByte(
            AreFileApisANSI() ? CP_THREAD_ACP : CP_OEMCP,
            WC_NO_BEST_FIT_CHARS,
            (has_utf16le_bom ? ws.substr(1) : ws).c_str(),
            static_cast<int>(has_utf16le_bom ? ws.size() - 1 : ws.size()),
            p_mcb.get(),
            static_cast<int>(BUFFER_SIZE),
            0,
            0);
        // A return value of 0 means the conversion failed.
        return (0 == count)
            ? string()
            : string(p_mcb.get(), count);
    }
};
template<>
struct convert<char, wchar_t>
{
    wstring operator()(const string& from)
    {
        return mbcs_to_utf16(from);
    }
private:
    wstring mbcs_to_utf16(const string& s)
    {
        if (s.empty()) return wstring();
        // Size is counted in wchar_t units, not bytes.
        const size_t BUFFER_SIZE = (s.size() << 1) + 1;
        shared_array<wchar_t> p_ws(new wchar_t[BUFFER_SIZE]);
        int count = ::MultiByteToWideChar(
            AreFileApisANSI() ? CP_THREAD_ACP : CP_OEMCP,
            MB_PRECOMPOSED,
            s.c_str(),
            static_cast<int>(s.size()),
            p_ws.get(),
            static_cast<int>(BUFFER_SIZE));
        // A return value of 0 means the conversion failed.
        return (0 == count)
            ? wstring()
            : wstring(p_ws.get(), count);
    }
};
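
The specializations above depend on the Win32 API. As a rough portable
counterpart, here is a minimal sketch that assumes the standard C functions
std::wcstombs and std::mbstowcs and whatever multibyte encoding the current
C locale uses. The names wide_to_narrow and narrow_to_wide are hypothetical
and not part of the code above. Note that wchar_t is not UTF-16 on every
platform, so this is an analogue of the idea rather than a drop-in
replacement.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: wide -> multibyte using the current C locale.
// Returns an empty string if a character cannot be converted.
std::string wide_to_narrow(const std::wstring& ws)
{
    if (ws.empty()) return std::string();
    // MB_CUR_MAX is the locale's maximum number of bytes per character.
    std::vector<char> buf(ws.size() * MB_CUR_MAX + 1);
    const std::size_t n = std::wcstombs(&buf[0], ws.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1)) return std::string();
    return std::string(&buf[0], n);
}

// Hypothetical helper: multibyte -> wide using the current C locale.
// Returns an empty string if the input is not valid in that locale.
std::wstring narrow_to_wide(const std::string& s)
{
    if (s.empty()) return std::wstring();
    // One wide character per input byte is the worst case.
    std::vector<wchar_t> buf(s.size() + 1);
    const std::size_t n = std::mbstowcs(&buf[0], s.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1)) return std::wstring();
    return std::wstring(&buf[0], n);
}
```

Both helpers stop at the first embedded null character, because the C
functions treat their input as null-terminated.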
>Date: Thu, 16 Jul 2009 10:39:31 +0200
>From: plarroy <plarroy_at_[hidden]>
>
>My approach is to use std::string, etc. all the time with UTF-8
>internally, converting to other charsets only when needed.
>I use IBM's ICU library and wrote a boost::iostreams filter to convert
>encodings. Once that is done, it takes a lot of complexity away. I use it like:
> // set up a conversion from charset to utf-8
> filt_streamb.push(ucnv_filter(charset.c_str(), "utf-8"));
> istream is(&filt_streamb);
>Perhaps there is interest in adding this charset conversion to the
>boost::iostreams filter examples.
>Regards.
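
To illustrate the "UTF-8 internally, convert at the boundary" idea quoted
above without ICU, here is a minimal hand-written sketch. It assumes a
C++11 compiler (for std::u16string), and the function name utf16_to_utf8 is
hypothetical; it is not the ucnv_filter mentioned above, nor part of Boost
or ICU. Unpaired surrogates are replaced with U+FFFD, which is one possible
policy among several.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: encode a sequence of UTF-16 code units as UTF-8.
std::string utf16_to_utf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        unsigned long cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD; // unpaired high surrogate
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD; // stray low surrogate
        }
        // Emit 1 to 4 bytes depending on the code point's magnitude.
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

A library such as ICU remains the better choice when other charsets,
normalization, or streaming conversion are needed; this sketch only covers
the UTF-16 to UTF-8 case.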
Robert Dailey wrote:
> Oh, I also forgot to mention, I am also using boost::filesystem::path. I
> guess this means I need to use wchar_t everywhere (std::wstring,
> boost::filesystem::wpath, etc) and just let wxWidgets do the
> encoding/decoding? If I don't have to do any encoding/decoding myself, then
> there really is no need for a special object. But just in case I would like
> to have the encoding/decoding abilities.
>
> On Sun, Jun 14, 2009 at 12:27 PM, Robert Dailey <rcdailey_at_[hidden]> wrote:
>
>
>> Hi everyone,
>> I did a bit of googling to see if Boost 1.39 has any portable support for
>> UTF-16 encoded strings, but I did not find any. I'm currently using
>> wxWidgets in my application, and I need a decent string object to use. I
>> know that wxWidgets has UTF-16 string support through wxString, however I do
>> not want to expose this object in my interfaces. I want to remain as
>> abstracted away from wxWidgets as possible. Having said that, if someone
>> could tell me if there is any existing UTF-16 string support in Boost, I'd
>> appreciate it. I did not find anything in the vault, sandbox, or trunk in
>> Boost.
>>
>> If boost has no such string object, could someone give me a head start on
>> where to look? Thanks.
>>