From: Reece Dunn (msclrhd_at_[hidden])
Date: 2005-09-19 02:36:20
Vladimir Prus wrote:
> Adam Badura wrote:
>> I looked at a few C++ GUI libraries, but none of them satisfied me,
>> commonly for these reasons:
>> 1) weak support, if any, for exceptions (wxWidgets for example)
>> 2) using their own classes instead of standard ones (most commonly for
>> strings) (wxWidgets and MFC for example)
> Well, given that std::basic_string's support for Unicode is lacking, I would
> consider using a custom string class an advantage of those libraries. Hey,
> you can't even convert std::string to std::wstring, and you can't construct
> std::wstring from char*. Can you convert std::wstring to any of Unicode's
> normalization forms? How portable is reading of std::wstring from a file
> with specific 8-bit encoding?
WRT the MFC/ATL/WTL libraries, they use the Win32 API calls
WideCharToMultiByte (WC2MB) and MultiByteToWideChar (MB2WC) to do the
conversion using the current thread's codepage. Likewise, their
CA2W/CW2A helper classes do a similar thing.
The WC2MB/MB2WC APIs allow you to pass in a specific codepage (not just
the current thread's or user's). Some of these include:
UTF7 = 65000
UTF8 = 65001
UTF16 (Little Endian) = 1200
UTF16 (Big Endian) = 1201
So, you could say something like:
std::cout << unicode_cast< std::string >( russian_text, unicode::utf8 );
where, on Windows, unicode_cast uses WC2MB/MB2WC and unicode::utf8 is
the UTF8 codepage (65001).
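As a rough sketch of what such a unicode_cast might look like (the name, the unicode namespace and its constants are hypothetical, taken from the example above rather than from any existing library): on Windows it calls WideCharToMultiByte twice, once to size the buffer and once to fill it. The non-Windows branch is purely my assumption so that the sketch compiles elsewhere; it handles UTF-8 only and assumes wchar_t holds one whole code point.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

namespace unicode {
    // Hypothetical constants; the values are the Win32 code page identifiers.
    const unsigned int utf7 = 65000;
    const unsigned int utf8 = 65001;
}

// Fallback helper (an assumption, not from the post): append one code point as UTF-8.
inline void append_utf8(std::string& out, unsigned long cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

template <typename To>
To unicode_cast(const std::wstring& text, unsigned int codepage);

// Wide string to narrow string in the requested code page.
template <>
inline std::string unicode_cast<std::string>(const std::wstring& text,
                                             unsigned int codepage) {
    if (text.empty()) return std::string();
#ifdef _WIN32
    const int size = static_cast<int>(text.size());
    // First call: ask how many bytes the converted text needs.
    const int len = WideCharToMultiByte(codepage, 0, text.data(), size,
                                        NULL, 0, NULL, NULL);
    if (len <= 0) throw std::runtime_error("WideCharToMultiByte failed");
    std::string out(static_cast<std::size_t>(len), '\0');
    // Second call: perform the conversion into the sized buffer.
    WideCharToMultiByte(codepage, 0, text.data(), size, &out[0], len, NULL, NULL);
    return out;
#else
    // Assumed fallback: UTF-8 only, wchar_t treated as a full code point.
    if (codepage != unicode::utf8)
        throw std::runtime_error("fallback handles UTF-8 only");
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i)
        append_utf8(out, static_cast<unsigned long>(text[i]));
    return out;
#endif
}
```

With this in place, the std::cout line above would compile as written, given a std::wstring named russian_text.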
Going the other way, reading std::wstring from a file... you can detect
UTF8/16/32 (LE and BE) by having a Byte Order Mark (BOM) at the start of
the file (defined at www.unicode.org) -- this is what is done in
Windows. Then you can say:
0xFE 0xFF -- unicode::utf16be;
0xFF 0xFE -- unicode::utf16le;
0xEF 0xBB 0xBF -- unicode::utf8;
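The BOM check itself is small. Here is a minimal sketch covering only the three marks listed above; the unicode::format enumeration and the detect_bom name are hypothetical, chosen to match the names used in this post.

```cpp
#include <cstddef>
#include <string>

namespace unicode {
    // Hypothetical enumeration mirroring the names used in the post.
    enum format { unknown, utf8, utf16le, utf16be };
}

// Inspect the leading bytes of a buffer and report which BOM, if any, is present.
// bom_size receives the number of bytes the mark occupies (0 when none is found).
inline unicode::format detect_bom(const std::string& bytes, std::size_t& bom_size) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(bytes.data());
    const std::size_t n = bytes.size();
    if (n >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF) {
        bom_size = 3;
        return unicode::utf8;
    }
    if (n >= 2 && p[0] == 0xFE && p[1] == 0xFF) {
        bom_size = 2;
        return unicode::utf16be;
    }
    if (n >= 2 && p[0] == 0xFF && p[1] == 0xFE) {
        bom_size = 2;
        return unicode::utf16le;
    }
    bom_size = 0;
    return unicode::unknown;
}
```

Note that the UTF-32 little-endian mark (0xFF 0xFE 0x00 0x00) begins with the same two bytes as the UTF-16 little-endian one, so a version that also handled UTF-32 would need to test for the four-byte marks first.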
Then you could have something like:
std::istream & operator>>( std::istream & is, std::wstring & str ) {
std::string s;
is >> s; // read in a string in its native (raw) form
str = unicode_cast< std::wstring >( s, is.unicode_format());
return is;
}
where *stream::unicode_format() returns the identified unicode form, or
some implementation-specific default value.
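Since std::istream has no unicode_format() member today, the same idea can be tried out with a free function. The following is only a sketch under my own assumptions: read_utf16 and append_code_point are hypothetical names, only the two UTF-16 marks are handled, and no validation of unpaired surrogates is attempted.

```cpp
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: append one code point to a wstring, splitting it into a
// surrogate pair when wchar_t is 16 bits wide (as on Windows).
inline void append_code_point(std::wstring& out, unsigned long cp) {
    if (sizeof(wchar_t) == 2 && cp >= 0x10000) {
        cp -= 0x10000;
        out += static_cast<wchar_t>(0xD800 + (cp >> 10));
        out += static_cast<wchar_t>(0xDC00 + (cp & 0x3FF));
    } else {
        out += static_cast<wchar_t>(cp);
    }
}

// Read the rest of the stream, use a UTF-16 BOM to pick the byte order,
// and decode the payload into a std::wstring.
inline std::wstring read_utf16(std::istream& is) {
    const std::string raw((std::istreambuf_iterator<char>(is)),
                          std::istreambuf_iterator<char>());
    const unsigned char* p = reinterpret_cast<const unsigned char*>(raw.data());
    const std::size_t n = raw.size();

    bool big_endian;
    if (n >= 2 && p[0] == 0xFE && p[1] == 0xFF) big_endian = true;
    else if (n >= 2 && p[0] == 0xFF && p[1] == 0xFE) big_endian = false;
    else throw std::runtime_error("no UTF-16 byte order mark found");

    std::wstring out;
    unsigned long high = 0; // pending high surrogate, 0 when none
    for (std::size_t i = 2; i + 1 < n; i += 2) {
        unsigned long unit = big_endian ? ((p[i] << 8) | p[i + 1])
                                        : ((p[i + 1] << 8) | p[i]);
        if (unit >= 0xD800 && unit <= 0xDBFF) { // high surrogate: wait for its pair
            high = unit;
            continue;
        }
        if (unit >= 0xDC00 && unit <= 0xDFFF && high != 0) {
            unit = 0x10000 + ((high - 0xD800) << 10) + (unit - 0xDC00);
        }
        high = 0;
        append_code_point(out, unit);
    }
    return out;
}
```

It can be exercised without a file by wrapping the raw bytes in a std::istringstream; for file input the stream should be opened in binary mode so the bytes arrive untranslated.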
I am not sure about how this would work for Linux, Mac and other
operating systems, though.