Subject: Re: [Boost-users] UTF-16
From: Zachary Turner (divisortheory_at_[hidden])
Date: 2009-06-14 14:08:49


On Sun, Jun 14, 2009 at 12:32 PM, Robert Dailey<rcdailey_at_[hidden]> wrote:
> Oh, I also forgot to mention, I am also using boost::filesystem::path. I
> guess this means I need to use wchar_t everywhere (std::wstring,
> boost::filesystem::wpath, etc) and just let wxWidgets do the
> encoding/decoding? If I don't have to do any encoding/decoding myself, then
> there really is no need for a special object. But just in case I would like
> to have the encoding/decoding abilities.
>
> On Sun, Jun 14, 2009 at 12:27 PM, Robert Dailey <rcdailey_at_[hidden]> wrote:
>>
>> Hi everyone,
>> I did a bit of googling to see if Boost 1.39 has any portable support for
>> UTF-16 encoded strings, but I did not find any. I'm currently using
>> wxWidgets in my application, and I need a decent string object to use. I
>> know that wxWidgets has UTF-16 string support through wxString, however I do
>> not want to expose this object in my interfaces. I want to remain as
>> abstracted away from wxWidgets as possible. Having said that, if someone
>> could tell me if there is any existing UTF-16 string support in Boost, I'd
>> appreciate it. I did not find anything in the vault, sandbox, or trunk in
>> Boost.
>> If boost has no such string object, could someone give me a head start on
>> where to look? Thanks.
>

An application I currently work on suffers from the same problem. If
(like us) you are just trying to provide basic internationalization
across Windows and Linux and want it to "just work" and be simple,
then I would suggest typedefing something like

typedef std::wstring utf_string;
typedef boost::filesystem::wpath utf_path;
typedef wchar_t utf_char;

etc. on Windows, and

typedef std::string utf_string;
typedef boost::filesystem::path utf_path;
typedef char utf_char;

on Linux. Then use a simple UTF-8 <-> UTF-16 conversion whenever you
need to persist or retrieve something, so that it's stored in a
common format. We're getting many strange locale-related problems
when we try to use UTF-16 in wpaths on Linux, and if it's not too
much effort, it will be simpler to have your program always store
strings and paths in the native format that the OS expects.
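
To make the "simple UTF-8 <-> UTF-16 conversion" concrete, here is a
minimal sketch of what such a pair of helpers could look like. The names
utf8_to_utf16 and utf16_to_utf8 are hypothetical (they are not part of
Boost or wxWidgets), and the sketch assumes a compiler that provides
std::u16string, which is newer than Boost 1.39. The path typedef is left
out so that only the standard library is needed. Treat it as an
illustration, not a hardened implementation: it does not reject overlong
UTF-8 encodings, for example.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Platform-native aliases, mirroring the typedefs suggested above.
#ifdef _WIN32
typedef std::wstring utf_string;  // UTF-16 code units on Windows
typedef wchar_t      utf_char;
#else
typedef std::string  utf_string;  // UTF-8 bytes on Linux
typedef char         utf_char;
#endif

// Hypothetical helper: decode UTF-8 bytes and re-encode them as UTF-16.
// Throws std::runtime_error on malformed input (overlong forms not checked).
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char b0 = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t len = 0;
        if (b0 < 0x80)                { cp = b0;        len = 1; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; len = 2; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; len = 3; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; len = 4; }
        else throw std::runtime_error("invalid UTF-8 lead byte");
        if (i + len > in.size())
            throw std::runtime_error("truncated UTF-8 sequence");
        for (std::size_t k = 1; k < len; ++k) {
            unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (b & 0x3F);
        }
        i += len;
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::runtime_error("invalid code point");
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;  // split into a surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}

// Hypothetical helper: decode UTF-16 code units and re-encode as UTF-8.
// Throws std::runtime_error on unpaired surrogates.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {  // high surrogate: needs a low one
            if (i + 1 >= in.size())
                throw std::runtime_error("truncated surrogate pair");
            std::uint32_t lo = in[i + 1];
            if (lo < 0xDC00 || lo > 0xDFFF)
                throw std::runtime_error("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
            ++i;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::runtime_error("unpaired low surrogate");
        }
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

The idea would be to keep utf_string in the native encoding everywhere
inside the program and call one of these helpers only at the point where
data is written to or read from the common on-disk format.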

