Subject: Re: [boost] Boost.Locale (was Re: [SQL-Connectivity] Is Boost interested in CppDB?)
From: Dean Michael Berris (mikhailberis_at_[hidden])
Date: 2010-12-14 10:24:11


On Tue, Dec 14, 2010 at 11:08 PM, Eric Niebler <eric_at_[hidden]> wrote:
> On 12/14/2010 9:53 AM, Dean Michael Berris wrote:
>> +1 -- if there was a library that did easy conversion from
>> std::wstring (usually the default in Windows now) to proper UTF-8
>> encoded std::string in Boost that would be *awesome*. I can totally
>> use that library in cpp-netlib too. ;)
>
> Please, no. std::string is not an appropriate holder for a UTF-8 string.
> It encourages random-access mutation of any byte in a UTF-8 sequence,
> pretty much guaranteeing data corruption.
>

Yes, in a perfect world we would have a sane string implementation
that knew how to handle UTF-8 natively. I guess C++0x should fix that,
but until then, there are libraries that deal with UTF-8 through a `char
const *` and consequently through std::strings that contain UTF-8
encoded data. Until all those libraries get rewritten to use a better
string implementation, a converter from std::wstring to a UTF-8 encoded
std::string (or something else convertible to an std::string) would be
really useful -- even though corruptive string operations on UTF-8
std::strings remain likely.
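
To make the request concrete, here is a rough sketch of what such a
converter might look like. It is not an existing Boost API: the names
`to_utf8`, `append_utf8` and `code_unit` are made up for illustration.
It assumes the std::wstring holds UTF-16 where wchar_t is 16 bits wide
(as on Windows) and UTF-32 otherwise, and it throws on unpaired
surrogates instead of trying to repair them.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: append one Unicode code point as UTF-8 bytes.
inline void append_utf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: read a wchar_t as an unsigned value, masking to
// 16 bits where wchar_t is that narrow so a signed wchar_t cannot sign-extend.
inline std::uint32_t code_unit(wchar_t c) {
    std::uint32_t u = static_cast<std::uint32_t>(c);
    return sizeof(wchar_t) == 2 ? (u & 0xFFFFu) : u;
}

// Hypothetical converter: std::wstring -> UTF-8 encoded std::string.
// Surrogate pairs are combined wherever they appear; unpaired surrogates
// and values above U+10FFFF are rejected.
inline std::string to_utf8(const std::wstring& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = code_unit(in[i]);
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: a low surrogate must follow.
            if (i + 1 >= in.size())
                throw std::invalid_argument("truncated surrogate pair");
            std::uint32_t lo = code_unit(in[i + 1]);
            if (lo < 0xDC00 || lo > 0xDFFF)
                throw std::invalid_argument("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
            ++i;  // the low surrogate has been consumed
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::invalid_argument("unpaired low surrogate");
        } else if (cp > 0x10FFFF) {
            throw std::invalid_argument("value outside the Unicode range");
        }
        append_utf8(out, cp);
    }
    return out;
}
```

Something like `std::string s = to_utf8(some_wide_string);` is all the
calling code would need. The result is still a plain std::string, so
nothing here stops byte-level mutation from corrupting it afterwards.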

I would really not like to write a library like this myself, which is
why I'm really hoping there's something out there that's easy to use
and provides STL-like container interfaces to UTF-8 encoded strings.
Whether that string is an std::string doesn't really matter much, as
long as I can use it now in almost the same way I'd use an
std::string. ;)

Are you working on something like that, Eric? :D

-- 
Dean Michael Berris
deanberris.com
