Subject: Re: [boost] Boost.Locale (was Re: [SQL-Connectivity] Is Boost interested in CppDB?)
From: Joel de Guzman (joel_at_[hidden])
Date: 2010-12-14 18:28:14


On 12/15/2010 7:13 AM, Mathias Gaunard wrote:
> On 14/12/2010 22:05, Edward Diener wrote:
>> On 12/14/2010 2:27 PM, Mathias Gaunard wrote:
>>> On 14/12/2010 16:08, Eric Niebler wrote:
>>>> On 12/14/2010 9:53 AM, Dean Michael Berris wrote:
>>>>> +1 -- if there was a library that did easy conversion from
>>>>> std::wstring (usually the default in Windows now) to proper UTF-8
>>>>> encoded std::string in Boost that would be *awesome*. I can totally
>>>>> use that library in cpp-netlib too. ;)
>>>>
>>>> Please, no. std::string is not an appropriate holder for a UTF-8 string.
>>>> It encourages random-access mutation of any byte in a UTF-8 sequence,
>>>> pretty much guaranteeing data corruption.
>>>>
>>>
>>> It is, however, an appropriate holder for the *data* of a UTF-8 string.
>>
>> Does not C++0x define character types for Unicode characters ? Would not
>> a basic_string<utf8> ( or whatever it is called ) character type be a
>> better choice than basic_string<char> if that is the case, if the former
>> existed ?
>
> While C++0x introduces char16_t and char32_t, meant for UTF-16 and UTF-32 respectively,
> there is no special character type dedicated to UTF-8.

UTF-8 is a variable-length encoding (and so is UTF-16). basic_string,
and therefore string, is unsuitable for any variable-length encoded
data, as Eric pointed out.
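
To make the concern concrete, here is a minimal sketch of how byte-indexed
mutation can break a UTF-8 sequence held in a std::string. The helper names
(count_code_points, is_structurally_valid_utf8, make_sample,
corrupt_by_index) are hypothetical and used only for illustration, and the
validity check is deliberately simplified: it looks only at lead and
continuation byte structure and does not reject overlong forms or surrogates.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Count code points by skipping UTF-8 continuation bytes (10xxxxxx).
// Assumes the input is well-formed UTF-8.
std::size_t count_code_points(const std::string& utf8) {
    std::size_t n = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) ++n;
    }
    return n;
}

// Simplified structural check: every lead byte must be followed by the
// right number of continuation bytes. Not a full validator.
bool is_structurally_valid_utf8(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;
        if (b < 0x80) extra = 0;                     // ASCII
        else if ((b & 0xE0) == 0xC0) extra = 1;      // 2-byte sequence
        else if ((b & 0xF0) == 0xE0) extra = 2;      // 3-byte sequence
        else if ((b & 0xF8) == 0xF0) extra = 3;      // 4-byte sequence
        else return false;                           // stray continuation or invalid lead
        if (extra > s.size() - i - 1) return false;  // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;
        }
        i += extra + 1;
    }
    return true;
}

// "café" in UTF-8: 'é' (U+00E9) is the two bytes 0xC3 0xA9.
std::string make_sample() { return "caf\xC3\xA9"; }

// Overwrite "the fourth character" by byte index. This replaces only the
// lead byte of 'é' and leaves its continuation byte orphaned.
std::string corrupt_by_index(std::string s) {
    s[3] = 'e';
    return s;
}
```

In this sketch make_sample() holds 5 bytes but 4 code points, so size() and
operator[] describe bytes, not characters. After corrupt_by_index() the
string contains a lone 0xA9 byte and no longer passes the structural check.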

Regards,

-- 
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net
