Subject: Re: [boost] [General] Always treat std::strings as UTF-8
From: Chad Nelson (chad.thecomfychair_at_[hidden])
Date: 2011-01-19 08:07:21

On Wed, 19 Jan 2011 10:26:05 +0100
Matus Chochlik <chochlik_at_[hidden]> wrote:

>>> That instead of the currently used 2 string classes you'll end up
>>> with N string classes. That thought is not very appealing to me.
>> I don't think that's a fair statement.  The above only has 4 and
>> that's including EBCDIC.
> But those four are not the only widespread encoding schemes, what
> about KOI8, CPXYZ, etc.

There wouldn't be any need for special string types for them. They
would be represented by native_t if the system is set to use them, and
std::string types would just be assumed to be coded in that form.
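
To make that concrete, here is a rough sketch of the idea, not code from any
existing library. Only the names utf8_t and native_t come from this thread;
tagged_string, assume_native and utf8_byte_length are hypothetical names made
up for illustration, and the sketch does no validation or conversion at all.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical tag types: one per encoding the type system distinguishes.
struct utf8_tag {};
struct native_tag {};  // whatever the system is set to use (KOI8, a code page, ...)

// A thin wrapper that records the encoding in the type, not in the bytes.
template <typename Tag>
class tagged_string {
public:
    // explicit: a bare std::string never silently becomes a tagged string.
    explicit tagged_string(std::string bytes) : bytes_(std::move(bytes)) {}
    const std::string& raw() const { return bytes_; }
private:
    std::string bytes_;
};

using utf8_t = tagged_string<utf8_tag>;
using native_t = tagged_string<native_tag>;

// Assumption from the discussion: a plain std::string is taken to hold
// text in the system's native encoding, so no KOI8-specific type is needed.
inline native_t assume_native(const std::string& s) { return native_t(s); }

// Example of an interface that accepts only UTF-8 tagged data.
inline std::size_t utf8_byte_length(const utf8_t& s) { return s.raw().size(); }

// The two encodings cannot be mixed up by accident.
static_assert(!std::is_convertible<native_t, utf8_t>::value,
              "native text must be converted explicitly");
static_assert(!std::is_convertible<std::string, utf8_t>::value,
              "untagged strings must be tagged explicitly");
```

On a system configured for KOI8, assume_native would simply wrap the KOI8
bytes. A real design would also need an explicit native-to-UTF-8 conversion,
which is left out here.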

>> In any case, one could start with just utf8_string and ansi_string
>> (should be simple), put it into boost and see how many people use
>> it.  If it's truly an improvement, usage of std::string would
>> atrophy to the point of being irrelevant.  If there are still
>> reasons for using std::string directly, then it wouldn't, but no
>> harm would be done. This has all the upside and none of the downside.
>> If this were made,
> One of the downsides is that C++ would be abandoning a nice name
> 'string' to ugly 'utf8_t' or whatever.

Believe it or not, you'd get used to it. :-) I thought wchar_t was the
height of ugliness when I first saw it, but it seems perfectly
acceptable now, even attractively descriptive.

Chad Nelson
Oak Circle Software, Inc.
