Subject: Re: [boost] [General] Always treat std::strings as UTF-8
From: Robert Ramey (ramey_at_[hidden])
Date: 2011-01-18 14:53:14


>> struct utf8string : public std::string {
>> struct iterator {
>> const char * operator++(); // move to next code point,
>> utf8char operator*(); // return next utf8 char etc.
>> // ...
>> };
>> // maybe some other stuff - e.g. trap non-sensical operations
>> };
>>
>> and while you're at it
>>
>> struct ascii_string : public std::string {
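>> std::locale m_l;
>> ascii_string & operator+=(char c) {
>> assert(static_cast<unsigned char>(c) < 128); // plain char may be signed
>> std::string::operator+=(c);
>> return *this;
>> }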
>> // etc.
>> };
>>
>> struct jis_string : public std::string {
>> // etc.
>> };
>>
>> and while you're at it, if you've got nothing else to do
>>
>> struct ebcdic_string : public std::string {
>> ebcdic_string & operator+=(char c) {
>> assert(c < 128);
>> std::string::operator+=(c);
>> return *this;
>> }
>> // etc.
>> };
>>
>> Just a thought.
>
> That instead of the currently used 2 string classes
> you'll end up with N string classes. That thought
> is not very appealing to me.

I don't think that's a fair statement. The above only has 4
and that's including EBCDIC.

Sorry, I don't get the "2" above.

In any case, one could start with just utf8_string and
ansi_string (should be simple), put them into Boost and see how
many people use them. If they're truly an improvement, usage of
std::string would atrophy to the point of being irrelevant. If
there are still reasons for using std::string directly, then it
wouldn't, but no harm would be done. This has all the upside
and none of the downside.
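
To make the idea a bit more concrete, here is a rough sketch of what a
minimal utf8_string and a checked ASCII string might look like. Everything
in it is hypothetical: the class names, the str()/length() members and the
iterator interface are my assumptions for illustration, not an existing
Boost or standard API. It also wraps std::string by composition rather than
the public inheritance shown in the quoted code, because std::string has no
virtual destructor, and it assumes the bytes are already valid UTF-8 (no
validation is done).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical sketch only: a thin wrapper that treats its bytes as UTF-8.
class utf8_string {
public:
    explicit utf8_string(const std::string& s) : s_(s) {}

    // Forward iterator that steps one code point at a time.
    // Assumes well-formed UTF-8; malformed input is not detected.
    class iterator {
    public:
        explicit iterator(const char* p) : p_(p) {}

        // Decode the code point that starts at the current position.
        char32_t operator*() const {
            const unsigned char* s = reinterpret_cast<const unsigned char*>(p_);
            if (s[0] < 0x80) return s[0];                       // 1 byte
            if (s[0] < 0xE0)                                    // 2 bytes
                return ((s[0] & 0x1F) << 6) | (s[1] & 0x3F);
            if (s[0] < 0xF0)                                    // 3 bytes
                return ((s[0] & 0x0F) << 12) | ((s[1] & 0x3F) << 6) |
                       (s[2] & 0x3F);
            return ((s[0] & 0x07) << 18) | ((s[1] & 0x3F) << 12) |  // 4 bytes
                   ((s[2] & 0x3F) << 6) | (s[3] & 0x3F);
        }

        // Skip the lead byte, then any continuation bytes (10xxxxxx).
        iterator& operator++() {
            ++p_;
            while ((static_cast<unsigned char>(*p_) & 0xC0) == 0x80) ++p_;
            return *this;
        }

        bool operator==(const iterator& o) const { return p_ == o.p_; }
        bool operator!=(const iterator& o) const { return p_ != o.p_; }

    private:
        const char* p_;
    };

    // c_str() is null-terminated, so operator++ stops safely at end().
    iterator begin() const { return iterator(s_.c_str()); }
    iterator end() const { return iterator(s_.c_str() + s_.size()); }

    // Number of code points, as opposed to str().size() which counts bytes.
    std::size_t length() const {
        std::size_t n = 0;
        for (iterator it = begin(); it != end(); ++it) ++n;
        return n;
    }

    // Access to the underlying bytes for code that wants std::string.
    const std::string& str() const { return s_; }

private:
    std::string s_;
};

// Hypothetical sketch only: a string that rejects non-ASCII characters.
class ascii_string {
public:
    ascii_string& operator+=(char c) {
        // Compare as unsigned: with a signed plain char, c < 128 is always true.
        assert(static_cast<unsigned char>(c) < 128);
        s_ += c;
        return *this;
    }

    const std::string& str() const { return s_; }

private:
    std::string s_;
};
```

With something like this, iterating a utf8_string yields code points while
str() still hands the raw bytes to any interface that expects std::string,
so existing code keeps working.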

If this were made,

