Subject: Re: [boost] [General] Always treat std::strings as UTF-8
From: Matus Chochlik (chochlik_at_[hidden])
Date: 2011-01-18 13:27:08


On Tue, Jan 18, 2011 at 6:46 PM, Peter Dimov <pdimov_at_[hidden]> wrote:
> Dave Abrahams wrote:
>>
>> At Tue, 18 Jan 2011 13:27:29 +0200,
>> Peter Dimov wrote:
>>
>> But they won't be.  That's not today's reality.
>
> They should be, though. As a practical matter, the difference between
> taking/returning a string and taking/returning an utf8_t is to force people
> to write an explicit conversion. This penalizes people who are already in
> UTF-8 land because it forces them to use utf8_t( s, encoding_utf8 ) and
> s.c_str( encoding_utf8 ) everywhere, without any gain or need. It's true
> that for people whose strings are not UTF-8, forcing those explicit
> conversions may be considered a good thing. So it depends on what your goals
> are. Do you want to promote the use of UTF-8 for all strings, or do you want
> to enable people to remain in non-UTF-8-land?
>
+1

Boost, as a cutting-edge C++ library, should try to enforce new standards
and not dwell on old and obsolete ones. Today everybody is moving (maybe
slowly) towards UTF-8, and creating a new string class/wrapper for UTF-8 that
nobody uses would, IMO, only encourage continued use of the old ANSI encodings.

Maybe a better course of action would be to create ansi_str_t with encoding
tags for the legacy ANSI-encoded strings, which could be deprecated in the
future, and to use std::string as the default class for UTF-8 strings.
We will have to make this transition at some point anyway, so why not do it now?
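
As a rough sketch of that idea, under stated assumptions: everything below is
hypothetical and not an existing Boost interface. Only the name ansi_str_t
comes from the proposal above; latin1_tag, to_utf8 and utf8_length are made up
for illustration. ISO-8859-1 is used as the legacy encoding only because its
conversion to UTF-8 is plain arithmetic; a real Windows code page would need a
lookup table.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical encoding tag for a legacy single-byte encoding.
struct latin1_tag {};

// Hypothetical legacy wrapper: the encoding is part of the type, so a
// legacy string cannot be passed by accident where UTF-8 is expected.
template <class EncodingTag>
class ansi_str_t {
public:
    explicit ansi_str_t(std::string bytes) : bytes_(std::move(bytes)) {}
    const std::string& bytes() const { return bytes_; }
private:
    std::string bytes_;
};

// The explicit conversion is paid only by legacy code.
// ISO-8859-1 bytes map directly to code points U+0000..U+00FF.
inline std::string to_utf8(const ansi_str_t<latin1_tag>& s) {
    std::string out;
    for (unsigned char c : s.bytes()) {
        if (c < 0x80) {
            out += static_cast<char>(c);                // ASCII: one byte
        } else {
            out += static_cast<char>(0xC0 | (c >> 6));  // lead byte
            out += static_cast<char>(0x80 | (c & 0x3F)); // continuation byte
        }
    }
    return out;
}

// An interface in "UTF-8 land" takes std::string directly, no wrapper.
// Counts code points by skipping UTF-8 continuation bytes (10xxxxxx).
inline std::size_t utf8_length(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}
```

With this split, code that is already UTF-8 calls utf8_length(s) on a plain
std::string, while legacy code has to write
utf8_length(to_utf8(ansi_str_t<latin1_tag>(s))) and so carries the conversion
cost itself.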

my 0.02€

regards

Matus

