Subject: Re: [boost] [General] Always treat std::strings as UTF-8
From: Matus Chochlik (chochlik_at_[hidden])
Date: 2011-01-18 13:27:08
On Tue, Jan 18, 2011 at 6:46 PM, Peter Dimov <pdimov_at_[hidden]> wrote:
> Dave Abrahams wrote:
>> At Tue, 18 Jan 2011 13:27:29 +0200,
>> Peter Dimov wrote:
>> But they won't be. That's not today's reality.
> They should be, though. As a practical matter, the difference between
> taking/returning a string and taking/returning an utf8_t is to force people
> to write an explicit conversion. This penalizes people who are already in
> UTF-8 land because it forces them to use utf8_t( s, encoding_utf8 ) and
> s.c_str( encoding_utf8 ) everywhere, without any gain or need. It's true
> that for people whose strings are not UTF-8, forcing those explicit
> conversions may be considered a good thing. So it depends on what your goals
> are. Do you want to promote the use of UTF-8 for all strings, or do you want
> to enable people to remain in non-UTF-8-land?
Boost, as the cutting-edge C++ library, should try to enforce new standards
rather than dwell on old and obsolete ones. Today everybody is (maybe slowly)
moving towards UTF-8, and creating a new string class/wrapper for UTF-8 that
nobody uses would, IMO, only encourage continued use of the old ANSI encodings.
Maybe a better course of action would be to create ansi_str_t with the encoding
tags for the legacy ANSI-encoded strings, which could be obsoleted
in the future, and use std::string as the default class for UTF-8 strings.
We will have to make this transition at some point anyway, so why not do it now?
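
To make the idea concrete, here is a rough sketch of what such a wrapper might
look like. Everything in it is an assumption for illustration only: the class
template ansi_str_t, the tag encoding_latin1 (ISO-8859-1 standing in for one
legacy "ANSI" code page), and the helpers to_utf8 and count_code_points are
hypothetical names, not an existing Boost interface. C++11 is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical encoding tag. ISO-8859-1 stands in for a legacy "ANSI" code page.
struct encoding_latin1 {};

// Hypothetical wrapper for legacy strings: raw bytes plus a compile-time tag.
// Only legacy data has to be wrapped; UTF-8 data stays in plain std::string.
template <class Encoding>
class ansi_str_t {
public:
    explicit ansi_str_t(std::string bytes) : bytes_(std::move(bytes)) {}
    const std::string& bytes() const { return bytes_; }
private:
    std::string bytes_;
};

// Explicit conversion at the boundary. Every Latin-1 byte equals the code point
// U+0000..U+00FF, so bytes >= 0x80 become a two-byte UTF-8 sequence.
inline std::string to_utf8(const ansi_str_t<encoding_latin1>& s) {
    std::string out;
    for (unsigned char c : s.bytes()) {
        if (c < 0x80) {
            out += static_cast<char>(c);
        } else {
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}

// An interface that takes std::string and simply assumes it holds UTF-8.
// It counts every byte that is not a UTF-8 continuation byte (10xxxxxx).
inline std::size_t count_code_points(const std::string& utf8) {
    std::size_t n = 0;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}

// Usage sketch: legacy data is converted once, then treated like any other string.
inline void usage_example() {
    ansi_str_t<encoding_latin1> legacy("caf\xE9");  // "café" in Latin-1, 4 bytes
    std::string s = to_utf8(legacy);                // 5 bytes in UTF-8
    assert(count_code_points(s) == 4);
}
```

With this arrangement the explicit conversion falls on code that still holds
legacy-encoded data, while callers already in UTF-8 land pass std::string
straight through.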