Subject: Re: [boost] [General] Always treat std::strings as UTF-8
From: Patrick Horgan (phorgan1_at_[hidden])
Date: 2011-01-18 18:27:28


On 01/18/2011 10:27 AM, Matus Chochlik wrote:
> ...elision by patrick...
>
> Maybe a better course of action would be to create ansi_str_t with the encoding
> tags for the legacy ANSI-encoded strings, which could be obsoleted
> in the future, and use std::string as the default class for UTF-8 strings.
> We will have to do this transition anyway at one point, so why not do it now.
First, how annoying that that text mode on Windows is called ANSI. It
has nothing to do with ANSI.
Second, I think you forget that it's a big world, with a large number of
single-byte and multibyte encodings that will end up in strings. It's just
self-defense. If someone gives you something in a UTF-8 string type,
you can make _some_ assumption that, absent error, it's supposed to be
in that encoding. Otherwise you can't. If a std::string _can_ be
many different things, then a std::string _will_ be many different
things. Partitioning the space of things it can be and dealing with
each of them correctly is a good thing, I think.
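
To make the partitioning idea concrete, here is a minimal sketch of
encoding-tagged string types. It is only an illustration under stated
assumptions: tagged_string, utf8_tag, latin1_tag, to_utf8 and
utf8_byte_length are hypothetical names invented for this example, not
part of Boost or of any proposal in this thread, and Latin-1 stands in
for whichever legacy encoding you actually have.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical encoding tags (illustration only, not a Boost API).
struct utf8_tag {};
struct latin1_tag {};

// A string whose type records the encoding its bytes are supposed to use.
// The tag is a promise made by whoever constructs it; nothing is validated.
template <typename Encoding>
class tagged_string {
public:
    explicit tagged_string(std::string bytes) : bytes_(std::move(bytes)) {}
    const std::string& bytes() const { return bytes_; }
private:
    std::string bytes_;
};

using utf8_string = tagged_string<utf8_tag>;
using latin1_string = tagged_string<latin1_tag>;

// Conversion is an explicit, named step. Latin-1 bytes map directly to
// code points U+0000..U+00FF, so each byte becomes one or two UTF-8 bytes.
utf8_string to_utf8(const latin1_string& in) {
    std::string out;
    for (unsigned char c : in.bytes()) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return utf8_string(std::move(out));
}

// An interface that only accepts text declared to be UTF-8.
std::size_t utf8_byte_length(const utf8_string& s) {
    return s.bytes().size();
}
```

With types like these, handing a latin1_string to utf8_byte_length does
not compile; the caller has to go through to_utf8 first. A bare
std::string offers no such check, which is the self-defense argument.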

Patrick

