Subject: Re: [boost] [General] Always treat std::strings as UTF-8
From: Alexander Lamaison (awl03_at_[hidden])
Date: 2011-01-18 08:59:14

On Tue, 18 Jan 2011 08:48:59 -0500, Dave Abrahams wrote:

> At Tue, 18 Jan 2011 13:27:29 +0200,
> Peter Dimov wrote:
>> Dave Abrahams wrote:
>>> I think the reason to use separate types is to provide a type-safety
>>> barrier between your functions that operate on utf-8 and system or
>>> 3rd-party interfaces that don't or may not. In principle, that should
>>> force you to think about encoding and decoding at all the places where
>>> it may be needed, and should allow you to code naturally and with
>>> confidence where everybody is operating in utf8-land.
>> Yes, in principle. It isn't terribly necessary if everybody is
>> operating in UTF-8 land though.
> But they won't be. That's not today's reality.
>> It's a bit like defining a separate integer type for nonnegative
>> ints for type safety reasons - useful in theory, but nobody does it.
> I refer you to Boost.Units
>> If you're designing an interface that takes UTF-8 strings,
> we are...
>> it still may be worth it to have the parameters be of a
>> utf8-specific type, if you want to force your users to think about
>> the encoding of the argument each time they call one of your
>> functions...
> Or, you may want to use a UTF-8 specific type to force users of legacy
> char* interfaces (and ourselves) to think about decoding each time
> they call a legacy char* interface.
>> this is a
>> legitimate design decision. If you're in control of the whole program,
>> though, it's usually not worth it - you just keep everything in UTF-8.
> By definition, since we're library designers, we don't have said
> control. And people *will* be using whatever Boost does with "legacy"
> non-UTF-8 interfaces.

+1 for every point.
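
To make the "type-safety barrier" concrete, here is a minimal sketch of the idea being discussed. It is not a proposed Boost interface: utf8_string, code_point_count and legacy_byte_length are hypothetical names invented for illustration, and the wrapper trusts the caller's claim that the bytes are UTF-8 rather than validating them.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical wrapper: a std::string that the caller asserts is UTF-8.
// No validation is performed here; this only shows the type barrier.
class utf8_string {
public:
    // Explicit, so a plain std::string never converts silently.
    explicit utf8_string(std::string bytes) : bytes_(std::move(bytes)) {}

    // Leaving utf8-land is a visible, deliberate step at the call site.
    const std::string& bytes() const { return bytes_; }

private:
    std::string bytes_;
};

// The barrier itself: std::string does not implicitly become utf8_string.
static_assert(!std::is_convertible<std::string, utf8_string>::value,
              "plain strings must be converted explicitly");

// Hypothetical function inside utf8-land: counts code points by
// skipping UTF-8 continuation bytes (those of the form 10xxxxxx).
std::size_t code_point_count(const utf8_string& s) {
    std::size_t n = 0;
    for (unsigned char c : s.bytes()) {
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}

// Stand-in for a legacy char* interface of unknown encoding.
std::size_t legacy_byte_length(const char* s) {
    return std::string(s).size();
}
```

With this in place, code_point_count(some_std_string) does not compile, and handing data to the legacy function requires spelling out s.bytes().c_str(), which is the point at which the encoding has to be thought about.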

