Subject: Re: [boost] [General] Always treat std::strings as UTF-8
From: Alexander Lamaison (awl03_at_[hidden])
Date: 2011-01-18 08:59:14
On Tue, 18 Jan 2011 08:48:59 -0500, Dave Abrahams wrote:
> At Tue, 18 Jan 2011 13:27:29 +0200,
> Peter Dimov wrote:
>>
>> Dave Abrahams wrote:
>>
>>> I think the reason to use separate types is to provide a type-safety
>>> barrier between your functions that operate on utf-8 and system or
>>> 3rd-party interfaces that don't or may not. In principle, that should
>>> force you to think about encoding and decoding at all the places where
>>> it may be needed, and should allow you to code naturally and with
>>> confidence where everybody is operating in utf8-land.
>>
>> Yes, in principle. It isn't terribly necessary if everybody is
>> operating in UTF-8 land though.
>
> But they won't be. That's not today's reality.
>
>> It's a bit like defining a separate integer type for nonnegative
>> ints for type safety reasons - useful in theory, but nobody does it.
>
> I refer you to Boost.Units.
>
>> If you're designing an interface that takes UTF-8 strings,
>
> ...as we are...
>
>> it still may be worth it to have the parameters be of a
>> utf8-specific type, if you want to force your users to think about
>> the encoding of the argument each time they call one of your
>> functions...
>
> Or, you may want to use a UTF-8 specific type to force users of legacy
> char* interfaces (and ourselves) to think about decoding each time
> they call a legacy char* interface.
>
>> this is a
>> legitimate design decision. If you're in control of the whole program,
>> though, it's usually not worth it - you just keep everything in UTF-8.
>
> By definition, since we're library designers, we don't have said
> control. And people *will* be using whatever Boost does with "legacy"
> non-UTF-8 interfaces.
+1 for every point.
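
For concreteness, here is a rough sketch of the kind of type-safety barrier Dave describes. Every name in it (utf8_string, is_valid_utf8, count_code_points, legacy_byte_length) is made up for illustration and is not an existing or proposed Boost interface. Validating on construction is only one possible design; I'm assuming it here to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical helper: true if s is well-formed UTF-8. Rejects truncated
// sequences, overlong forms, surrogates, and values above U+10FFFF.
inline bool is_valid_utf8(const std::string& s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t len;
        unsigned long cp;
        if (c < 0x80)                { len = 1; cp = c; }
        else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; }
        else return false;                       // stray continuation or invalid lead byte
        if (i + len > n) return false;           // truncated sequence
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (len == 2 && cp < 0x80) return false;     // overlong encodings
        if (len == 3 && cp < 0x800) return false;
        if (len == 4 && cp < 0x10000) return false;
        if (cp > 0x10FFFF) return false;
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogates
        i += len;
    }
    return true;
}

// Hypothetical wrapper: holding one means the bytes were checked as UTF-8.
class utf8_string {
public:
    // explicit: a plain std::string never converts silently.
    explicit utf8_string(const std::string& bytes) : bytes_(bytes) {
        if (!is_valid_utf8(bytes_))
            throw std::invalid_argument("not valid UTF-8");
    }
    // Leaving utf8-land is also spelled out at the call site.
    const std::string& bytes() const { return bytes_; }
private:
    std::string bytes_;
};

// Hypothetical library function that lives entirely in utf8-land.
inline std::size_t count_code_points(const utf8_string& s) {
    const std::string& b = s.bytes();
    std::size_t count = 0;
    for (std::size_t i = 0; i < b.size(); ++i)
        if ((static_cast<unsigned char>(b[i]) & 0xC0) != 0x80) ++count;
    return count;
}

// Stand-in for a legacy char* interface with no stated encoding.
inline std::size_t legacy_byte_length(const char* s) { return std::strlen(s); }

// Both boundary crossings are visible in the source.
inline void example_usage() {
    const std::string raw = "caf\xC3\xA9";       // "café" encoded as UTF-8
    const utf8_string text(raw);                 // entering: explicit and checked
    assert(count_code_points(text) == 4);
    assert(legacy_byte_length(text.bytes().c_str()) == 5);  // leaving: explicit
}
```

With something like this, count_code_points(raw) fails to compile when raw is a plain std::string, and handing text to the legacy function requires writing .bytes().c_str(), so both crossings are places where the caller has to think about the encoding.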
Alex
-- Easy SFTP for Windows Explorer (http://www.swish-sftp.org)
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk