Subject: Re: [boost] [General] Always treat std::strings as UTF-8
From: Peter Dimov (pdimov_at_[hidden])
Date: 2011-01-18 06:27:29


Dave Abrahams wrote:

> I think the reason to use separate types is to provide a type-safety
> barrier between your functions that operate on utf-8 and system or
> 3rd-party interfaces that don't or may not. In principle, that should
> force you to think about encoding and decoding at all the places where
> it may be needed, and should allow you to code naturally and with
> confidence where everybody is operating in utf8-land.

Yes, in principle. It isn't terribly necessary if everybody is operating in
UTF-8 land though. It's a bit like defining a separate integer type for
nonnegative ints for type safety reasons - useful in theory, but nobody does
it.

If you're designing an interface that takes UTF-8 strings, it still may be
worth it to have the parameters be of a utf8-specific type, if you want to
force your users to think about the encoding of the argument each time they
call one of your functions... this is a legitimate design decision. If
you're in control of the whole program, though, it's usually not worth it -
you just keep everything in UTF-8.
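
As a rough illustration of the two options, here is a minimal sketch. The names utf8_string, byte_length and byte_length_plain are hypothetical, made up for this example; they are not part of Boost or the standard library. The wrapper also does no validation, it only records the caller's claim that the bytes are UTF-8.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical wrapper type (not a Boost or standard type).
// It does NOT validate its contents; it only marks them as "claimed UTF-8".
class utf8_string {
public:
    // explicit: a plain std::string will not convert silently, so the
    // caller has to write utf8_string(s) and think about the encoding.
    explicit utf8_string(std::string bytes) : bytes_(std::move(bytes)) {}

    const std::string& str() const { return bytes_; }

private:
    std::string bytes_;
};

// Option 1: the interface takes the UTF-8-specific type.
// Callers must state the encoding at every call site.
inline std::size_t byte_length(const utf8_string& s) {
    return s.str().size();
}

// Option 2: the whole program is in UTF-8 by convention,
// so a plain std::string parameter is enough.
inline std::size_t byte_length_plain(const std::string& s) {
    return s.size();
}
```

With the first form, byte_length(some_std_string) does not compile and the user has to write byte_length(utf8_string(some_std_string)). With the second form the call is shorter, but the encoding is enforced only by convention.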