Subject: Re: [boost] [General] Always treat std::strings as UTF-8
From: Dave Abrahams (dave_at_[hidden])
Date: 2011-01-18 13:39:10


At Tue, 18 Jan 2011 19:46:41 +0200,
Peter Dimov wrote:
>
> Dave Abrahams wrote:
> > At Tue, 18 Jan 2011 13:27:29 +0200,
> > Peter Dimov wrote:
> > >
> > > Dave Abrahams wrote:
> > >
> > > > I think the reason to use separate types is to provide a type-safety
> > > > barrier between your functions that operate on utf-8 and system or
> > > > 3rd-party interfaces that don't or may not. In principle, that should
> > > > force you to think about encoding and decoding at all the places where
> > > > it may be needed, and should allow you to code naturally and with
> > > > confidence where everybody is operating in utf8-land.
> > >
> > > Yes, in principle. It isn't terribly necessary if everybody is
> > > operating in UTF-8 land though.
> >
> > But they won't be. That's not today's reality.
>
> They should be, though. As a practical matter, the difference between
> taking/returning a string and taking/returning a utf8_t is to force
> people to write an explicit conversion. This penalizes people who are
> already in UTF-8 land because it forces them to use utf8_t( s,
> encoding_utf8 ) and s.c_str( encoding_utf8 ) everywhere, without any
> gain or need. It's true that for people whose strings are not UTF-8,
> forcing those explicit conversions may be considered a good thing. So
> it depends on what your goals are. Do you want to promote the use of
> UTF-8 for all strings, or do you want to enable people to remain in
> non-UTF-8-land?

Oh, I get it. Never mind :-)
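
For anyone reading this later, the trade-off Peter describes can be sketched roughly as below. This is only an illustration under stated assumptions: the names utf8_t and encoding_utf8 come from the thread, but the class layout, the tag type encoding_utf8_t, and the two byte_length_* functions are hypothetical and do not correspond to any existing Boost interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical tag type. The thread only shows the name encoding_utf8;
// how it is defined is an assumption made for this sketch.
struct encoding_utf8_t {};
constexpr encoding_utf8_t encoding_utf8{};

// Hypothetical wrapper, loosely modeled on the utf8_t named in the thread.
// It does no validation; it only records the caller's claim about encoding.
class utf8_t {
public:
    // Explicit construction: the caller states that s already holds UTF-8.
    utf8_t(std::string s, encoding_utf8_t) : bytes_(std::move(s)) {}

    // Explicit extraction: the caller names the encoding it wants back.
    const char* c_str(encoding_utf8_t) const { return bytes_.c_str(); }

    // Number of bytes (not code points) in the stored string.
    std::size_t size_bytes() const { return bytes_.size(); }

private:
    std::string bytes_;
};

// Typed interface: the signature itself says "UTF-8 only".
std::size_t byte_length_typed(const utf8_t& text) {
    return text.size_bytes();
}

// Convention-only interface: std::string is simply assumed to be UTF-8.
std::size_t byte_length_plain(const std::string& text) {
    return text.size();
}
```

With the typed interface, a caller whose std::string s is already UTF-8 still has to write byte_length_typed(utf8_t(s, encoding_utf8)), and s.c_str(encoding_utf8) on the way out, whereas the convention-only interface accepts byte_length_plain(s) directly. That extra ceremony is the cost Peter is pointing at; the benefit only shows up for callers whose strings are not UTF-8.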

-- 
Dave Abrahams
BoostPro Computing
http://www.boostpro.com