Subject: Re: [boost] [General] Always treat std::strings as UTF-8
From: Artyom (artyomtnk_at_[hidden])
Date: 2011-01-15 14:11:20


> >> Yet you still need to convert between UTF-8 and the POSIX locales.
> >> Even if most recent POSIX systems use UTF-8 as their locale, there is no
> >> guarantee of that.
> >> Indeed, quite a few still run in latin-1.
> >>
> >
> >
> > No, you don't need to convert UTF-8 to the "locale" encoding, as char* is
> > the native system API, unlike on Windows. So you don't need to mess around
> > with encodings at all unless you deal with text-related stuff such as
> > collation.
>
> I'm not sure I follow. If you pass a UTF-8 encoded string to a POSIX OS
> that uses a non-UTF character set, how is the OS meant to interpret that?
>

As a null-terminated byte sequence. I mean, if your locale is UTF-8
and there is a file named "\xFF\xFF.txt", which is clearly not valid UTF-8,
you can still open it, remove it and do almost anything else with it.

The file API is locale-agnostic; only very specific language-related
functions, such as strcoll, depend on the locale.
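
To make the point concrete, here is a minimal sketch of the round trip with plain POSIX calls. It assumes a filesystem that stores names as raw bytes (typical on Linux; some filesystems, I believe the default macOS ones among them, may reject names that are not valid UTF-8). The helper name roundtrip_non_utf8_name and the "hello" payload are made up for this example.

```cpp
#include <cstring>
#include <string>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: creates, reopens, reads and removes a file whose
// name is not valid UTF-8, using only byte-oriented POSIX calls.
// Returns true if every step succeeded.
bool roundtrip_non_utf8_name(const std::string& dir) {
    // 0xFF never appears in well-formed UTF-8, so this name is invalid UTF-8.
    const std::string path = dir + "/\xFF\xFF.txt";
    const char msg[] = "hello";
    const size_t len = sizeof msg - 1;

    // Create the file; the kernel only sees a null-terminated byte string.
    int fd = ::open(path.c_str(), O_CREAT | O_WRONLY | O_TRUNC, 0600);
    if (fd < 0) return false;
    bool written = ::write(fd, msg, len) == static_cast<ssize_t>(len);
    ::close(fd);

    // Reopen it by the same byte sequence and read the content back.
    char buf[16] = {0};
    fd = ::open(path.c_str(), O_RDONLY);
    if (fd < 0) { ::unlink(path.c_str()); return false; }
    ssize_t n = ::read(fd, buf, sizeof buf - 1);
    ::close(fd);
    bool same = n == static_cast<ssize_t>(len) && std::memcmp(buf, msg, len) == 0;

    // Remove it again; no encoding conversion happens at any point.
    bool removed = ::unlink(path.c_str()) == 0;
    return written && same && removed;
}
```

Calling roundtrip_non_utf8_name("/tmp") does not consult the current locale at any step, which is the sense in which the API is locale-agnostic.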

Artyom

      


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk