Subject: Re: [Boost-users] [boost][exception] Wide-character design considerations
From: Rainer Deyke (rainerd_at_[hidden])
Date: 2010-06-24 22:55:22
On 6/24/2010 09:31, John Dlugosz wrote:
> No. The narrow form should be encoded based on the currently
> selected locale's settings. That is, after all, the whole point of
> having it. Other functions you pass it to will be expecting that.
Which other functions? Yours? Mine? The standard library's? Third-party
libraries'?
My functions, at least, use utf-8 exclusively. The standard library
generally (but not always) treats strings as opaque blobs of binary
data. Third-party libraries vary, but most of the libraries I use
assume or at least support utf-8.
Using locale-dependent character encodings is just plain broken. It
won't work if you deal with characters that are not in the current
locale. It won't work if you save a file in one locale and load it in
another. It won't work if you compile with one locale and run with another.
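To make the save/load problem concrete, here is a minimal sketch of what happens to a single stored byte when the reader assumes a different 8-bit encoding than the writer. The helper names decode_latin1 and decode_iso8859_5 are made up for illustration, and the ISO-8859-5 decoder is deliberately partial (ASCII plus the contiguous Cyrillic range only).

```cpp
#include <cstdint>

// Latin-1 (ISO-8859-1): every byte maps to the code point of the same value.
std::uint32_t decode_latin1(unsigned char byte) {
    return byte;
}

// Partial ISO-8859-5 decoder (hypothetical helper for illustration).
// Handles ASCII and the contiguous range 0xB0..0xEF -> U+0410..U+044F;
// every other byte is reported as U+FFFD (replacement character).
std::uint32_t decode_iso8859_5(unsigned char byte) {
    if (byte < 0x80) return byte;
    if (byte >= 0xB0 && byte <= 0xEF) return 0x0410u + (byte - 0xB0u);
    return 0xFFFDu;
}
```

The byte 0xE9 written under a Latin-1 locale was meant as U+00E9 (e with acute accent); read back under an ISO-8859-5 locale, the same byte decodes as U+0449 (a Cyrillic letter). Nothing in the file itself records which reading is the right one.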
Unfortunately, wide strings are /also/ broken. On some platforms, they
are utf-32. On some platforms, they are utf-16. On some platforms they
are UCS-2, and you have no way to encode characters past the BMP. You
also have to worry about byte-order issues.
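As a rough illustration of the point about characters past the BMP, the sketch below encodes one code point as UTF-16 code units and as UTF-8 bytes. The function names to_utf16 and to_utf8 are my own, not from any library, and for brevity neither one validates its input (surrogate code points and values above U+10FFFF are not rejected).

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Encode one code point as UTF-16 code units (no input validation).
std::vector<std::uint16_t> to_utf16(std::uint32_t cp) {
    std::vector<std::uint16_t> out;
    if (cp < 0x10000u) {
        // Inside the BMP: a single 16-bit unit is enough.
        out.push_back(static_cast<std::uint16_t>(cp));
    } else {
        // Past the BMP: split into a high and a low surrogate.
        cp -= 0x10000u;
        out.push_back(static_cast<std::uint16_t>(0xD800u + (cp >> 10)));
        out.push_back(static_cast<std::uint16_t>(0xDC00u + (cp & 0x3FFu)));
    }
    return out;
}

// Encode one code point as UTF-8 bytes (no input validation).
std::string to_utf8(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80u) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800u) {
        out += static_cast<char>(0xC0u | (cp >> 6));
        out += static_cast<char>(0x80u | (cp & 0x3Fu));
    } else if (cp < 0x10000u) {
        out += static_cast<char>(0xE0u | (cp >> 12));
        out += static_cast<char>(0x80u | ((cp >> 6) & 0x3Fu));
        out += static_cast<char>(0x80u | (cp & 0x3Fu));
    } else {
        out += static_cast<char>(0xF0u | (cp >> 18));
        out += static_cast<char>(0x80u | ((cp >> 12) & 0x3Fu));
        out += static_cast<char>(0x80u | ((cp >> 6) & 0x3Fu));
        out += static_cast<char>(0x80u | (cp & 0x3Fu));
    }
    return out;
}
```

For U+1D11E (a musical symbol outside the BMP), to_utf16 yields the two units 0xD834 0xDD1E, which a UCS-2 implementation cannot represent as one character, while to_utf8 yields the four bytes F0 9D 84 9E. The UTF-8 bytes come out in the same order on every platform; the 16-bit units still have to be serialized in some byte order.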
Utf-8 is the only sane way to deal with international text.
-- Rainer Deyke - rainerd_at_[hidden]
Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net