From: Daryle Walker (darylew_at_[hidden])
Date: 2001-09-26 10:39:06

on 9/25/01 7:17 PM, dietmar_kuehl_at_[hidden] wrote:

> Daryle Walker wrote:
>> I realized the Unicode problem this morning. I'll come up with a
>> prototype Unicode solution to put up in the vault. (It shouldn't
>> go into any XML sub-library since it would be needed for other
>> things, like C++ parsing.)
> I'm probably dense, but can anybody please tell me what problem you
> are talking about? 'std::basic_filebuf' internally uses a
> code conversion facet which is intended to convert an external
> encoding into an internal representation. For example, a
> 'std::wistream' can read UTF-8, UTF-16, ..., at least if appropriate
> code conversion facets are available. If they aren't shipped by the
> vendor, they may be provided by the user.
> I can understand that there is a problem if your library vendor does
> not ship UTF-8, UTF-16, ... code conversion facets and these might
> have a place in the Boost library. However, since nobody mentioned
> code conversion facets before, I fear something really bad is going
> to happen if people try addressing this problem...

We will need conversion facets eventually, but my point is that there's
no guarantee that "char" and "wchar_t" represent ASCII/Latin-1 or
Unicode characters; there's not even a guarantee that the latter type can
hold a Unicode character. (Will (int)'A' == 65 or (long)L'A' == 65L?)
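To make the question concrete, here is a minimal sketch of how a program could probe these assumptions on a given implementation. The function names are made up for illustration and are not part of any existing or proposed Boost interface; the 0x10FFFF limit is the top of the Unicode code space.

```cpp
#include <limits>

// Hypothetical helper names, for illustration only.

// Does the narrow execution character set put 'A' at its ASCII value?
inline bool narrow_a_is_ascii()
{
    return static_cast<int>('A') == 65;
}

// Does the wide execution character set put L'A' at its Unicode value?
inline bool wide_a_is_unicode()
{
    return static_cast<long>(L'A') == 65L;
}

// Can wchar_t hold every code point up to U+10FFFF?
inline bool wchar_fits_unicode()
{
    return static_cast<unsigned long>(std::numeric_limits<wchar_t>::max())
           >= 0x10FFFFul;
}
```

None of these is promised by the standard, so the answers can differ between platforms: an EBCDIC system would fail the first check, and an implementation with a 16-bit wchar_t would fail the last.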

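The UTF conversions would have to treat the characters they read not as
characters in their own right but as pieces of the encoding, i.e. as raw
numeric values. We would also need to assume that the characters are
being read in binary mode, so that the stream does not alter them (for
example through newline translation).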
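As a rough sketch of what "parts of the encoding" means, the following reads a file in binary mode and decodes its bytes as UTF-8 using only their numeric values, never comparing them to character literals. The names code_point, read_bytes and decode_utf8 are hypothetical, the code assumes 8-bit bytes, and it does not reject overlong forms or surrogates, so it is not a complete validator.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// unsigned long is at least 32 bits, so it can hold any code point
// even where wchar_t cannot.
typedef unsigned long code_point;

// Reads a whole file as raw bytes; binary mode avoids newline translation.
inline std::string read_bytes(const char* path)
{
    std::ifstream in(path, std::ios::in | std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Decodes UTF-8 bytes into code points; returns false on malformed or
// truncated input. Bytes are handled purely as numbers.
inline bool decode_utf8(const std::string& bytes, std::vector<code_point>& out)
{
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned long b0 = static_cast<unsigned char>(bytes[i]);
        code_point cp;
        std::size_t extra;  // number of continuation bytes expected
        if (b0 < 0x80)                { cp = b0;        extra = 0; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; extra = 1; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; extra = 2; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; extra = 3; }
        else return false;            // stray continuation or invalid lead

        if (bytes.size() - i <= extra) return false;  // sequence cut short

        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned long b = static_cast<unsigned char>(bytes[i + k]);
            if ((b & 0xC0) != 0x80) return false;     // not a continuation
            cp = (cp << 6) | (b & 0x3F);
        }
        out.push_back(cp);
        i += extra + 1;
    }
    return true;
}
```

A caller would pass the result of read_bytes to decode_utf8 and then map the resulting code points to whatever the local "char" or "wchar_t" values happen to be, which is a separate step.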

Daryle Walker
Mac, Internet, and Video Game Junkie
darylew AT mac DOT com
