From: Daryle Walker (darylew_at_[hidden])
Date: 2001-09-26 10:39:06


on 9/25/01 7:17 PM, dietmar_kuehl_at_[hidden] wrote:

> Daryle Walker wrote:
>> I realized the Unicode problem this morning. I'll come up with a
>> prototype Unicode solution to put up in the vault. (It shouldn't
>> go into any XML sub-library since it would be needed for other
>> things, like C++ parsing.)
>
> I'm probably dense, but can anybody please tell me what problem you
> are talking about at all? 'std::basic_filebuf' internally uses a
> code conversion facet which is intended to convert an external
> encoding into an internal representation. For example, a
> 'std::wistream' can read UTF-8, UTF-16, ..., at least if appropriate
> code conversion facets are available. If they aren't shipped by the
> vendor, they may be provided by the user.
>
> I can understand that there is a problem if your library vendor does
> not ship UTF-8, UTF-16, ... code conversion facets and these might
> have a place in the Boost library. However, since nobody mentioned
> code conversion facets before, I fear something really bad is going
> to happen if people try addressing this problem...

We will need conversion facets eventually, but my point is that there's
no guarantee that "char" and "wchar_t" represent ASCII/Latin-1 or Unicode
characters; there's not even a guarantee that the latter type can hold a
Unicode character. (Will (int)'A' == 65 or (long)L'A' == 65L?)
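
To make that question concrete, here is a minimal sketch of checks a
program could run on a given implementation. The function names are
hypothetical, made up for this illustration; the results depend on the
compiler and platform, so no particular outcome is implied.

```cpp
#include <limits>

// Hypothetical helper: true when the narrow execution character set
// places 'A' at the ASCII/Latin-1 value 65.
bool narrow_is_ascii_like() {
    return static_cast<int>('A') == 65;
}

// Hypothetical helper: true when the wide execution character set
// places L'A' at the Unicode value 65.
bool wide_is_ascii_like() {
    return static_cast<long>(L'A') == 65L;
}

// Hypothetical helper: true when wchar_t has enough range to hold the
// largest Unicode code point, U+10FFFF. A 16-bit wchar_t fails this.
bool wchar_fits_unicode() {
    return static_cast<unsigned long>(std::numeric_limits<wchar_t>::max())
           >= 0x10FFFFUL;
}
```

Even when all three return true, that only describes one implementation;
portable code still cannot rely on it.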

The UTF conversions would have to treat each "char" not as a character
in its own right, but as one unit of the encoding. We would also need to
assume that the characters are being read in binary mode.
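
As a rough sketch of what "parts of the encoding" means, the function
below treats every "char" of its input as a raw octet and assembles
UTF-8 sequences into code point values. The name decode_utf8 is
hypothetical, and this is not a proposed interface. It assumes the bytes
arrived untranslated (for example from a stream opened with
std::ios::binary), and it does not reject overlong forms or surrogate
values.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: decode UTF-8 octets into Unicode code points.
// Results are stored as unsigned long because wchar_t may be too small.
std::vector<unsigned long> decode_utf8(const std::string& bytes) {
    std::vector<unsigned long> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        // Each char is a raw octet here, not a character.
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        unsigned long cp = 0;
        std::size_t extra = 0;  // number of continuation octets expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead octet");

        if (extra > bytes.size() - i - 1)
            throw std::runtime_error("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation octet");
            cp = (cp << 6) | (c & 0x3F);  // append the low six payload bits
        }
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

Note that the decoder never compares an input octet against a character
literal such as 'A'; it only inspects bit patterns, which is why the
execution character set does not matter at this stage.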

-- 
Daryle Walker
Mac, Internet, and Video Game Junkie
darylew AT mac DOT com
