Subject: Re: [boost] [strings][unicode] Proposals for Improved String Interoperability in a Unicode World
From: Daryle Walker (darylew_at_[hidden])
Date: 2012-01-31 03:57:58
----------------------------------------
> Date: Mon, 30 Jan 2012 00:24:30 -0800
> From: Artyom
>
> ----- Original Message -----
> > From: Beman Dawes <bdawes_at_[hidden]>
> >
> >> What probably should be done is that compilers should be compelled to
> >> support UTF-8 as the source character set in a unified way.
> >
> > Makes sense to me.
> >
> > Why don't you write up an issue for the C and C++ committees? My
> >
> > [snip]
> >
> > Another possibility is to start lobbying compiler vendors, or at least
> > Microsoft, to support UTF-8 both with and without BOM.
> >
>
> It is not only BOM not BOM issue. It is mostly the ability
> to define execution character set. i.e. character set for
> normal "some text" literals and the input character set
> and what is even more important that C++ compilers must
> support UTF-8 for the two of them.
This probably isn't the right post to respond to, but I don't want to spend forever figuring it out.
Not every system is an 8/16/32(/64)-bit computer using ASCII/Latin-1/UTF-8. C++ (following C) was designed so that a user with a 9/36/81-bit EBCDIC system and one with an 8/16/32/64-bit UTF-16 system can each write programs for the other (with the appropriate cross-compiler). We don't want to be obnoxiously prejudiced against systems that don't match current configuration trends.
(I was originally going to write "9/36/72", but then realized that the larger types only have to be a multiple of char, not of each other, so my new system breaks more common programmer assumptions. BTW, that's 9-bit bytes (char), 36-bit words (short and int), and 81-bit long words (long and long long). I wonder if anyone here can fabricate this custom hardware, to mess people up.)
Daryle W.