Subject: Re: [boost] [strings][unicode] Proposals for Improved String Interoperability in a Unicode World
From: Artyom Beilis (artyomtnk_at_[hidden])
Date: 2012-01-29 03:13:01
> From: Yakov Galka <ybungalobill_at_[hidden]>
> [...] What probably should be done is that compilers should be compelled to
>> support UTF-8 as the source character set in a unified way.
> Yes, it could be nice. It would solve half the problem, which is a huge
> step forward given the current mood of the committee. However, embedding
> Unicode string literals in source code is still not something you routinely
> do. Internationalization usually uses external string tables.
Not right. Sometimes you do want non-ASCII symbols in the source code;
what is wrong with having © in the text or a € symbol in the code?
Also, the fact that C++ does not define Unicode source code is
a design problem in the standard; there is nothing wrong with having
Unicode literals in the source code.
In fact, the ONLY modern compiler that does not support them is Visual Studio;
all the others I have ever used (gcc, clang, intel, sunstudio) work fine.
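
As a minimal sketch of what "work fine" means here, the snippet below compares the two symbols typed directly into the source with their UTF-8 byte sequences spelled out as escapes. It assumes the file is saved as UTF-8 and that the compiler keeps narrow literals in UTF-8 (the default for gcc and clang); all names in it are made up for illustration.

```cpp
#include <cassert>
#include <cstring>

// Byte sequences UTF-8 assigns to the two symbols, spelled with escapes
// so they do not depend on how the compiler decodes this file.
const char copyright_bytes[] = "\xC2\xA9";      // U+00A9 COPYRIGHT SIGN
const char euro_bytes[]      = "\xE2\x82\xAC";  // U+20AC EURO SIGN

// The same symbols typed directly into the source file.
// Assumption: this file is saved as UTF-8.
const char copyright_literal[] = "©";
const char euro_literal[]      = "€";

// True when the compiler kept the typed characters as UTF-8 bytes.
bool literals_are_utf8() {
    return std::strcmp(copyright_literal, copyright_bytes) == 0
        && std::strcmp(euro_literal, euro_bytes) == 0;
}

// Aborts (when assertions are enabled) if the literals were re-encoded.
void require_utf8_literals() {
    assert(literals_are_utf8());
}
```

A compiler that re-encodes narrow literals into a local code page would make literals_are_utf8() return false, which is exactly the portability problem being discussed.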
> I once asked volodya if it were feasible to implement this in the build
>> system (add a BOM for MSVC), but he didn't seem to think it was worth
> I don't understand. MSVC already understands BOM, and GCC has already been
> fixed according to
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33415 (didn't test it).
1. A BOM should not be used in source code; no compiler except MSVC uses it, and most
   do not support it.
   A BOM is totally stupid for UTF-8, as UTF-8 has no "byte order", so it should
   just die for UTF-8.
2. Setting a UTF-8 BOM makes narrow literals be encoded in the ANSI encoding, which
   makes the BOM even more useless (crap... sorry) with MSVC.
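
For reference, the UTF-8 "BOM" is just the three bytes EF BB BF at the start of a file. A build step that wanted to hand BOM-free sources to other compilers could detect and drop it roughly as below. This is only a sketch; has_utf8_bom and strip_utf8_bom are hypothetical helper names, not part of any existing tool.

```cpp
#include <cassert>
#include <string>

// The three bytes that encode U+FEFF in UTF-8.
const char utf8_bom[] = "\xEF\xBB\xBF";

// True when the text starts with the UTF-8 encoding of U+FEFF.
bool has_utf8_bom(const std::string& text) {
    // compare() clips the range to the string size, so short inputs are safe.
    return text.compare(0, 3, utf8_bom) == 0;
}

// Returns the text without a leading UTF-8 BOM; other input is returned unchanged.
std::string strip_utf8_bom(const std::string& text) {
    return has_utf8_bom(text) ? text.substr(3) : text;
}

// Small self-check of the two helpers above.
void check_bom_helpers() {
    const std::string with_bom = std::string(utf8_bom) + "int x;";
    assert(has_utf8_bom(with_bom));
    assert(strip_utf8_bom(with_bom) == "int x;");
    assert(!has_utf8_bom("int x;"));
}
```

Note that this only addresses point 1; it does nothing about how MSVC chooses the encoding of narrow literals.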
CppCMS - C++ Web Framework:   http://cppcms.com/
CppDB - C++ SQL Connectivity: http://cppcms.com/sql/cppdb/