Subject: Re: [boost] [strings][unicode] Proposals for Improved String Interoperability in a Unicode World
From: Artyom Beilis (artyomtnk_at_[hidden])
Date: 2012-01-29 03:13:01

> From: Yakov Galka <ybungalobill_at_[hidden]>
>
>> [...] What probably should be done is that compilers should be compelled to
>> support UTF-8 as the source character set in a unified way.
>
> Yes, it could be nice. It would solve half the problem, which is a huge
> step forward given the current mood of the committee. However, embedding
> Unicode string literals in source code is still not something you routinely
> do. Internationalization usually uses external string tables.

Not right. Sometimes you do want non-ASCII symbols in the source code. What is
wrong with having © in the text or a € symbol in the code?

Also, the fact that C++ does not define a Unicode source character set is a
design problem in the standard. There is nothing wrong with having Unicode
literals in the source code.

In fact, the ONLY modern compiler that does not support them is Visual Studio.
All the others I have ever used (gcc, clang, intel, sunstudio) work fine with
UTF-8.

>> I once asked volodya if it were feasible to implement this in the build
>> system (add a BOM for MSVC), but he didn't seem to think it was worth it.
>
> I don't understand. MSVC already understands BOM, and GCC has already been
> fixed according to
>'t test it).

A few points.

1. A BOM should not be used in source code. No compiler except MSVC uses it,
   and most do not support it.
   A BOM is totally stupid for UTF-8, as UTF-8 does not have a "byte order",
   so it should just die for UTF-8.
2. Setting a UTF-8 BOM makes MSVC encode narrow literals in the ANSI encoding,
   which makes the BOM even more useless (crap... sorry) with MSVC.

Artyom Beilis
--------------
CppCMS - C++ Web Framework:
CppDB - C++ SQL Connectivity:
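For readers unfamiliar with what the BOM discussed above actually is: in UTF-8 it is just the three bytes EF BB BF (the encoding of U+FEFF) at the very start of a file. The following is a minimal sketch, not anything from the thread or from Boost.Build, of how a tool could detect or skip it. The names `utf8_bom`, `has_utf8_bom` and `strip_utf8_bom` are hypothetical, and the code assumes a C++17 compiler.

```cpp
#include <string_view>

// UTF-8 encoding of U+FEFF: the three bytes EF BB BF.
// (Hypothetical name, used only for this illustration.)
inline constexpr std::string_view utf8_bom{"\xEF\xBB\xBF", 3};

// True if the buffer starts with the UTF-8 BOM.
constexpr bool has_utf8_bom(std::string_view bytes) {
    return bytes.substr(0, utf8_bom.size()) == utf8_bom;
}

// Returns a view of the buffer with a leading BOM skipped, if one is present.
constexpr std::string_view strip_utf8_bom(std::string_view bytes) {
    return has_utf8_bom(bytes) ? bytes.substr(utf8_bom.size()) : bytes;
}

// Compile-time checks. The literals are split so that the hex escape
// cannot swallow the characters that follow it.
static_assert(has_utf8_bom("\xEF\xBB\xBF" "int x;"));
static_assert(!has_utf8_bom("int x;"));
static_assert(strip_utf8_bom("\xEF\xBB\xBF" "int x;") == "int x;");
```

Since the three bytes carry no byte-order information in UTF-8, skipping them loses nothing; they act only as a signature for tools that choose to look for one.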
