Subject: Re: [boost] Silly Boost.Locale default narrow string encoding in Windows
From: Yakov Galka (ybungalobill_at_[hidden])
Date: 2011-10-28 08:41:23

On Fri, Oct 28, 2011 at 13:58, Peter Dimov <pdimov_at_[hidden]> wrote:

> Alf P. Steinbach wrote:
>> How do I make the following program work with Visual C++ in Windows, using
>> narrow character string?
>> <code>
>> #include <stdio.h>
>> #include <fcntl.h> // _O_U8TEXT
>> #include <io.h> // _setmode, _fileno
>> #include <windows.h>
>> int main()
>> {
>> //SetConsoleOutputCP( 65001 );
>> //_setmode( _fileno( stdout ), _O_U8TEXT );
>> printf( "Blåbærsyltetøy! 日本国 кошка!\n" );
>> }
>> </code>
> Output to a console wasn't our topic so far (and is not one of my strong
> points), but the specific problem with this program is that the embedded
> literal is not UTF-8, as the warning C4566 tells us, so there is no way for
> you to get UTF-8 in the output. (You should be able to set VC++'s code page
> to 65001, but I don't think you can.)
> int main()
> {
> printf( utf8_encode( L"кошка" ).c_str() );
> }

You don't need to configure anything; in fact, you cannot do it properly in
VS. What you can do is:

1) Don't use wide-char literals with non-ASCII characters.
2) Use UTF-8 literals for narrow-char strings.

All you need is to save the source as UTF-8 WITHOUT BOM. Works like a charm
on VS2005 and VS2010. Apparently it's portable. The IDE can detect UTF-8 even
without BOM ("☑ Auto-detect UTF-8 encoding without signature").
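
If you want to check what the compiler actually stored for a narrow literal,
one option is to dump its bytes. This is only a rough sketch; hex_bytes is a
made-up helper name, not something from Boost or the CRT.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of any library): renders each byte of `s`
// as two uppercase hex digits, separated by single spaces.
std::string hex_bytes(const std::string& s) {
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        unsigned char b = static_cast<unsigned char>(s[i]);
        if (i != 0) out += ' ';
        out += digits[b >> 4];
        out += digits[b & 0x0F];
    }
    return out;
}
```

If the literal was kept as UTF-8, hex_bytes("кошка") should give
D0 BA D0 BE D1 88 D0 BA D0 B0 (two bytes per Cyrillic letter). Five bytes, or
a run of 3F ('?'), would suggest the compiler converted the literal to a
legacy code page.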

> This is not a practical problem for "proper" applications because Russian
> text literals should always come from the equivalent of gettext and never be
> embedded in code.


Personally, I'm happy with

printf( "Blåbærsyltetøy! 日本国 кошка!\n" );

writing UTF-8. Even if I cannot configure the console, I can still redirect
the output to a file, and it will be saved correctly as UTF-8. Preventing
data loss is more important to me.
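
To illustrate the point about files, here is a small sketch that pushes a
narrow string through stdio into a temporary file and reads it back.
round_trip_through_file is a hypothetical name, and std::tmpfile() is used
only to avoid picking a path; I'm assuming it succeeds, which may not hold on
every Windows setup.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: writes `text` to a temporary file using narrow stdio
// and returns the bytes read back. Returns an empty string if the temporary
// file cannot be created.
std::string round_trip_through_file(const std::string& text) {
    std::FILE* f = std::tmpfile();      // binary update mode, removed on close
    if (!f) return std::string();
    std::fputs(text.c_str(), f);        // narrow output, bytes passed as-is
    std::rewind(f);                     // back to the start before reading
    std::string out;
    char buf[256];
    std::size_t n;
    while ((n = std::fread(buf, 1, sizeof buf, f)) > 0)
        out.append(buf, n);
    std::fclose(f);
    return out;
}
```

The string that comes back should be byte-for-byte what went in, so whatever
UTF-8 the literal held is still there in the file.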

