Subject: Re: [boost] Silly Boost.Locale default narrow string encoding in Windows
From: Alf P. Steinbach (alf.p.steinbach+usenet_at_[hidden])
Date: 2011-10-29 13:07:56
On 29.10.2011 18:23, Daniel James wrote:
> On Saturday, 29 October 2011, Peter Dimov wrote:
>>
>>
>> The "dir" command has no problem displaying arbitrary file names directly
>> to the console (presumably via WriteConsoleW), but once it has to write to
>> a file, it needs to convert to narrow and no code page other than 65001 can
>> express the above file name.
>>
>
> This is not that relevant to the wider issue, but wide streams will work
> for console output if you first do this:
>
> if (_isatty(_fileno(stdout))) _setmode(_fileno(stdout), _O_U16TEXT);
> if (_isatty(_fileno(stderr))) _setmode(_fileno(stderr), _O_U16TEXT);
>
> i.e. set the output mode to UTF-16 when writing to the console. This only
> works for recent versions of Visual C++. Obviously doesn't fix piped output.
Right.
But the added 'if's cause another problem: redirection to a file no
longer works.
<example>
P:\test> chcp 65001
Active code page: 65001
P:\test> type jam.cpp
#include <stdio.h>
#include <io.h> // _setmode
#include <fcntl.h> // _O_U8TEXT
int main()
{
    //_setmode( _fileno( stdout ), _O_U8TEXT );
    if( _isatty( _fileno( stdout ) ) )
    {
        _setmode( _fileno( stdout ), _O_U16TEXT );
    }
    ::wprintf( L"Blåbærsyltetøy! 日本国 кошка!\n" );
}
P:\test> cl jam.cpp
jam.cpp
P:\test> jam
Blåbærsyltetøy! 日本国 кошка!
P:\test> jam >x
P:\test> type x
Bl�b�rsyltet�y!
P:\test> _
</example>
Without the added 'if's, and instead adding a Unicode BOM to the start
of the text, it works fine for redirection:
<example>
P:\test> chcp 65001
Active code page: 65001
P:\test> type jam.cpp
#include <stdio.h>
#include <io.h> // _setmode
#include <fcntl.h> // _O_U16TEXT
int main()
{
    _setmode( _fileno( stdout ), _O_U16TEXT );
    ::wprintf( L"\uFEFF" L"Blåbærsyltetøy! 日本国 кошка!\n" );
}
P:\test> cl jam.cpp
jam.cpp
jam.cpp(8) : warning C4428: universal-character-name encountered in source
P:\test> jam
Blåbærsyltetøy! 日本国 кошка!
P:\test> jam >x
P:\test> type x
Blåbærsyltetøy! 日本国 кошка!
P:\test> chcp 437
Active code page: 437
P:\test> type x
Blåbærsyltetøy! 日本国 кошка!
P:\test> _
</example>
UTF-8 is even more forgiving as an external format. You don't see the
BOM. Oh, I see that the BOM has disappeared from the output quoted above
(it's difficult to copy-paste), but it's there in the direct console
output, shown as a rectangle.
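One possible way to combine the two approaches is sketched below. It sets the UTF-16 mode unconditionally, so redirection keeps working, and uses the _isatty check only to decide whether to emit the BOM, so the rectangle doesn't show up in the console. The helper names use_utf16_stdout and with_bom_unless_console are made up for this illustration, and the non-Windows branch is just a placeholder assumption so that the sketch compiles elsewhere. I have not tried this across Visual C++ versions.

```cpp
#include <cstdio>
#include <string>
#ifdef _WIN32
#include <io.h>     // _setmode, _isatty
#include <fcntl.h>  // _O_U16TEXT
#endif

// Hypothetical helper: prefix a BOM only when the output is NOT a console,
// so a redirected file is recognizable as UTF-16 while the console shows
// no stray rectangle.
inline std::wstring with_bom_unless_console( std::wstring const& text, bool is_console )
{
    return is_console ? text : std::wstring( 1, wchar_t( 0xFEFF ) ) + text;
}

// Hypothetical helper: put stdout into UTF-16 mode unconditionally (so that
// redirection works too) and report whether stdout is a console.
inline bool use_utf16_stdout()
{
#ifdef _WIN32
    int const fd = _fileno( stdout );
    _setmode( fd, _O_U16TEXT );
    return _isatty( fd ) != 0;
#else
    return true;    // Assumption for this sketch: no mode switch, no BOM.
#endif
}
```

In main one would then write something like `bool const console = use_utf16_stdout();` followed by `::fputws( with_bom_unless_console( L"Bl\u00E5b\u00E6rsyltet\u00F8y!\n", console ).c_str(), stdout );`. Once the stream is in _O_U16TEXT mode only the wide output functions should be used on it.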
Cheers & hth.,
- Alf
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk