Subject: Re: [boost] "Best so far" for C level i/o of Unicode text with Windows console
From: Alf P. Steinbach (alf.p.steinbach+usenet_at_[hidden])
Date: 2011-11-03 19:38:34


On 04.11.2011 00:14, Stephan T. Lavavej wrote:
> [Alf P. Steinbach]
>> I found that the Visual C++ implementation of the C library i/o
>> generally does not support console input of international characters. It
>> can deal with narrow character input from the current codepage, if that
>> codepage is not UTF-8.
>
> Changing the console's codepage isn't the right magic. See
> http://blogs.msdn.com/b/michkap/archive/2008/03/18/8306597.aspx

Thanks!

:-)

But did you notice that the article you replied to had a link to that
exact page, and contained code using the technique Kaplan describes?

> With _O_U16TEXT, VC8+ can write Unicode to the console perfectly. However,
> I believe that input was broken up to and including VC10, and that it's
> been fixed in VC11.

Nope, sorry.

<code>
#include <stdio.h>

int main()
{
     printf( "? " );
     char buffer[80];
     scanf( "%79s", buffer );   // Width limit keeps input within buffer[80].
     printf( "%s\n", buffer );
}
</code>

<result>
D:\dev\test> (cl 2>&1) | find /i "c++"
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 17.00.40825.2 for 80x86

D:\dev\test> set CL=/nologo /EHsc /GR /W4

D:\dev\test> cl utf8test.cpp
utf8test.cpp
utf8test.cpp(7) : warning C4996: 'scanf': This function or variable may be unsafe. Consider using scanf_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
         C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\INCLUDE\stdio.h(290) : see declaration of 'scanf'

D:\dev\test> chcp 1252
Active code page: 1252

D:\dev\test> utf8test
? abcæøå
abcæøå

D:\dev\test> chcp 65001
Active code page: 65001

D:\dev\test> utf8test
? abcæøå
1?

D:\dev\test> _
</result>

> (I don't know about UTF-8. For reasons that are still mysterious to me, UTF-8
> typically isn't handled as well as people expect it to be. Windows really
> really likes UTF-16 for Unicode. In practice, this is not a big deal, because
> UTF-8 and UTF-16 are losslessly convertible.)

I think you're right that it's not a big deal for professional software
development, because what professional developer depends on correct
input of international characters from a Windows console window?

Not me... (But then it's been some years since I was a prof. dev.)

But I think it is important that students should be able to write the
same kinds of program code that they will later write as professionals. And
for that reason it would be Really Nice if the Visual C++ runtime
library were able to deal with interactive UTF-8 input. Note that if
standard input is redirected to come from a file, then it works OK.
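
As an aside on the quoted point that UTF-8 and UTF-16 are losslessly
convertible, here is a minimal portable sketch of that round trip. It is
not taken from the Visual C++ runtime or from Boost: the function names
utf8_to_utf16 and utf16_to_utf8 are hypothetical, a C++11 compiler is
assumed for std::u16string, and it does nothing about the console input
problem shown in the transcript.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper names, for illustration only.

// Decodes UTF-8 bytes into UTF-16 code units; throws on malformed input.
std::u16string utf8_to_utf16( std::string const& s )
{
    // Smallest code point that legitimately needs 0, 1, 2 or 3 trail bytes.
    static char32_t const min_cp[] = { 0, 0x80, 0x800, 0x10000 };

    std::u16string result;
    for( std::size_t i = 0; i < s.size(); )
    {
        unsigned char const lead = static_cast<unsigned char>( s[i] );
        int n_trail;
        char32_t cp;
        if( lead < 0x80 )               { n_trail = 0; cp = lead; }
        else if( (lead >> 5) == 0x06 )  { n_trail = 1; cp = lead & 0x1F; }
        else if( (lead >> 4) == 0x0E )  { n_trail = 2; cp = lead & 0x0F; }
        else if( (lead >> 3) == 0x1E )  { n_trail = 3; cp = lead & 0x07; }
        else throw std::invalid_argument( "utf8_to_utf16: bad lead byte" );

        if( i + n_trail >= s.size() )
            throw std::invalid_argument( "utf8_to_utf16: truncated sequence" );
        for( int k = 1; k <= n_trail; ++k )
        {
            unsigned char const c = static_cast<unsigned char>( s[i + k] );
            if( (c & 0xC0) != 0x80 )
                throw std::invalid_argument( "utf8_to_utf16: bad trail byte" );
            cp = (cp << 6) | (c & 0x3F);
        }
        i += n_trail + 1;

        // Reject overlong forms, surrogate code points and out-of-range values.
        if( cp < min_cp[n_trail] || cp > 0x10FFFF || (0xD800 <= cp && cp <= 0xDFFF) )
            throw std::invalid_argument( "utf8_to_utf16: invalid code point" );

        if( cp < 0x10000 )
        {
            result += static_cast<char16_t>( cp );
        }
        else    // Outside the BMP: encode as a surrogate pair.
        {
            cp -= 0x10000;
            result += static_cast<char16_t>( 0xD800 + (cp >> 10) );
            result += static_cast<char16_t>( 0xDC00 + (cp & 0x3FF) );
        }
    }
    return result;
}

// Encodes UTF-16 code units as UTF-8 bytes; throws on unpaired surrogates.
std::string utf16_to_utf8( std::u16string const& s )
{
    std::string result;
    for( std::size_t i = 0; i < s.size(); ++i )
    {
        char32_t cp = s[i];
        if( 0xD800 <= cp && cp <= 0xDBFF )      // High surrogate: needs a low one.
        {
            if( i + 1 >= s.size() || s[i + 1] < 0xDC00 || s[i + 1] > 0xDFFF )
                throw std::invalid_argument( "utf16_to_utf8: unpaired high surrogate" );
            cp = 0x10000 + ((cp - 0xD800) << 10) + (s[i + 1] - 0xDC00);
            ++i;
        }
        else if( 0xDC00 <= cp && cp <= 0xDFFF )
        {
            throw std::invalid_argument( "utf16_to_utf8: unpaired low surrogate" );
        }
        assert( cp <= 0x10FFFF );   // Follows from the surrogate arithmetic above.

        if( cp < 0x80 )
        {
            result += static_cast<char>( cp );
        }
        else if( cp < 0x800 )
        {
            result += static_cast<char>( 0xC0 | (cp >> 6) );
            result += static_cast<char>( 0x80 | (cp & 0x3F) );
        }
        else if( cp < 0x10000 )
        {
            result += static_cast<char>( 0xE0 | (cp >> 12) );
            result += static_cast<char>( 0x80 | ((cp >> 6) & 0x3F) );
            result += static_cast<char>( 0x80 | (cp & 0x3F) );
        }
        else
        {
            result += static_cast<char>( 0xF0 | (cp >> 18) );
            result += static_cast<char>( 0x80 | ((cp >> 12) & 0x3F) );
            result += static_cast<char>( 0x80 | ((cp >> 6) & 0x3F) );
            result += static_cast<char>( 0x80 | (cp & 0x3F) );
        }
    }
    return result;
}
```

For the test input used in the transcript, the UTF-8 bytes of "abcæøå"
decode to six UTF-16 code units, and encoding those again gives back the
original nine bytes.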

Cheers, & thanks for helping,

- Alf

