Subject: Re: [boost] [locale] [filesystem] Windows local 8 bit encoding
From: Thiel, Bjoern (bjoern.thiel_at_[hidden])
Date: 2012-11-01 10:43:28


________________________________________
From: boost-bounces_at_[hidden] [boost-bounces_at_[hidden]] on behalf of Artyom Beilis [artyomtnk_at_[hidden]]
Sent: Thursday, November 01, 2012 09:57
To: boost_at_[hidden]
Subject: Re: [boost] [locale] [filesystem] Windows local 8 bit encoding

>________________________________
> From: "Thiel, Bjoern" <bjoern.thiel_at_[hidden]>
>To: "boost_at_[hidden]" <boost_at_[hidden]>
>Sent: Wednesday, October 31, 2012 4:07 PM
>Subject: [boost] [locale] [filesystem] Windows local 8 bit encoding

Hi,

> >Hi,
> >
> >When developing platform-independent code, I really like the convenience functions
> >conv::to_utf, conv::from_utf, and conv::utf_to_utf from locale.
> >Why not add something like conv::local8bit_to_utf and conv::local8bit_from_utf?
>
> First of all, the locale encoding is not constant; for example, there are
> numerous ways to change the locale
>
> [...]
>
> Thus the "concept" of the OS locale is quite uncertain and not well
> defined especially under Microsoft Windows.

Right

> Using Boost.Locale you can convert to locale encoding of a given
> std::locale() object generated with Boost.Locale.
>
> boost::locale::generator allows selecting the legacy "ANSI" encoding
> instead of UTF-8 as the default when creating the locale object that
> corresponds to the system locale.
>
> You can use this object with the to_utf and from_utf functions.

Unfortunately that does not work under Microsoft Windows, because generate()
needs a locale name:
  generator locale_generator ;
  locale_generator.use_ansi_encoding( true ) ;
  std::locale const current_locale = locale_generator.generate( name ) ;

If I use the application locale name
  std::string const name = std::locale().name() ;
I get "C", which gives me "US-ASCII" encoding and not the "windows-1252"
encoding I have.

Even if I use the system locale name
  std::string const name = std::locale( "" ).name() ;
I get "English_United States.1252", which gives me the bare code page "1252"
as the encoding and not "windows-1252" either (conv::to_utf and conv::from_utf
just throw "Invalid or unsupported charset:1252" in this case).
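
One possible workaround, as a minimal sketch only, is to derive the charset
name from the numeric suffix of the system locale name. The helper name
charset_from_locale_name is hypothetical and not part of Boost.Locale, and
the assumption that prefixing "windows-" yields a name the conversion
functions accept rests only on the "windows-1252" spelling mentioned above.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (not part of Boost.Locale): turn a Windows-style locale
// name such as "English_United States.1252" into a charset name.
// Returns an empty string when the name has no purely numeric suffix.
std::string charset_from_locale_name( std::string const & name )
{
  std::string::size_type const dot = name.rfind( '.' ) ;
  if ( dot == std::string::npos || dot + 1 == name.size() )
    return std::string() ;  // e.g. "C": no code page suffix at all

  std::string const suffix = name.substr( dot + 1 ) ;
  for ( char c : suffix )
    if ( ! std::isdigit( static_cast<unsigned char>( c ) ) )
      return std::string() ;  // e.g. "en_US.UTF-8": not a bare code page

  // Assumption: the "windows-<codepage>" spelling is accepted by the
  // conversion backend; this is not guaranteed for every code page.
  return "windows-" + suffix ;
}
```

With the system locale name above this yields "windows-1252", which could
then be passed as the charset argument to conv::to_utf and conv::from_utf.
For "C" it yields an empty string, so the caller still has to choose a
fallback.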

> [...]
>
> So if you want to write cross-platform software, stick to UTF-8
> and, at the boundary of the Win32 API, convert it for the Wide API,
> which is the native Windows API and the correct one to use.

Actually, I'm trying to make a shared object (a DLL) platform independent;
it has to do some character conversions according to the current application
locale.

Best regards

Bjoern.

