Subject: [boost] [locale] Support of non US-ASCII character set for messages keys
From: Artyom (artyomtnk_at_[hidden])
Date: 2011-04-27 16:17:42


Hello,

After reviewing the whole discussion, I've decided to make the following changes to the interface to provide better support for non-US-ASCII keys. What actually convinced me is the requirement to be able to include characters like © in the text...

Currently there are the following classes:

    template<typename CharType>
    class message_format : public std::locale::facet {
    public:
        ...
        typedef CharType char_type;
        virtual char_type const *get(int domain_id,char const *context,char const *id) const = 0;
        ...
    };

    class message {
    public:
        ...
        explicit message(char const *id);
        ...
        // convert message to localized message
        template<typename CharType>
        std::basic_string<CharType> str(std::locale const &locale) const;
    };
    ...
    inline message translate(char const *id);
    inline std::string gettext(char const *id,std::locale const &loc=std::locale());
    inline std::wstring wgettext(char const *id,std::locale const &loc=std::locale());
    ...

Basically, a message is created from a narrow id only and can be converted to multiple output formats: narrow, wide and so on.

    std::cout << translate("Hello") << std::endl;
    std::wcout << translate("Hello") << std::endl;

And you can call:

    message msg = translate("Hello");
    std::string hello = msg.str<char>();
    std::wstring whello = msg.str<wchar_t>();

and these all work together.

I'll change it in the following way:

    template<typename CharType>
    class message_format : public std::locale::facet {
    public:
        ...
        typedef CharType char_type;
        virtual char_type const *get(int domain_id,char_type const *context,char_type const *id) const = 0;
        ...
    };

    template<typename CharType>
    class basic_message {
    public:
        typedef CharType char_type;
        typedef std::basic_string<char_type> string_type;
        ...
        explicit basic_message(char_type const *id);
        ...
        // convert message to localized message
        string_type str(std::locale const &locale) const;
    };

    typedef basic_message<char> message;
    typedef basic_message<wchar_t> wmessage;
    typedef basic_message<char16_t> u16message;
    typedef basic_message<char32_t> u32message;
    ...
    inline message translate(char const *id);
    inline wmessage translate(wchar_t const *id);
    inline std::string gettext(char const *id,std::locale const &loc=std::locale());
    inline std::wstring wgettext(wchar_t const *id,std::locale const &loc=std::locale());
    ...

Now you would have to write:

    std::cout << translate("Hello") << std::endl;
    std::wcout << translate(L"Hello") << std::endl;

And you should call:

    message msg = translate("Hello");
    wmessage wmsg = translate(L"Hello");
    std::string hello = msg.str();
    std::wstring whello = wmsg.str();

Additionally, you would be able to specify the encoding of the source strings when adding a domain:

    boost::locale::generator gen;
    gen.add_messages_domain("myprogram","windows-936");

The default would always be UTF-8. So if you write in the program:

    std::cout << translate("平和") << std::endl;

Under GCC with UTF-8 sources you don't have to do anything. If you are using MSVC, you'll have to provide a charset name as shown above or use u8"平和".

Of course this would break the API for users who currently use Boost.Locale (and I know of at least several projects that will suffer). But it would probably bring the interface to a logical point and prevent these questions from coming up again.

Of course, you should remember that untranslated non-US-ASCII strings would be converted at run time to the current locale's encoding.

Regards,
Artyom Beilis

P.S.: Of course the documentation will still discourage programmers from using non-US-ASCII keys, as they may not be displayed properly in local character sets and may confuse users.
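To make the proposed shape concrete, here is a minimal, standalone sketch of the idea: the key and the translated output share one character type, and translate() is overloaded on the key type. It is not Boost.Locale code. toy_catalog is a hypothetical stand-in for the message_format facet, str() takes that catalog instead of a std::locale, and the run-time conversion of untranslated keys to the locale's encoding is left out (the key is returned unchanged).

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the message_format facet: the lookup key
// and the translation use the same character type.
template<typename CharType>
class toy_catalog {
public:
    typedef std::basic_string<CharType> string_type;

    void add(string_type const &id, string_type const &translation)
    {
        entries_[id] = translation;
    }

    // Returns a null pointer when there is no translation for id.
    CharType const *get(CharType const *id) const
    {
        typename std::map<string_type, string_type>::const_iterator p = entries_.find(id);
        return p == entries_.end() ? 0 : p->second.c_str();
    }

private:
    std::map<string_type, string_type> entries_;
};

template<typename CharType>
class basic_message {
public:
    typedef CharType char_type;
    typedef std::basic_string<char_type> string_type;

    explicit basic_message(char_type const *id) : id_(id) {}

    // Assumption: takes the toy catalog rather than a std::locale.
    // Falls back to the key itself when no translation is found.
    string_type str(toy_catalog<char_type> const &catalog) const
    {
        char_type const *translated = catalog.get(id_);
        return translated ? string_type(translated) : string_type(id_);
    }

private:
    char_type const *id_; // not owned; expected to be a string literal
};

typedef basic_message<char> message;
typedef basic_message<wchar_t> wmessage;

// The overload set selects the message type from the key type.
inline message translate(char const *id) { return message(id); }
inline wmessage translate(wchar_t const *id) { return wmessage(id); }

// Usage sketch: narrow keys give narrow strings, wide keys give wide strings.
inline void usage_example()
{
    toy_catalog<char> catalog;
    catalog.add("Hello", "Shalom");
    toy_catalog<wchar_t> wcatalog;
    wcatalog.add(L"Hello", L"Shalom");

    message msg = translate("Hello");
    wmessage wmsg = translate(L"Hello");
    assert(msg.str(catalog) == "Shalom");
    assert(wmsg.str(wcatalog) == L"Shalom");
}
```

The cost of this design is visible in the sketch: a narrow message can no longer produce a wide string, so the old pattern of streaming translate("Hello") to std::wcout stops compiling and the key has to be written as L"Hello".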

