Subject: Re: [boost] [locale] Formal review of Boost.Locale library EXTENDED
From: Ryou Ezoe (boostcpp_at_[hidden])
Date: 2011-04-20 03:35:50


On Wed, Apr 20, 2011 at 1:54 PM, Artyom <artyomtnk_at_[hidden]> wrote:
>> From: Ryou Ezoe <boostcpp_at_[hidden]>
>> To: boost_at_[hidden]
>> Sent: Tue, April 19, 2011 3:58:00 PM
>> Subject: Re: [boost] [locale] Formal review of Boost.Locale library EXTENDED
>>
>> On Tue, Apr 19, 2011 at 9:25 PM, Artyom <artyomtnk_at_[hidden]> wrote:
>>
>> > 1. How do you read man pages?
>> We read translations.
>> > 2. How do you read MSDN docs?
>> We read translations.
>> > 3. How do you read the documentation of Boost libraries?
>> We read translations.
>> > 4. How do you solve problems you haven't seen? Can you
>> >    really find all the answers on Google in Japanese?
>> There is a high chance that some Japanese programmer has solved it, or
>> has read an English paper and explained it in Japanese.
>> > 5. How do you talk to customers outside Japan?
>> Japanese programmers don't talk to foreign customers.
>> That is a job for the translator.
>>
>> Isn't it obvious from the fact that your Debian has so many Japanese
>> translations?
>> We have enough Japanese translations to learn programming without
>> knowing English.
>> Why are there so many translations? Because we need them.
>> If the average Japanese programmer could read English, we wouldn't need
>> such a quantity of translations.
>>
>
> You know, I have a solution for you.
>
> This solution works because gettext accepts
> arbitrary char * as key, even when new versions
> warn about non-ASCII strings.

Warn about non-ASCII strings?
For what? For being a library that sticks with non-portable ASCII when
the next C++ standard officially supports UTF-8, UTF-16 and UTF-32?
I have lost all my faith in this library.
Don't claim it supports UTF-8.
UTF-8 that rejects non-ASCII characters is ASCII.

>
> So basically wgettext("日本語") would work when
> the string in the dictionary...
>
> So there is a "solution" that you can adopt:
>
> Solution A:
> --------------------------------------------------------------------------
>
> template<typename CharType>
> std::basic_string<CharType> basic_xtranslate(char const *msg,std::locale const
> &l=std::locale())
> {
>    typedef boost::locale::message_format<CharType> facet_type;
>    CharType const *translated = 0;
>    if(std::has_facet<facet_type>(l)
>      && (translated=std::use_facet<facet_type>(l).get(0,0,msg))!=0)
>    {
>        return translated;
>    }
>    // Will be replaced in utf_to_utf
>    return boost::locale::conv::to_utf<CharType>(msg,"UTF-8");
> }
>
> inline std::wstring wxtranslate(char const *msg,std::locale const
> &l=std::locale())
> {
>  return basic_xtranslate<wchar_t>(msg,l);
> }
>
>
> Solution B
> .......................................
>
> typedef std::pair<char const *,wchar_t const *> dual_message_type;
>
> inline dual_message_type make_dual_message(char const *n,wchar_t const *w)
> {
>  return dual_message_type(n,w);
> }
>
> #define WTR(m)  (make_dual_message(m,L##m))
>
> std::wstring wxtranslate(dual_message_type const &msg,std::locale const
> &l=std::locale())
> {
>    typedef boost::locale::message_format<wchar_t> facet_type;
>    wchar_t const *translated = 0;
>    if(std::has_facet<facet_type>(l)
>      && (translated=std::use_facet<facet_type>(l).get(0,0,msg.first))!=0)
>    {
>        return translated;
>    }
>    return msg.second;
> }
>
>
> ----------------------------------------------------------------
>
> And now you can freely write:
>
> a)   wxtranslate("日本語")
>
> or
>
> b)   wxtranslate(WTR("日本語"))
>
>
>
>
> However
>
> a) In the first case you will have to make sure
>   that the sources are UTF-8 (and, as you said,
>   they may not be) or some other encoding,
>   but it must be fixed at compilation
>   time.
>
>   It also has a run-time penalty in the case where
>   the string is not in the dictionary.
>
> b) In the second case you will have to make sure
>   that MSVC handles L"日本語" correctly.
>
> This can be extended for plural forms, context
> support and domains.
>
> But this isn't going to be part of Boost.Locale
> as such code would bite you at some point
> very hard.

I don't understand what you are trying to solve with these so-called solutions.

Solution A does not work at all.
There is no guarantee that an ordinary string literal is UTF-8 encoded
(and in MSVC it never is).
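
To make this concrete, here is a minimal sketch that is not part of
Boost.Locale. It compares the bytes of 日本語 written out as UTF-8 with the
same text written out as Shift_JIS (code page 932, which I am assuming is
what a Japanese MSVC build uses for narrow literals). The names
looks_like_utf8 and check_literal_encodings are hypothetical, and the check
is structural only: it does not reject overlong forms or surrogates.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Structural UTF-8 check: each lead byte must be followed by the right
// number of continuation bytes (10xxxxxx).
inline bool looks_like_utf8(std::string const &s)
{
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra;
        if (c < 0x80)                extra = 0;
        else if ((c & 0xE0) == 0xC0) extra = 1;
        else if ((c & 0xF0) == 0xE0) extra = 2;
        else if ((c & 0xF8) == 0xF0) extra = 3;
        else return false;                        // stray continuation or invalid lead byte
        if (extra > s.size() - i - 1) return false; // sequence is truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char t = static_cast<unsigned char>(s[i + k]);
            if ((t & 0xC0) != 0x80) return false;
        }
        i += extra + 1;
    }
    return true;
}

// The same text, spelled with byte escapes so the result does not depend
// on how this source file itself is encoded.
inline void check_literal_encodings()
{
    std::string const as_utf8  = "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E"; // 9 bytes
    std::string const as_cp932 = "\x93\xFA\x96\x7B\x8C\xEA";             // 6 bytes
    assert(looks_like_utf8(as_utf8));
    assert(!looks_like_utf8(as_cp932)); // to_utf<>(msg,"UTF-8") would get these bytes
}
```

If the compiler stores the literal as the second byte sequence, the
fallback conversion in Solution A is handed input that is not UTF-8.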

Solution B... What are you doing?
Doesn't wxtranslate(WTR("日本語")) end up as a pointer to const wchar_t that
refers to L"日本語"?
It does nothing except act as a macro that adds the L encoding prefix.
If so, I'd rather write L"日本語" directly.
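
A stripped-down sketch of Solution B shows the fallback path I mean. The
facet lookup is replaced by a hypothetical function pointer (lookup_fn,
empty_catalog and fallback_demo are my names, not Boost.Locale's) so that it
compiles without the library. An ASCII key is used only to keep the sketch
independent of the source encoding.

```cpp
#include <cassert>
#include <string>
#include <utility>

typedef std::pair<char const *, wchar_t const *> dual_message_type;

inline dual_message_type make_dual_message(char const *n, wchar_t const *w)
{
    return dual_message_type(n, w);
}

#define WTR(m) (make_dual_message(m, L##m))

// Hypothetical stand-in for the message facet: returns 0 when the key
// has no translation in the dictionary.
typedef wchar_t const *(*lookup_fn)(char const *key);

inline wchar_t const *empty_catalog(char const *) { return 0; }

inline std::wstring wxtranslate(dual_message_type const &msg, lookup_fn lookup)
{
    wchar_t const *translated = lookup(msg.first);
    if (translated != 0)
        return translated;
    return msg.second; // fallback: exactly the wide literal built by L##m
}

inline void fallback_demo()
{
    // With no translation available, the call is equivalent to the literal.
    assert(wxtranslate(WTR("hello"), empty_catalog) == L"hello");
}
```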

Since translate() expects nothing but ASCII, I suggest you state that
clearly in the documentation.
You should also throw an exception when you detect a non-ASCII
character in the argument of translate().
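
A minimal sketch of such a check, assuming a hypothetical helper
require_ascii_key that a translate() wrapper could call before looking up
the key. Nothing here is existing Boost.Locale API, and the choice of
std::invalid_argument is mine.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Returns msg unchanged if every byte is 7-bit ASCII; throws otherwise.
inline char const *require_ascii_key(char const *msg)
{
    for (char const *p = msg; *p != '\0'; ++p) {
        if (static_cast<unsigned char>(*p) >= 0x80)
            throw std::invalid_argument("translation key contains a non-ASCII byte");
    }
    return msg;
}

inline void ascii_key_demo()
{
    // An ASCII key passes through untouched.
    assert(std::string(require_ascii_key("Hello")) == "Hello");

    // Any byte with the high bit set is rejected, whatever the encoding.
    bool threw = false;
    try {
        require_ascii_key("\xE6\x97\xA5");
    } catch (std::invalid_argument const &) {
        threw = true;
    }
    assert(threw);
}
```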

That way, you are safe from the real-world problem.

-- 
Ryou Ezoe
