Subject: Re: [boost] Boost.Locale and the standard "message" facet
From: Artyom (artyomtnk_at_[hidden])
Date: 2011-05-03 01:54:25


> From: Vicente BOTET <vicente.botet_at_[hidden]>
> >
> > If "File was opened {1} day ago" is not in the dictionary, then
> > it would be used as is, since no Hebrew alternative is provided;
> > it would also have 2 plural forms (as in English) instead of
> > 3 (as in Hebrew).
>
>
> I insist: could you show the catalog associated with this
> translation in English and in Hebrew? I'm sure
> I'm missing something, but I can't see what.

The Hebrew catalog looks like this:

he.po

# translation of foo.po to Hebrew
msgid ""
msgstr ""
"Project-Id-Version: foo\n"
"PO-Revision-Date: 2008-06-07 15:04+0300\n"
"Last-Translator: Artyom <artyomtnk_at_[hidden]>\n"
"Language-Team: Hebrew <en_at_[hidden]>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=3; plural= n==1 ? 0 : (n == 2 ? 1 : 2);\n"
"X-Generator: KBabel 1.11.4\n"

msgid "File was opened {1} day ago"
msgid_plural "File was opened {1} days ago"
msgstr[0] "Kovetz niftah lifney yom {1}"
msgstr[1] "Kovetz niftah lifney yomaim"
msgstr[2] "Kovetz niftah lifney {1} yamim"

English is the original string (the msgid), so it needs no catalog of its own.
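
The lookup-with-fallback behaviour can be sketched roughly as follows. This is only an illustration, not the Boost.Locale implementation: `Catalog`, `translate_plural` and `hebrew_catalog` are hypothetical names, the catalog is a plain in-memory map, and the `{1}` placeholder in the third Hebrew form is assumed to stand for the count.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for a loaded catalog (not a real API).
struct Catalog {
    // Plural rule taken from the "Plural-Forms" header: count -> msgstr index.
    std::function<std::size_t(unsigned long)> plural_index;
    // msgid -> list of msgstr[i] forms.
    std::map<std::string, std::vector<std::string>> entries;
};

// Returns the translated form for n. When the msgid is missing from the
// catalog, falls back to the original English strings, which have two
// forms: singular for n == 1, plural otherwise.
std::string translate_plural(const Catalog& cat, const std::string& single,
                             const std::string& plural, unsigned long n) {
    auto it = cat.entries.find(single);
    if (it != cat.entries.end()) {
        std::size_t i = cat.plural_index(n);
        if (i < it->second.size()) return it->second[i];
    }
    return n == 1 ? single : plural;
}

// The Hebrew entry from he.po, with its three-form rule written by hand.
Catalog hebrew_catalog() {
    Catalog cat;
    cat.plural_index = [](unsigned long n) -> std::size_t {
        return n == 1 ? 0 : (n == 2 ? 1 : 2);
    };
    cat.entries["File was opened {1} day ago"] = {
        "Kovetz niftah lifney yom {1}",
        "Kovetz niftah lifney yomaim",
        "Kovetz niftah lifney {1} yamim"  // {1} assumed to be the count
    };
    return cat;
}
```

With the Hebrew catalog, n == 2 selects the dual form "Kovetz niftah lifney yomaim"; with an empty catalog the same call returns the English plural string unchanged.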

In Japanese or Chinese (which have a single plural form) it would be:

ja.po
...
"Plural-Forms: nplurals=1; plural=0;\n"
...
msgid "File was opened {1} day ago"
msgid_plural "File was opened {1} days ago"
msgstr[0] "Some Japanese text {1} some Japanese text"
...

The "Plural-Forms:" entry of the meta record describes
the plural forms: how many there are, and a C expression
that calculates the form index from the parameter n.
The facet that loads the catalog parses this C expression
and so knows how to select the right form.
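
To make the parsing step concrete, here is a minimal sketch of how such a header could be read and evaluated. It is not the code used by any real facet: `PluralForms`, `parse_plural_forms` and `PluralExpr` are hypothetical names, only a subset of C operators is handled (?:, ||, &&, ==, !=, <, >, <=, >=, %, parentheses), and there is no error handling.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

struct PluralForms { unsigned long nplurals; std::string expr; };

// Extracts "nplurals=N" and the text of "plural=EXPR;" from the header line.
PluralForms parse_plural_forms(const std::string& header) {
    PluralForms pf{1, "0"};
    std::size_t p = header.find("nplurals=");
    if (p != std::string::npos) pf.nplurals = std::stoul(header.substr(p + 9));
    p = header.find("plural=");
    if (p != std::string::npos) {
        std::size_t end = header.find(';', p);
        pf.expr = header.substr(p + 7, end == std::string::npos ? end : end - (p + 7));
    }
    return pf;
}

// Recursive-descent evaluator for the operator subset listed above.
class PluralExpr {
public:
    explicit PluralExpr(std::string text) : text_(std::move(text)) {}
    // Returns the msgstr index for the count n.
    unsigned long operator()(unsigned long n) { pos_ = 0; n_ = n; return ternary(); }

private:
    // Skips whitespace, then consumes tok if it is next in the input.
    bool accept(const std::string& tok) {
        while (pos_ < text_.size() && std::isspace(static_cast<unsigned char>(text_[pos_]))) ++pos_;
        if (text_.compare(pos_, tok.size(), tok) != 0) return false;
        pos_ += tok.size();
        return true;
    }
    unsigned long primary() {
        if (accept("(")) { unsigned long v = ternary(); accept(")"); return v; }
        if (accept("n")) return n_;
        unsigned long v = 0;  // decimal literal
        while (pos_ < text_.size() && std::isdigit(static_cast<unsigned char>(text_[pos_])))
            v = v * 10 + static_cast<unsigned long>(text_[pos_++] - '0');
        return v;
    }
    unsigned long modulo() {
        unsigned long v = primary();
        while (accept("%")) { unsigned long d = primary(); v = d ? v % d : 0; }
        return v;
    }
    unsigned long relational() {
        unsigned long v = modulo();
        for (;;) {
            if (accept("<=")) { unsigned long r = modulo(); v = v <= r; }
            else if (accept(">=")) { unsigned long r = modulo(); v = v >= r; }
            else if (accept("<")) { unsigned long r = modulo(); v = v < r; }
            else if (accept(">")) { unsigned long r = modulo(); v = v > r; }
            else return v;
        }
    }
    unsigned long equality() {
        unsigned long v = relational();
        for (;;) {
            if (accept("==")) { unsigned long r = relational(); v = v == r; }
            else if (accept("!=")) { unsigned long r = relational(); v = v != r; }
            else return v;
        }
    }
    unsigned long logical_and() {
        unsigned long v = equality();
        while (accept("&&")) { unsigned long r = equality(); v = v && r; }
        return v;
    }
    unsigned long logical_or() {
        unsigned long v = logical_and();
        while (accept("||")) { unsigned long r = logical_and(); v = v || r; }
        return v;
    }
    unsigned long ternary() {
        unsigned long cond = logical_or();
        if (!accept("?")) return cond;
        unsigned long a = ternary();
        accept(":");
        unsigned long b = ternary();
        return cond ? a : b;
    }

    std::string text_;
    std::size_t pos_ = 0;
    unsigned long n_ = 0;
};
```

Feeding the Hebrew header above through `parse_plural_forms` and then `PluralExpr` gives index 0 for n == 1, index 1 for n == 2 and index 2 for everything else; the Japanese header always gives index 0.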

>
> format(translate("Format date with H-M-S","{1}, {2}, {3}"))
> % format(translate("Format date with H-M-S","{1} hour","{1} hours"))
> % format(translate("Format date with H-M-S","{1} minute","{1} minutes"))
> % format(translate("Format date with H-M-S","{1} second","{1} seconds"))
>
> The only problem I see is which character to use to split the string.
> Maybe % could be used
>
>
> translate("%1 hours%,% %2 minutes%,% %3 seconds") % h % m % s
>

I'm not sure exactly how you want to do this. Would

format(translate("Format date with H-M-S",
    "[{1} hour|{1} hours], [{2} minute|{2} minutes], [{3} second|{3} seconds]",
    h,m,s)) %h % m % s

be better? I don't know... I need to think about it, because
there may be more corner cases that I don't see yet.

> > C++0x has deprecated std::auto_ptr, which everybody
> > uses, and provides std::unique_ptr instead.
> >
> > Are you suggesting forcing a bad design onto a
> > good facet just because the bad one exists, even
> > though nobody uses it?
> >
> > I disagree. This std::messages facet should be
> > deprecated or even removed.
>
> No. I'm just saying that if you have valid arguments,
> it would be better to deprecate one and add one that
> is better. But having two catalogs is not good.

Yes, I recommend deprecating it.

> For example, if I want to make Chrono internationalizable,
> I can just use the std::messages facet until there is a better facet.

And it would work only on... Linux

- gcc supports locales only on Linux
- MSVC does not support messages at all.

That is the sad reality.

>
> > In order to make a useful TR2 proposal
> > you would have to break some new ground and
> > do things like:
> >
> > 1. Standardize locale names
> > 2. Standardize messages catalogs formats
> > 3. Rewrite some of existing facets
> > completely
> > 4. Deprecate some of the facets and functions.
> >
> > Items 3 and 4 are quite easy to do; however, items 1
> > and 2 would be very hard, if possible at all.
>
> So, are you saying that we cannot have a standard for localization that is
> anything other than implementation-defined?
>

I'm just saying it would not be simple at all,
especially since it requires some groundbreaking
changes, like using UTF-8 by default, adopting a
specific message catalog format, and so on.

Without that, locales would remain as useless as
they are today.

>
> Well, having better facets could be one step ahead.
>

But it would not be enough.

See, if the state of the implementation (not the interface)
of the current facets were good, they would
be a very, very useful part of C++,
even in their limited way... But it isn't.

> > This is what really concerns me about the standardization
> > of localization facilities.
>
> I really suggest that you participate in the
> standardization of a better locale library proposal;
> in the end, this is also one of the goals of Boost.
>

Yes, I understand, and I think it may have
a chance.
 
  Artyom

