Subject: Re: [boost] Boost.Locale and the standard "message" facet
From: Artyom (artyomtnk_at_[hidden])
Date: 2011-05-03 01:54:25

> From: Vicente BOTET <vicente.botet_at_[hidden]>
> >
> > If "File was opened {1} day ago" is not in the dictionary, then
> > it would be used as is, since no Hebrew alternative is provided;
> > it would also have 2 plural forms (as in English) instead of
> > 3 (as in Hebrew).
> I insist, could you show the catalog associated with this
> translation in English and in Hebrew? I'm sure
> I'm missing something, and I can't see what.

The Hebrew catalog looks like this:


# translation of foo.po to Hebrew
msgid ""
msgstr ""
"Project-Id-Version: foo\n"
"PO-Revision-Date: 2008-06-07 15:04+0300\n"
"Last-Translator: Artyom <artyomtnk_at_[hidden]>\n"
"Language-Team: Hebrew <en_at_[hidden]>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=3; plural= n==1 ? 0 : (n == 2 ? 1 : 2);\n"
"X-Generator: KBabel 1.11.4\n"

msgid "File was opened {1} day ago"
msgid_plural "File was opened {1} days ago"
msgstr[0] "Kovetz niftah lifney yom {1}"
msgstr[1] "Kovetz niftah lifney yomaim"
msgstr[2] "Kovetz niftah lifney {1} yamim"

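The English text is the original string itself (the msgid
and msgid_plural above), so there is no separate English catalog.

In Japanese or Chinese (which have a single form) it would be: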

"Plural-Forms: nplurals=1; plural=0;\n"
msgid "File was opened {1} day ago"
msgid_plural "File was opened {1} days ago"
msgstr[0] "Some Japanese text {1} some Japanese text"

The "Plural-Forms:" entry of the meta record describes
the plural forms: how many there are, and a C expression
that computes the form index from the parameter n. The facet
that loads the catalog parses this C expression and so knows
how to compute the index.
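
As a rough illustration of what that computation amounts to, here is
a minimal sketch with the rules from the two catalogs above written
out by hand. The function names are made up for this example and are
not Boost.Locale API; a real facet parses the expression from the
catalog at run time instead of hard-coding it. The English rule is
the usual gettext one and is an assumption, since it does not appear
in the catalogs shown.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Rule copied from the Hebrew catalog header:
//   nplurals=3; plural= n==1 ? 0 : (n == 2 ? 1 : 2);
inline std::size_t hebrew_plural_index(unsigned long n)
{
    return n == 1 ? 0 : (n == 2 ? 1 : 2);
}

// Rule for a single-form language (the Japanese/Chinese example):
//   nplurals=1; plural=0;
inline std::size_t single_form_index(unsigned long /*n*/)
{
    return 0;
}

// Assumed English rule (the common gettext default):
//   nplurals=2; plural=(n != 1);
inline std::size_t english_plural_index(unsigned long n)
{
    return n != 1 ? 1 : 0;
}

// Hypothetical helper: pick msgstr[index] from the translated forms.
inline std::string select_form(const std::vector<std::string>& forms,
                               std::size_t index)
{
    assert(index < forms.size());  // the rule must not exceed nplurals
    return forms[index];
}
```

With the Hebrew forms above, hebrew_plural_index(2) yields 1, which
selects msgstr[1], and any n other than 1 or 2 selects msgstr[2].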

> format(translate("Format date with H-M-S","{1}, {2}, {3}"))
> % format(translate("Format date with H-M-S","{1} hour","{1} hours"))
> % format(translate("Format date with H-M-S","{1} minute","{1} minutes"))
> % format(translate("Format date with H-M-S","{1} second","{1} seconds"))
> The only problem I see is which character to use to split the string. Maybe %
> could be used:
> translate("%1 hours%,% %2 minutes%,% %3 seconds") % h % m % s

I'm not sure exactly how you want to do this. Would

format(translate("Format date with H-M-S",
    "[{1} hour|{1} hours], [{2} minute|{2} minutes], [{3} second|{3} seconds]",
    h,m,s)) % h % m % s

be better? I don't know... I need to think about it, because
there may be more corner cases that I don't see yet.
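
To make the bracket idea a bit more concrete, here is a minimal
sketch of how such a pattern could be expanded. It is only one
possible reading of the syntax: each "[singular|plural]" group is
assumed to contain a "{k}" placeholder whose value picks the form,
and the English rule (n == 1) is hard-coded where a real
implementation would consult the catalog. The names substitute and
expand_english are hypothetical, and error handling is left out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Replace each "{k}" (k is 1-based) with values[k - 1].
inline std::string substitute(const std::string& text,
                              const std::vector<long>& values)
{
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        std::size_t close =
            text[i] == '{' ? text.find('}', i) : std::string::npos;
        if (close == std::string::npos) {  // ordinary character
            out += text[i];
            continue;
        }
        std::size_t k = std::stoul(text.substr(i + 1, close - i - 1));
        out += std::to_string(values.at(k - 1));
        i = close;  // skip past the closing brace
    }
    return out;
}

// Expand "[singular|plural]" groups using the English rule only.
inline std::string expand_english(const std::string& pattern,
                                  const std::vector<long>& values)
{
    std::string out;
    std::size_t pos = 0;
    while (pos < pattern.size()) {
        std::size_t open = pattern.find('[', pos);
        std::size_t bar = open == std::string::npos
                              ? std::string::npos : pattern.find('|', open);
        std::size_t close = open == std::string::npos
                                ? std::string::npos : pattern.find(']', open);
        if (bar == std::string::npos || close == std::string::npos || bar > close) {
            // No complete group left: treat the rest as plain text.
            out += substitute(pattern.substr(pos), values);
            break;
        }
        out += substitute(pattern.substr(pos, open - pos), values);
        std::string singular = pattern.substr(open + 1, bar - open - 1);
        std::string plural = pattern.substr(bar + 1, close - bar - 1);
        // Assumption: the first "{k}" in the group names the deciding number.
        std::size_t k = std::stoul(singular.substr(singular.find('{') + 1));
        long n = values.at(k - 1);
        out += substitute(n == 1 ? singular : plural, values);
        pos = close + 1;
    }
    return out;
}
```

For the pattern above with h = 1, m = 2, s = 30, expand_english
returns "1 hour, 2 minutes, 30 seconds". Languages with more than
two forms would need more alternatives per group, which is one of
the corner cases that such a syntax has to deal with.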

> > C++0x deprecated std::auto_ptr, which everybody
> > uses, and introduced std::unique_ptr.
> >
> > Are you suggesting forcing a bad design onto a
> > good facet just because it exists, even though
> > nobody uses it?
> >
> > I disagree. This std::messages facet should be
> > deprecated or even removed.
> No. I'm just saying that if you have valid arguments,
> it would be better to deprecate one and add one that
> is better. But having two catalogs is not good.

Yes, I recommend deprecating it.

> For example, if I want to make Chrono internationalizable,
> I can just use the std messages facet until there is a better facet.

And it would work only on... Linux

- gcc supports locales only on Linux
- MSVC does not support messages at all.

That is the sad reality.

> > In order to make a useful TR2 proposal
> > you would have to break new ground and
> > do things like:
> >
> > 1. Standardize locale names
> > 2. Standardize message catalog formats
> > 3. Rewrite some of the existing facets
> > completely
> > 4. Deprecate some of the facets and functions.
> >
> > Items 3 and 4 are quite easy to do; however, 1
> > and 2 would be very hard, if possible at all.
> So, are you saying that we cannot have a standard for
> localization other than an implementation-defined one?

I'm just saying it would not be simple at all,
especially since it requires some groundbreaking
changes, like using UTF-8 by default, using a specific
message catalog format, and so on.

Without that, locales would remain as useless as
they are today.

> Well, having better facets could be one step ahead.

But it would not be enough.

See, if the implementation (not the interface)
of the current facets were in a good state, they
would be a very, very useful part of C++,
even in their limited way... But it isn't.

> > This is what really concerns me in the standardization
> > of localization facilities.
> I really suggest that you participate in the
> standardization of a better locale library proposal;
> in the end, this is also one of the goals of Boost.

Yes, I understand, and I think it may have
a chance.
