Subject: Re: [boost] Boost.Locale and the standard "message" facet
From: Artyom (artyomtnk_at_[hidden])
Date: 2011-05-01 14:21:26
> From: Steve Bush <sb2_at_[hidden]>
>
> >Use of integer identifiers is the best
> >way to screw the localization in the software.
>
> >What does 3456 mean? Do you really think
> >it is good to write translate(MY_MESSAGE_OPENING_FILE)
>
> >No, never - never - never - never - never
> >use such "constant" or "integer" identifiers.
>
> >Always use natural text.
>
> I am not sure I fully understand this, but definitely I disagree with the
> idea that messages should never be integer identifiers.
>
> There is a principle in the database world that primary keys to records
> should be meaningless where the meaning can change over time. Imagine a text
> identifier "Close your hatch now" and over time, the very concept of "hatch"
> becomes meaningless - yet for all time the source code is condemned to have
> the original, now meaningless if not downright confusing identifier "Close
> your hatch now".
>
See notes below.
> In the windows world messages are generally numbers and there is
> considerable value in being able to search the web/documentation for error
> -23756472536476523 instead of some local language string which will only
> turn up comments in one language.
>
Note that the same holds for things like errno and strerror: the error is
represented by a code, and strerror (usually) converts it to natural text.
However, that is not always the whole story, as strerror may add more
information to the text than a simple key-value lookup would.

So even with error codes, the code is useful for representing a condition
to the program, while the text itself may be generated in a different way.
So basically it should be:

    switch(error_code) {
    case EINVAL:
        if(...)
            return gettext("first parameter is null");
        else if(...)
            return gettext("The range is invalid");
    }
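
The switch above can be sketched as compilable C++ under a few assumptions.
Here `tr` is only a stand-in for gettext that returns its argument unchanged,
and `describe_error` with its parameters `ptr`, `begin` and `end` is a
hypothetical function invented for illustration, as are the two fallback
messages. The point is that one error code (EINVAL) can yield different
natural-language messages depending on what actually went wrong.

```cpp
#include <cerrno>
#include <string>

// Stand-in for gettext(): a real program would look the text up in a
// message catalog; here the text is returned unchanged (assumption).
inline const char* tr(const char* msgid) { return msgid; }

// Hypothetical helper: the error code tells the program which class of
// failure occurred; the surrounding state decides which text is shown.
std::string describe_error(int error_code, const void* ptr, int begin, int end)
{
    switch (error_code) {
    case EINVAL:
        if (ptr == nullptr)
            return tr("first parameter is null");
        if (begin > end)
            return tr("The range is invalid");
        return tr("Invalid argument");      // illustrative fallback text
    default:
        return tr("Unknown error");         // illustrative fallback text
    }
}
```

A caller would pass the code together with whatever state it has at hand,
for example `describe_error(EINVAL, buffer, first, last)`.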
The same applies to many other APIs. Consider:

    int status = somesql_prepare_query(conn,"SELECT * FROM fooo");

Now even if the status is SOMESQL_PREPARATION_FAILED, the message returned by

    somesql_strerror(conn)

may actually be:

    "Unknown table `fooo'"
The fact that integer identifiers are used in the Windows API, and by
some (legacy) localization systems, does not mean that this is the way
it should be.

See the description below.
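
The somesql example above can be sketched roughly as follows. Every name
here (`Status`, `Connection`, `prepare_query`, `last_error`) is hypothetical
and only mimics the shape of the somesql_* calls; it is not any real library.
As a simplification, the function takes a table name instead of parsing SQL.
It shows a coarse status code for the program next to a detailed message
that is generated at run time and could not come from a fixed key-value table.

```cpp
#include <set>
#include <string>

// Hypothetical status codes: coarse conditions the program can test for.
enum class Status { Ok, PreparationFailed };

// Hypothetical connection object for a toy "database".
struct Connection {
    std::set<std::string> tables;   // tables known to the toy database
    std::string last_error;         // detailed text for the last failure
};

// Returns a status code for the program and stores a detailed,
// run-time generated message for the user.
Status prepare_query(Connection& conn, const std::string& table)
{
    if (conn.tables.count(table) == 0) {
        conn.last_error = "Unknown table `" + table + "'";
        return Status::PreparationFailed;
    }
    conn.last_error.clear();
    return Status::Ok;
}
```

The status alone says only that preparation failed; the table name in the
message exists only at run time.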
> Any text compiled into a program is essentially a constant exactly like any
> integer. Therefore the same rules apply equally to
>
Not every rule that applies to software also applies to
human interfaces and natural languages.
> Generally I think the idea of compiling any meaningful text whatsoever into
> object code is questionable from a theoretical basis and is usually just a
> hangover from when gettext/translate was a quick and dirty way to largely
> automate the localisation of existing mono-lingual programs - by simply
> wrapping all quoted text with a call to translate.
>
No, it is not.

Using natural-language identifiers has the following important
advantages:

1. It guarantees that the meaning of the text and the translation
   are always synchronized.
2. It makes the code much more readable.
3. It makes the code much more maintainable.
4. It makes it easier to detach the actual
   translator from the source code.

All modern localization systems provide natural-language
identifiers, and "constants" should never be used
for message formatting.

And this is not only my opinion but also the opinion
of many people who actually deal with localization.
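
Point 1 can be illustrated with a toy sketch. The catalogs below are plain
std::map objects and the function names `translate_text` and `translate_id`
are made up for this example; real systems load catalogs from files. The one
behaviour borrowed from gettext is that an unknown msgid falls back to the
msgid itself.

```cpp
#include <map>
#include <string>

// Toy catalogs (assumption: real systems load these from files).
using TextCatalog = std::map<std::string, std::string>; // msgid -> translation
using IdCatalog   = std::map<int, std::string>;         // numeric id -> translation

// Natural-text lookup: an unknown msgid falls back to the msgid itself,
// so the user still sees readable source-language text.
std::string translate_text(const TextCatalog& cat, const std::string& msgid)
{
    auto it = cat.find(msgid);
    return it != cat.end() ? it->second : msgid;
}

// Integer lookup: the id carries no text, so whatever is stored under it
// is returned, and a missing id yields nothing readable at all.
std::string translate_id(const IdCatalog& cat, int id)
{
    auto it = cat.find(id);
    return it != cat.end() ? it->second : std::string();
}
```

If a developer rewords "Open File" in the source, `translate_text` no longer
matches the old entry and shows the new source text until a translator
catches up. `translate_id` keeps returning the old translation for the same
id, with nothing to signal that the original wording has changed.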
Compare this code:

source.cpp

    MessageBox(translate(MSG_OPEN_FILE_TITLE),translate(MSG_OPEN_IMAGE_FILE_WEB));

resource.h

    #define MSG_OPEN_FILE_TITLE 1
    #define MSG_OPEN_IMAGE_FILE_WEB 2

English.txt

    1 "Open File"
    2 "Open the file with the image to Upload to the web site"

Hebrew.txt

    1 "פתח קובץ"
    2 "פתח קובץ שיועלה לאתר ברשת"

With this code:

source.cpp

    MessageBox(translate("File Dialog","Open File"),
               translate("Open the file with the image to Upload to the web site"));

he.po

    msgctxt "File Dialog"
    msgid "Open File"
    msgstr "פתח קובץ"

    msgid "Open the file with the image to Upload to the web site"
    msgstr "פתח קובץ שיועלה לאתר ברשת"
I hope it is clear now: constant keys
just create an additional level of indirection.

So never - never - never - never - never use artificial keys
unless you want to make really bad software and make
your software and translation teams miserable.

This is not a theoretical question about foreign keys in some
general database; it is a very pragmatic question
about how to do localization right.

And yes, in the early days of software localization
integer keys could be seen as a good method,
but nobody works this way today.
Artyom Beilis