Subject: Re: [boost] Boost.Locale and the standard "message" facet
From: Artyom (artyomtnk_at_[hidden])
Date: 2011-05-01 14:21:26


> From: Steve Bush <sb2_at_[hidden]>
>
> >Use of integer identifiers is the best
> >way to screw the localization in the software.
>
> >What does 3456 means? Do you really think
> >it is good to write translate(MY_MESSAGE_OPENING_FILE)
>
> >No, never - never - never - never - never
> >use such "constant" or "integer" identifiers.
>
> >Always use natural text.
>
> I am not sure I fully understand this, but definitely I disagree with the
> idea that messages should never be integer identifiers.
>
> There is a principle in the database world that primary keys to records
> should be meaningless where the meaning can change over time. Imagine a text
> identifier "Close your hatch now" and over time, the very concept of "hatch"
> becomes meaningless - yet for all time the source code is condemned to have
> the original, now meaningless if not downright confusing identifier "Close
> your hatch now".
>

See notes below.

> In the windows world messages are generally numbers and there is
> considerable value in being able to search the web/documentation for error
> -23756472536476523 instead of some local language string which will only
> turn up comments in one language.
>

Note that the same holds for things like errno and strerror: the error is
represented by the code, and strerror (usually) converts it to natural text.

However, this is not always accurate, as strerror may add more information
to the text than a simple key-value lookup would.

So even with error codes, the code is useful for representing a condition
to the program, while the text itself may be generated in a different way.

So basically it should be:

  switch(error_code) {
  case EINVAL:
     if(...)
       return gettext("first parameter is null");
     else if (...)
       return gettext("The range is invalid");
  }

The same works for many other APIs.
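The switch above can be made concrete. This is only a minimal sketch under stated assumptions: `tr` is a stand-in for gettext() that returns its argument unchanged, and `check_range` and `describe` are hypothetical names invented for illustration, not part of any real API. It shows one error code (EINVAL) covering two different conditions, with the text chosen from the context rather than from the code alone.

```cpp
#include <cerrno>
#include <string>

// Stand-in for gettext(); a real build would call the library function.
inline const char* tr(const char* msgid) { return msgid; }

// Hypothetical API: validates a [first, last) range over a buffer.
// Returns 0 on success or EINVAL; the code alone does not say which check failed.
inline int check_range(const int* data, int first, int last) {
    if (data == nullptr) return EINVAL;
    if (first < 0 || last < first) return EINVAL;
    return 0;
}

// Picks the message from the error code plus the context that produced it.
inline std::string describe(int error_code, const int* data) {
    switch (error_code) {
    case 0:
        return tr("Success");
    case EINVAL:
        if (data == nullptr)
            return tr("first parameter is null");
        return tr("The range is invalid");
    default:
        return tr("Unknown error");
    }
}
```

The code stays useful for the program's control flow, while the text shown to the user is still a natural-language string passed through the translation call.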
Consider

  int status = somesql_prepare_query(conn,"SELECT * FROM fooo");

Now, even if the status is SOMESQL_PREPARATION_FAILED, the message

  somesql_strerror(conn)

may actually return:

  "Unknown table `fooo'"

The fact that integer identifiers are used in the Windows API, and by some
(legacy) localization systems, does not mean that this is the way it should be.

See the description below.

> Any text compiled into a program is essentially a constant exactly like any
> integer. Therefore the same rules apply equally to
>

Not every rule that applies to software also applies to human
interfaces and natural languages.

> Generally I think the idea of compiling any meaningful text whatsoever into
> object code is questionable from a theoretical basis and is usually just a
> hangover from when gettext/translate was a quick and dirty way to largely
> automate the localisation of existing mono-lingual programs - by simply
> wrapping all quoted text with a call to translate.
>

No, it is not.

Having natural-language identifiers has the following important advantages:

1. It guarantees that the meaning of the text and the translation
   are always synchronized.
2. It makes the code much more readable.
3. It makes the code much more maintainable.
4. It makes it easier to detach the actual translator from the source code.

All modern localization systems provide natural-language identifiers.

And "constants" should never be used for message formatting.

And this is not only my opinion but also the opinion of many people
who actually deal with localization.
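To illustrate the first advantage, here is a minimal sketch of a lookup keyed by natural text. `Catalog` is a hypothetical in-memory class written for this example; it is not Boost.Locale's implementation, and a real system would load its entries from message catalog files. The assumption it demonstrates is the gettext-style fallback: when no translation exists, the source text itself is returned, so the user still sees a meaningful message.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical in-memory catalog keyed by (context, source text).
class Catalog {
public:
    void add(const std::string& context, const std::string& msgid,
             const std::string& msgstr) {
        entries_[std::make_pair(context, msgid)] = msgstr;
    }

    // Returns the translation, or the source text itself when none exists.
    std::string translate(const std::string& context,
                          const std::string& msgid) const {
        auto it = entries_.find(std::make_pair(context, msgid));
        return it != entries_.end() ? it->second : msgid;
    }

    // Convenience overload for messages without a context.
    std::string translate(const std::string& msgid) const {
        return translate("", msgid);
    }

private:
    std::map<std::pair<std::string, std::string>, std::string> entries_;
};
```

With an integer key, a missing entry leaves nothing to show; with a text key, a call such as translate("Save File") on a catalog that lacks the entry simply yields "Save File".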
Compare the code:

source.cpp

  MessageBox(translate(MSG_OPEN_FILE_TITLE),translate(MSG_OPEN_IMAGE_FILE_WEB));

resource.h

  #define MSG_OPEN_FILE_TITLE 1
  #define MSG_OPEN_IMAGE_FILE_WEB 2

English.txt

  1 "Open File"
  2 "Open the file with the image to Upload to the web site"

Hebrew.txt

  1 "פתח קובץ"
  2 "פתח קובץ שיועלה לאתר ברשת"

With the code:

source.cpp

  MessageBox(translate("File Dialog","Open File"),
             translate("Open the file with the image to Upload to the web site"));

he.po

  msgctxt "File Dialog"
  msgid "Open File"
  msgstr "פתח קובץ"

  msgid "Open the file with the image to Upload to the web site"
  msgstr "פתח קובץ שיועלה לאתר ברשת"

I hope it is clear now. Constant keys just create an additional
level of indirection.

So, Never-Never-Never-Never-Never use artificial keys unless you want
to make really bad software and make your software and translation teams
miserable.

This is not a theoretical question about foreign keys in some general
database; it is a very pragmatic question about how to do localization right.

And yes, in the early days of software localization integer keys could be
seen as a good method, but nobody works this way today.

Artyom Beilis


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk