Boost logo

Boost :

Subject: Re: [boost] [locale] Formal review of Boost.Locale library starts tomorrow
From: Artyom (artyomtnk_at_[hidden])
Date: 2011-04-08 06:57:56


> - I get the impression that Artyom has been using the library
> extensively and it is better for that experience.

It is actually integrated part of CppCMS project.

For example this wiki (written in C++/CppCMS)

Is localized using Boost.Locale and it uses collation,
date/time formatting message translation and multiple
locales in same process.

So yes, I definitely use it all the time :-).

> Only one worry presented itself to me as I read the docs. Consider the
> following example:
> date_time some_point = period::year * 1995 + period::january +
> period::day*1;
> It looks like a period can be implicitly cast to a date_time. This
> seems dangerous. I feel the start of 1995 should be obtained more
> explicitly through the calendar.

Actually date_time by default initialized to current time. So if you
set a year 1995 it would be today's date and time back into 1995.

I prefer not to use explicit cast as it makes life much easier
for users and also

  period::year * 1995 + period::january + period::day*1

Is a periods set has its own type that are set later together.

Also small notices period::january is alias for period::month*0 and
has different type then period::month.

> The prose of the documentation is very good, and the tutorials do a good
> job of introducing the various aspects.
> I did not look at the reference in detail, but I am more concerned that
> it does not offer the necessary level of detail. Consider for example
> the documentation for normalize
> I went to the reference for this because the tutorial
> (<>) did
> not explain. I see that it performs Unicode normalization, but there is
> no hint as to what value I may pass as the 'n' parameter. Where are
> these documented? This is a rhetorical question; I know the answer is
> "at the top of the page", but a link would be useful.

Strange that Doxygen didn't inserted like to norm_type automatically. I'll
add it and try to look over similar places.

> There are many terms used in the documentation which may not be familiar
> to all readers. e.g. what it means to "fold case" and what the
> definitions of the various normalization forms, collation levels, etc.
> are. Some terms are covered in the glossary, but not all and only
> briefly. There are some links to Wikipedia articles, but not many. I
> suggest that links to other resources such as more Wikipedia articles,
> Unicode Consortium documents, etc., especially in the glossary or in a
> new "references" appendix but also wherever else appropriate (perhaps
> linking via a list in the appendix).

Good point, I'll address it.

> I second the request for previous/next links. It's hard to browse
> without them.

Ok, I'll see how to do this, unfortunately Doxygen does not know
to do them automatically, but at the top you always have
a link upper and you can go forward.

> There are many code examples, most of which involve writing something to
> stdout, but few of them include examples of possible output. It would
> be extremely helpful to add example output (preferably more than one
> example where the output can vary by locale) to more of the examples.

Ok, I'll try to address this where I can.

> Assorted minor issues:
> In "Locale Generation":
> - You don't mention what character delimits multiple "variant" options
> in a locale identifier. Is this even possible? This suggests so, but
> later the boost::local::info::variant function implies that it is not
> possible.

Actually variant support vary by backend. Unix supports
predefined set like @latin or @euro and ICU handles them
well. Windows backends does not recognize them and
ICU provides its own syntax. Like


I'll add these examples.

> In "Conversions"
> - You should mention at least briefly what fold_case and normalize do.

Good point.

> In "Messages formatting":
> - There are four "The source text is not copied." with different
> emphasis. I think the second and fourth should read "The source text is
> copied.". In that same paragraph the formatting is wrong near "Plural
> form translation: Translate a message"; it's missing a newline and
> wrongly indented.

Ok, missed that you are right.

> - The example of multiple message domains defines some hypothetical
> domains, but doesn't use them in the example; it would be clearer to
> connect the two.
Good point. I'll address it.

> - In the full list of xgettext parameters most but not all of the
> function names given are links to their respective docs. Could they all be?

Actually it is something that Doxygen did for me implicitly, the point
is that u16* and u32* are hidden as they mostly not in use. I'll see if
this is possible to fix this.

> In "Localized Text Formatting":
> - You do not explain how to include a literal '{' character in the string.

Good point, forgot this, it is '{{' and '}}'
> - The syntax grammar seems wrong. 'key' seems to be being defined to be
> just one character, I guess there should be a '*'?

Yes should be +

> - "date, time , datetime" has a stray space after "time".

Ok I'll fix this.

> - It says "See as::ftime manipulator.", but the "as::ftime" is not a
> link. Could it be?

I'll add it.

> In "Working with dates, times, timezones and calendars.":
> - Just how general is the calendar format here?

What do you mean?

ICU supports several calendars (Hebrew, Islamic, Japanese, Chinese and some
and their support vary by ICU version.

> In "Using Localization Backends":
> - You discuss on what platforms the standard library backend is useful.
> Where you say "on Linux with GCC or Intel compilers", would it be
> better to say "when using the GCC standard library, for example with
> GCC, the Intel compiler" since I guess e.g. clang (if used with the gcc
> std lib) would work too?

I can't tell, not tested with clang.

But I must say that toooo many standard libraries has broken locale
support. For example GCC's libstd++ supports locales only
on Linux based on libc's (now POSIX but in earlier days non-standard
functions like newlocale/duplocale).

So I would not be surprised that clang's version of libstd++ compiled
with only C/POSIX locales in.

For my experience there are very few standard C++ libraries that do this
well. Even stlport's locales quite broken (see faults of SunStudio
with stlport in tested platforms/compilers list)

> - A comment in an example has stray Doxygen markup: "select \c std
> backend as default", and again later: "since it is not supported by \c std".


> Misc:
> - There was some earlier discussion of whether Boost.Locale should
> provide the ability to load gettext catalogs from other sources than the
> filesystem. Was this resolved? It might warrant a mention in the
> rationale.

Ok this is excellent question.

Few days ago I passed once again on older reviews and found out that
I forgot the request.

It is implemented and documented now in trunk but it
was fully tested only in last nightly build (as part of CppCMS)

So I decided not to take a chance of breaking current release
and not included it in review version.

Basically there is an additional callback of type:

   boost::function<std::vector<char>(std::string const &file_name,std::string
const &encoding)>

In messages_info structure. See:

So a custom file system support can be installed in few
lines of code (example exists in trunk as well)

> > Please explicitly state in your review whether the library should be
> > accepted.
> Yes it should.
> I send this review now since I suspect I will not get around to looking
> at the library in more detail or try it myself. If I do manage that I
> will be sure to report back further.
> For now, thanks very much to Artyom and good luck with the review.
> John Bytheway

Thank you very much for the review and many important


Boost list run by bdawes at, gregod at, cpdaniel at, john at