Boost logo

Boost :

Subject: Re: [boost] [locale] Formal review of Boost.Locale library starts tomorrow
From: Artyom (artyomtnk_at_[hidden])
Date: 2011-04-08 06:57:56


>

> - I get the impression that Artyom has been using the library
> extensively and it is better for that experience.
>

It is actually integrated part of CppCMS project.

For example this wiki (written in C++/CppCMS)

  http://art-blog.no-ip.info/wikipp/en/page/main

Is localized using Boost.Locale and it uses collation,
date/time formatting message translation and multiple
locales in same process.

So yes, I definitely use it all the time :-).

 
> Only one worry presented itself to me as I read the docs. Consider the
> following example:
>
> date_time some_point = period::year * 1995 + period::january +
> period::day*1;
>
> It looks like a period can be implicitly cast to a date_time. This
> seems dangerous. I feel the start of 1995 should be obtained more
> explicitly through the calendar.
>

Actually date_time by default initialized to current time. So if you
set a year 1995 it would be today's date and time back into 1995.

I prefer not to use explicit cast as it makes life much easier
for users and also

  period::year * 1995 + period::january + period::day*1

Is a periods set has its own type that are set later together.

Also small notices period::january is alias for period::month*0 and
has different type then period::month.

> The prose of the documentation is very good, and the tutorials do a good
> job of introducing the various aspects.
>
> I did not look at the reference in detail, but I am more concerned that
> it does not offer the necessary level of detail. Consider for example
> the documentation for normalize
>(<http://cppcms.sourceforge.net/boost_locale/html/group__convert.html#ga11672e3cc3ed7eecf1dd07060265aab2>).
>.
> I went to the reference for this because the tutorial
> (<http://cppcms.sourceforge.net/boost_locale/html/conversions.html>) did
> not explain. I see that it performs Unicode normalization, but there is
> no hint as to what value I may pass as the 'n' parameter. Where are
> these documented? This is a rhetorical question; I know the answer is
> "at the top of the page", but a link would be useful.
>

Strange that Doxygen didn't inserted like to norm_type automatically. I'll
add it and try to look over similar places.

> There are many terms used in the documentation which may not be familiar
> to all readers. e.g. what it means to "fold case" and what the
> definitions of the various normalization forms, collation levels, etc.
> are. Some terms are covered in the glossary, but not all and only
> briefly. There are some links to Wikipedia articles, but not many. I
> suggest that links to other resources such as more Wikipedia articles,
> Unicode Consortium documents, etc., especially in the glossary or in a
> new "references" appendix but also wherever else appropriate (perhaps
> linking via a list in the appendix).
>

Good point, I'll address it.

> I second the request for previous/next links. It's hard to browse
> without them.
>

Ok, I'll see how to do this, unfortunately Doxygen does not know
to do them automatically, but at the top you always have
a link upper and you can go forward.

> There are many code examples, most of which involve writing something to
> stdout, but few of them include examples of possible output. It would
> be extremely helpful to add example output (preferably more than one
> example where the output can vary by locale) to more of the examples.
>

Ok, I'll try to address this where I can.

> Assorted minor issues:
>
> In "Locale Generation":
>
> - You don't mention what character delimits multiple "variant" options
> in a locale identifier. Is this even possible? This suggests so, but
> later the boost::local::info::variant function implies that it is not
> possible.
>

Actually variant support vary by backend. Unix supports
predefined set like @latin or @euro and ICU handles them
well. Windows backends does not recognize them and
ICU provides its own syntax. Like

   calendar=hebrew;collation=phonetic

I'll add these examples.

> In "Conversions"
>
> - You should mention at least briefly what fold_case and normalize do.

Good point.

>
> In "Messages formatting":
>
> - There are four "The source text is not copied." with different
> emphasis. I think the second and fourth should read "The source text is
> copied.". In that same paragraph the formatting is wrong near "Plural
> form translation: Translate a message"; it's missing a newline and
> wrongly indented.
>

Ok, missed that you are right.

> - The example of multiple message domains defines some hypothetical
> domains, but doesn't use them in the example; it would be clearer to
> connect the two.
>
Good point. I'll address it.

> - In the full list of xgettext parameters most but not all of the
> function names given are links to their respective docs. Could they all be?

Actually it is something that Doxygen did for me implicitly, the point
is that u16* and u32* are hidden as they mostly not in use. I'll see if
this is possible to fix this.

>
> In "Localized Text Formatting":
>
> - You do not explain how to include a literal '{' character in the string.

Good point, forgot this, it is '{{' and '}}'
>
> - The syntax grammar seems wrong. 'key' seems to be being defined to be
> just one character, I guess there should be a '*'?
>

Yes should be +

> - "date, time , datetime" has a stray space after "time".
>

Ok I'll fix this.

> - It says "See as::ftime manipulator.", but the "as::ftime" is not a
> link. Could it be?

I'll add it.

>
> In "Working with dates, times, timezones and calendars.":
>
> - Just how general is the calendar format here?

What do you mean?

ICU supports several calendars (Hebrew, Islamic, Japanese, Chinese and some
more)
and their support vary by ICU version.

>
> In "Using Localization Backends":
>
> - You discuss on what platforms the standard library backend is useful.
> Where you say "on Linux with GCC or Intel compilers", would it be
> better to say "when using the GCC standard library, for example with
> GCC, the Intel compiler" since I guess e.g. clang (if used with the gcc
> std lib) would work too?
>

I can't tell, not tested with clang.

But I must say that toooo many standard libraries has broken locale
support. For example GCC's libstd++ supports locales only
on Linux based on libc's (now POSIX but in earlier days non-standard
functions like newlocale/duplocale).

So I would not be surprised that clang's version of libstd++ compiled
with only C/POSIX locales in.

For my experience there are very few standard C++ libraries that do this
well. Even stlport's locales quite broken (see faults of SunStudio
with stlport in tested platforms/compilers list)

> - A comment in an example has stray Doxygen markup: "select \c std
> backend as default", and again later: "since it is not supported by \c std".
>

Noted.

> Misc:
>
> - There was some earlier discussion of whether Boost.Locale should
> provide the ability to load gettext catalogs from other sources than the
> filesystem. Was this resolved? It might warrant a mention in the
> rationale.

Ok this is excellent question.

Few days ago I passed once again on older reviews and found out that
I forgot the request.

It is implemented and documented now in trunk but it
was fully tested only in last nightly build (as part of CppCMS)

  http://art-blog.no-ip.info/files/nightly-build-report.html

So I decided not to take a chance of breaking current release
and not included it in review version.

Basically there is an additional callback of type:

   boost::function<std::vector<char>(std::string const &file_name,std::string
const &encoding)>

In messages_info structure. See:

  http://cppcms.sourceforge.net/boost_locale/html/gnu__gettext_8hpp_source.html

So a custom file system support can be installed in few
lines of code (example exists in trunk as well)

>
> > Please explicitly state in your review whether the library should be
> > accepted.
>
> Yes it should.
>
> I send this review now since I suspect I will not get around to looking
> at the library in more detail or try it myself. If I do manage that I
> will be sure to report back further.
>
> For now, thanks very much to Artyom and good luck with the review.
>
> John Bytheway
>

Thank you very much for the review and many important
inputs.

Artyom


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk