Boost logo

Boost :

Subject: [boost] [locale] Review of the proposed Boost.Locale library
From: Volker Lukas (vlukas_at_[hidden])
Date: 2011-04-16 18:42:48


Hello,

this is my review of the Locale library proposed for inclusion in
Boost.

I vote to accept the proposed Boost.Locale as a Boost library.

- What is your evaluation of the design?
The library is nicely structured into individual components, as far as
I can infer from the documentation and having a glance at some of the
source files.

I like that it is possible to use different backends, utilizing the
large ICU library when available.

- What is your evaluation of the implementation?
I was having only a superficial look at some of the source files, so I
can not comment much. What I have seen looks clean to me. While
reading the source code I think I could benefit from even more code
comments. There are a lot already, but on the other hand there exists
for example a source file libs/locale/src/shared/mo_lambda.cpp
apparently containing a whole state machine or something without any
word how it is used. I also stumbled over an odd comment in file
libs/locale/src/std/numeric.cpp in line 163: "// workaround common bug
- substitute NBSP with ordinary space". As a reader who did not
already write the code I can not infer much from comments such as this
one - I mean "What is the common bug then?" If this comment refers to
the situation that e.g. the locale implementation of GCCs standard
library implementation generates invalid UTF-8 when using NBSP as
thousands seperator, then the comment should say that. Well, my point
about comments is a minor issue of course. I do not suggest that the
author goes over all the code again just to add or change comments.

I am a bit worried over the size of the libraries binary. On GNU/Linux
x86_64 the built libboost_locale.so is almost 10MB large and codesize
according to the "size" utility is almost 1MB. This comes in addition
to the underlying ICU library. Is there any potential for reducing
codesize? Due to the same concern about codesize I have also one
concrete suggestion relating to the comparator template of the
collation component. I suggest to not make the default comparision
level a template parameter, instead to store it as a member. This way,
when instantiating container templates the number of instantiations
can be kept down. Or am I mistaken in thinking this would lead to more
compact code?

- What is your evaluation of the documentation?
It is nicely structured. I could easily find the documentation for any
specific component I was using and the documentation is also very
helpful if you have a task at hand and want to find out which component
to use (and whether such a component is available in the proposed
Boost.Locale).

There are some issues I have:
Phrasing is sometime odd. For example right in the beginning, under
the headline "What is Boost.Locale" there is a sentence which reads
"C++ offers a very good base for localization via the existing C++
locale facets std::num_put, std::ctype, std::collate etc. But these
are very limited and sometimes buggy by design." I can only guess what
that means. Superficially it seems like a contradiction to me,
something which is excellent can not be buggy by design, not even
sometimes. Okay, minor issue again, it just stands out because it is
only the fourth or fifth sentence or so.

There are some typos (I guess) where it is important to be correct:
Under "Messages Formatting (Translation)", subheading "Indirect
Message Translation" all functions are documented as "The source text
is not copied.". From the emphasis in the original document I think I
can conclude that some of these functions copy and some do not.

The reference section of the documentation should document the
requirements on the CharType which many functions/classes are
parameterized upon (can it be char/wchar_t only?).

The reference documentation for token_iterator and break_iterator
should not have individual entries for operator++, operator--,
operator== and other operators which are always required for
iterators, unless the documentation adds information. Instead, it
should just be stated to which iterator concepts these templates
conform.

Documentation for the dereferening operators (operator*) of these
templates states that the underlying sequences iterated over can not
be modified with these operators. As this is the case, could a usage
hint be incorporated stating that users should always use
const_iterators as arguments to these templates? If a user sometimes
employs for example std::string::iterator and sometimes
std::string::const_iterator this would lead to unnecessary code bloat.
Also, operator-> is missing from these templates.

- What is your evaluation of the potential usefulness of the library?
It is potentially very useful. The only objection I could conceive of
is that ICU already provides much of the functionality of
Boost.Locale. This library also duplicates functionality of Matthew
Wilsons excellent Fastformat library
(http://fastformat.sourceforge.net/), though the latter does not
feature localization support as complete as that of the proposed
Boost.Locale.

- Did you try to use the library? With what compiler? Did you have
any problems?
I used the CMake way to build this library on an Opensuse system on
x86_64. Compiler is GCC 4.5.1 (or a Svn snapshot roughly
corresponding). The Boost version built against is 1.46.1, as
delivered by Opensuse. I did not encounter any problems building the
library.

I played around with some of the components of the library and ran the
supplied test cases. There is one inconsistency I noted, the following
program generates output "Currency: $100.11" when the environment
variable LANG is set to "en_US.UTF-8" and variable LC_MONETARY is set
to "de_DE.UTF-8". Shouldn't the library honor LC_MONETARY? If LANG is
set to "de_DE.UTF-8" "Currency: 100,11 €" is put out.
Program:
-------------------------------------------------
#include <ios>
#include <iostream>
#include <ostream>
#include <locale>

#include "boost/locale.hpp"

int main() {
  namespace loc = boost::locale;
  using std::cout;

  loc::generator gen;
  cout.imbue(gen(""));
 
  cout << "Currency: " << loc::as::currency << 100.11 << '\n';
}
-------------------------------------------------

- How much effort did you put into your evaluation? A glance? A quick
reading? In-depth study?
Reading the tuturial section of the Documentation. Glancing over the
reference section. I also had a look at some of the source files, but
only superficially.

- Are you knowledgeable about the problem domain?
Only marginally.

With best regards,
Volker Lukas


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk