Subject: [boost] [filesystem] path thread safety fix impact on POSIX systems
From: Beman Dawes (bdawes_at_[hidden])
Date: 2011-12-30 08:48:59


Class path locale initialization has suffered from a data race for
several releases.
See https://svn.boost.org/trac/boost/ticket/6320 for an example of
code that suffers as a result.

The problem was introduced when locale initialization was changed from
namespace scope initialization to function scope initialization. For
Windows and Mac OS X, the fix is simply to change back to namespace
scope initialization.

For non-BSD-based POSIX systems such as Linux, the problem is more
complex. These systems need std::locale(""), "the locale-specific
native environment". Considerations:

* std::locale("") will throw if environment variables are configured
incorrectly. For example, setting LANG=foo on my Ubuntu system causes
std::locale("") to throw.

* std::locale("") is only needed if conversions between wide and
narrow character paths occur in the program, so it would be
unfortunate to have programs throw that don't actually do any such
conversion.

* With GCC, std::locale("") at namespace scope will throw before
main() has started! That prevents catching the exception in the user
code, and was what led to moving the initialization to a function
scope static. Initialization as a function scope static also meant
that the exception only occurred if user code actually performed
wide-narrow conversions.
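
To make the first consideration concrete, here is a minimal sketch of probing
whether a locale name can be constructed at all. The helper name
locale_is_constructible is hypothetical and is not part of Boost.Filesystem;
it relies only on the standard guarantee that std::locale(const char*) throws
std::runtime_error for a name it cannot resolve.

```cpp
#include <locale>
#include <stdexcept>

// Hypothetical helper, for illustration only.
// Returns true if a std::locale can be constructed from `name`.
// Passing "" probes the user's environment (LANG, LC_ALL, and so on).
bool locale_is_constructible(const char* name) {
    try {
        std::locale probe(name);  // throws std::runtime_error on a bad name
        (void)probe;              // only the construction matters here
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```

Because the call sits inside a function, the caller can catch the exception or
avoid it. The same constructor call in a namespace scope initializer would run
before main() and offer no such chance.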

I can see two possible fixes:

(A) Use function scope locale initialization, using
boost/detail/lightweight_mutex.hpp to prevent data races.

(B) Use namespace scope locale initialization, defaulting the codecvt
facet to UTF-8 if std::locale("") throws.
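
Option (A) could look roughly like the sketch below. It uses std::mutex from
C++11 as a stand-in for boost/detail/lightweight_mutex.hpp so that it compiles
without Boost. The names path_locale and path_locale_mutex are assumptions made
for illustration, not the actual Boost.Filesystem internals.

```cpp
#include <locale>
#include <mutex>
#include <stdexcept>

namespace sketch_a {

// Stand-in for boost::detail::lightweight_mutex. std::mutex has a constexpr
// constructor, so it is usable before any dynamic initialization runs.
std::mutex path_locale_mutex;

// Hypothetical accessor: builds the native locale on first use.
// Throws std::runtime_error if the environment is misconfigured. In that
// case the cached pointer stays null, so a later call tries again.
const std::locale& path_locale() {
    static const std::locale* cached = nullptr;  // constant-initialized
    std::lock_guard<std::mutex> lock(path_locale_mutex);
    if (cached == nullptr) {
        cached = new std::locale("");  // intentionally never deleted
    }
    return *cached;
}

}  // namespace sketch_a
```

The lock is taken on every call, which keeps the sketch simple. Programs that
never call path_locale() never construct std::locale("") and so never see the
exception.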

The advantage of (B) is that path always initializes without throwing,
and that's what users seem to expect. The initialization is correct
for all those whose environments are configured correctly, and for
those users who want UTF-8 even if their environments are
misconfigured. The POSIX users who prefer an exception on a
misconfigured environment can always add a std::locale("") at the
start of main().
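
A rough sketch of the fallback in (B) follows. The helper locale_or_fallback
and the variable path_locale are hypothetical names. To stay free of Boost
dependencies, the sketch falls back to std::locale::classic() as a placeholder;
the actual proposal would fall back to a locale carrying a UTF-8 codecvt facet.

```cpp
#include <locale>
#include <stdexcept>

namespace sketch_b {

// Hypothetical helper: try the named locale ("" means the native
// environment) and return `fallback` instead of propagating the exception.
std::locale locale_or_fallback(const char* name, const std::locale& fallback) {
    try {
        return std::locale(name);
    } catch (const std::runtime_error&) {
        return fallback;
    }
}

// Namespace scope initialization: runs before main(), but the
// std::runtime_error from a misconfigured environment does not escape.
// std::locale::classic() is only a placeholder for the UTF-8 default.
const std::locale path_locale = locale_or_fallback("", std::locale::classic());

}  // namespace sketch_b
```

The fallback is a parameter here only so that the sketch compiles without a
UTF-8 facet implementation.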

Comments or opinions?

--Beman

