Subject: Re: [boost] [filesystem] path thread safety fix impact on POSIX systems
From: Alf P. Steinbach (alf.p.steinbach+usenet_at_[hidden])
Date: 2011-12-30 10:03:12


On 30.12.2011 14:48, Beman Dawes wrote:
> Class path locale initialization has suffered from a data race for
> several releases.
> See https://svn.boost.org/trac/boost/ticket/6320 for an example of
> code that suffers as a result.
>
> The problem was introduced when locale initialization was changed from
> namespace scope initialization to function scope initialization. For
> Windows and Mac OS X, the fix is simply to change back to namespace
> scope initialization.
>
> For non-BSD based POSIX systems such as Linux, the problem is more
> complex. These systems need std::locale(""), "the locale-specific
> native environment". Considerations:
>
> * std::locale("") will throw if environment variables are configured
> incorrectly. For example, setting LANG=foo on my Ubuntu system causes
> std::locale("") to throw.
>
> * std::locale("") is only needed if conversions between wide and
> narrow character paths occur in the program, so it would be
> unfortunate to have programs throw that don't actually do any such
> conversion.
>
> * With GCC, std::locale("") at namespace scope will throw before
> main() has started! That prevents catching the exception in the user
> code, and was what led to moving the initialization to a function
> scope static. Initialization as a function scope static also meant
> that the exception only occurred if user code actually performed
> wide/narrow conversions.
>
> I can see two possible fixes:
>
> (A) Use function scope locale initialization, using
> boost/detail/lightweight_mutex.hpp to prevent data races.
>
> (B) Use namespace scope locale initialization, defaulting the codecvt
> facet to UTF-8 if std::locale("") throws.
>
> The advantage of (B) is that path always initializes without throwing,
> and that's what users seem to expect. The initialization is correct
> for all those whose environments are configured correctly, and for
> those users who want UTF-8 even if their environments are
> misconfigured. The POSIX users who prefer an exception on a
> misconfigured environment can always add a std::locale("") at the
> start of main().

The problem with solution (B) is IMHO not that it lies, but that it
/covers up/ a problem. The problem -- misconfiguration -- is still there
but the user is made unaware of it. That's ungood.
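
To make the concern concrete, here is a minimal sketch of the fallback that (B) implies. It is not the Boost.Filesystem code: choose_locale and locale_choice are hypothetical names, and the classic "C" locale merely stands in for the UTF-8 codecvt default that Beman describes. The only thing that tells the caller something went wrong is the fell_back flag, and (B) as proposed would not expose even that.

```cpp
#include <cassert>
#include <locale>
#include <stdexcept>

// Hypothetical result type: the locale in use, plus whether the
// native-environment locale could not be constructed.
struct locale_choice {
    std::locale loc;
    bool fell_back;
};

// Hypothetical helper. make_native is any callable returning a std::locale,
// for example []{ return std::locale(""); }.
template <class MakeLocale>
locale_choice choose_locale(MakeLocale make_native) {
    try {
        return locale_choice{make_native(), false};
    } catch (const std::runtime_error&) {
        // The misconfiguration is swallowed here. Option (B) would install
        // a UTF-8 codecvt facet at this point; the classic locale is only
        // a placeholder for that in this sketch.
        return locale_choice{std::locale::classic(), true};
    }
}
```

A caller would write choose_locale([]{ return std::locale(""); }) and, unless it checks fell_back, never learn that LANG was bogus.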

So I would favor (A). The problem with that is then efficiency, or
perceived inefficiency. But so what.
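
For reference, a rough sketch of what (A) amounts to, under some assumptions: path_locale is a hypothetical accessor name, and std::mutex is used here only so the example is self-contained, where Beman's proposal names boost/detail/lightweight_mutex.hpp. (With a C++11 compiler a plain function-local static is already initialized thread-safely, so the explicit lock mainly illustrates what pre-C++11 code has to do.)

```cpp
#include <cassert>
#include <locale>
#include <mutex>
#include <stdexcept>

// Stand-in for boost::detail::lightweight_mutex. std::mutex has a constexpr
// constructor, so this object is ready before any dynamic initialization.
static std::mutex path_locale_mutex;

// Hypothetical accessor: constructs std::locale("") on first use, under a
// lock. If construction throws std::runtime_error, the exception reaches
// the caller (who is inside main() by then and can catch it), the cached
// pointer stays null, and the next call tries again.
const std::locale& path_locale() {
    static std::locale* cached = nullptr;  // intentionally never deleted
    std::lock_guard<std::mutex> lock(path_locale_mutex);
    if (!cached) {
        cached = new std::locale("");
    }
    return *cached;
}
```

The cost is one lock acquisition per call, which is the "nano-efficiency" at issue; a program that never converts between wide and narrow paths never calls the accessor and so never sees the exception.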

I say, go for correctness, and don't fret about the nano-efficiency. It
could be different if the question was about some new clean thing; then
it would warrant some redesign (mutable globals in the age of
multi-processing aren't that bright an idea, really). But for just
supporting the old unclean stuff -- don't fret about the nano-efficiency.

Cheers,

- Alf

