Subject: [boost] [filesystem] Partial fix for POSIX locale("") problems; tickets 4688, 5100, 5289
From: Beman Dawes (bdawes_at_[hidden])
Date: 2011-07-02 11:33:03


AFAIK, the problems discussed here are non-issues for Windows and Mac
OS X, so you can ignore this post if you only care about those two
operating systems.

http://svn.boost.org/trac/boost/changeset/72855 fixes the problem of a
misconfigured POSIX system throwing an exception before main() has
started. See:

http://svn.boost.org/trac/boost/ticket/4688
http://svn.boost.org/trac/boost/ticket/5100
http://svn.boost.org/trac/boost/ticket/5289

There are at least two remaining problems:

* There may be other BSD-based operating systems that, like Mac OS X,
should not even try to use std::locale("") to get the codecvt facet
for conversion between wide and narrow strings.

If you know of such a system (FreeBSD? Solaris? Others?), please let
me know (1) the system, (2) the predefined macro to use to identify
the system, and (3) what codecvt facet should be used instead.

* For systems such as Linux, the code still assumes that if
std::locale("") throws, the best action is to let that exception
propagate up the call stack.

One of the tickets [5289] supplies a patch that tries various
alternatives, and eventually falls back on the Boost UTF-8 codecvt.
See below.

It strikes me that such fallbacks are better left to user code. User
code can always wrap a first use such as path(L"foo") in a try block
and, if it throws, imbue its preferred fallback facet. What do others
think?
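
A minimal sketch of what such a user-side fallback might look like, using
only the standard library. The helper name environment_locale_or is
hypothetical and is not part of Boost.Filesystem; the sketch assumes that
std::locale("") reports a misconfigured environment by throwing
std::runtime_error, which is how the standard specifies a failed
construction from a name.

```cpp
#include <locale>
#include <stdexcept>

// Hypothetical helper (not a Boost API): returns the locale named by the
// environment, or the supplied fallback when the C++ runtime cannot
// construct that locale.
std::locale environment_locale_or(const std::locale& fallback)
{
    try {
        return std::locale("");   // throws std::runtime_error on failure
    } catch (const std::runtime_error&) {
        return fallback;          // caller decides what "safe" means
    }
}
```

A program could call environment_locale_or(std::locale::classic()) early
in main(), or pass a locale carrying its preferred codecvt facet as the
fallback, and hand the result to path::imbue().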

--Beman

+  // Get default OS encoding
+  char const *plang = getenv("LC_CTYPE");
+  if(!plang)
+    plang = getenv("LC_ALL");
+  if(!plang)
+    plang = getenv("LANG");
+  if(!plang)
+    return global_loc;
+
+  std::string lang = plang;
+  size_t charset_start = lang.find('.');
+  if(charset_start == std::string::npos)
+    return global_loc;
+  charset_start ++;
+  // std::string::find takes the character first, then the start position
+  size_t end_of_charset = lang.find('@', charset_start);
+  std::string encoding = lang.substr(charset_start, end_of_charset - charset_start);
+  for(size_t i=0;i<encoding.size();i++) {
+    if('a' <= encoding[i] && encoding[i]<='z') {
+      encoding[i]=encoding[i]-'a' + 'A';
+    }
+  }
+  //
+  // Support at least the most popular and widely used encoding
+  //
+  if(encoding!="UTF-8" && encoding!="UTF8")
+    return global_loc;
+  std::locale loc(global_loc, new boost::filesystem::detail::utf8_codecvt_facet);
+  return loc;
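
For anyone who wants to experiment with the encoding detection outside of
Boost, the parsing step can be sketched as a standalone function. This is
an illustration only, not the code from the ticket: the names
ctype_locale_name, codeset_of and names_utf8 are hypothetical. Two details
are worth noting. std::string::find takes the character first and the
start position second. POSIX gives LC_ALL precedence over LC_CTYPE, which
in turn takes precedence over LANG, and treats an empty value as unset, so
the sketch consults the variables in that order.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: name of the locale that governs character
// classification, following POSIX precedence LC_ALL > LC_CTYPE > LANG.
// Returns an empty string when none of the variables is set and non-empty.
std::string ctype_locale_name()
{
    const char* const names[] = { "LC_ALL", "LC_CTYPE", "LANG" };
    for (std::size_t i = 0; i < 3; ++i) {
        const char* value = std::getenv(names[i]);
        if (value && *value)
            return value;
    }
    return std::string();
}

// Hypothetical helper: extracts the upper-cased codeset from a name of the
// form language[_territory][.codeset][@modifier]. Returns an empty string
// when the name carries no codeset.
std::string codeset_of(const std::string& locale_name)
{
    const std::size_t dot = locale_name.find('.');
    if (dot == std::string::npos)
        return std::string();

    const std::size_t start = dot + 1;
    const std::size_t at = locale_name.find('@', start);
    const std::size_t count =
        (at == std::string::npos) ? std::string::npos : at - start;
    std::string codeset = locale_name.substr(start, count);

    // ASCII-only upper-casing, independent of the current C locale.
    for (std::size_t i = 0; i < codeset.size(); ++i) {
        if ('a' <= codeset[i] && codeset[i] <= 'z')
            codeset[i] = static_cast<char>(codeset[i] - 'a' + 'A');
    }
    return codeset;
}

// True when the locale name asks for UTF-8 under either common spelling.
bool names_utf8(const std::string& locale_name)
{
    const std::string codeset = codeset_of(locale_name);
    return codeset == "UTF-8" || codeset == "UTF8";
}
```

Calling names_utf8(ctype_locale_name()) then answers the question the
patch asks before it installs the UTF-8 codecvt facet.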

