Subject: Re: [boost] [general] What will string handling in C++ look like in the future
From: Peter Dimov (pdimov_at_[hidden])
Date: 2011-01-21 05:21:41


Beman Dawes wrote:

> Why not just use Boost.Filesystem V3 for dealing with files and filenames?

The V3 path looks very reasonably designed and I can certainly understand
why it's the way it is. However...

Let's take Windows. In the vast majority of the use cases that call for
construction from a narrow string, this string is either (a) ANSI code page
encoded or (b) UTF-8 encoded. Of these, (b) are people doing the Right Thing,
and (a) are people doing the Wrong Thing or people who have to work with
people doing the Wrong Thing (not that there's anything wrong with that).

v3::path has the following constructors:

    path( Source );
    path( Source, codecvt_type const & cvt );

The first one uses std::codecvt<wchar_t, char, mbstate_t> to do the
conversion, which "converts between the native character sets for narrow and
wide characters" according to the standard. In other words, nobody knows for
sure what it does without consulting the source of the STL implementation du
jour, but one might expect it to use the C locale via mbtowc. This is a
reasonable approximation of what we need (to convert between ANSI and wide),
but pedants wouldn't consider it portable or reliable. It's also implicit,
so it makes it easy for people to do the wrong thing.
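
To make the "C locale via mbtowc" guess concrete, here is a minimal sketch of
that kind of conversion. It is not what any particular standard library or
Boost.Filesystem actually does. The function name widen_with_c_locale is
made up for illustration, and it uses std::mbrtowc (the restartable cousin of
mbtowc). The point is only that the result depends on whatever the global C
locale happens to be when the call is made.

```cpp
#include <cstddef>
#include <cwchar>
#include <stdexcept>
#include <string>

// Hypothetical helper: widens a narrow string using the *current* C locale.
// The same bytes can produce different wide strings (or fail) depending on
// what setlocale() was last called with, which is the implicit part.
std::wstring widen_with_c_locale(const std::string& narrow)
{
    std::wstring wide;
    std::mbstate_t state = std::mbstate_t();   // initial conversion state
    const char* p = narrow.data();
    const char* end = p + narrow.size();

    while (p != end) {
        wchar_t wc;
        std::size_t n = std::mbrtowc(&wc, p,
                                     static_cast<std::size_t>(end - p), &state);
        if (n == static_cast<std::size_t>(-1) ||
            n == static_cast<std::size_t>(-2)) {
            // -1: invalid sequence in this locale, -2: sequence cut short
            throw std::runtime_error("cannot convert in the current C locale");
        }
        if (n == 0) {
            n = 1;   // a null character was converted; assume it took one byte
        }
        wide.push_back(wc);
        p += n;
    }
    return wide;
}
```

Plain ASCII goes through unchanged in any common locale; anything outside
ASCII is at the mercy of the locale, and the call site gives no hint of that.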

The second one allows me to use an arbitrary encoding, which is good in that
I could pass it a utf8_codecvt or ansi_codecvt, if I find some buggy
versions on the Web or write them myself. But, since it considers all
encodings equally valid, it makes it hard for people to do the right thing.
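
For contrast, here is a rough sketch of what an explicit, UTF-8-only
conversion could look like. The helper utf8_to_wide is hypothetical; it is
not part of Boost.Filesystem or the standard library, and it is written by
hand precisely because, as noted above, no ready-made facet is assumed to be
at hand. It decodes strict UTF-8 and produces UTF-16 when wchar_t is 16 bits
wide (as on Windows) and UTF-32 otherwise.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decodes UTF-8 into a wide string, independent of any
// locale. Throws std::invalid_argument on malformed input.
std::wstring utf8_to_wide(const std::string& utf8)
{
    std::wstring out;
    const std::size_t n = utf8.size();
    std::size_t i = 0;

    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(utf8[i]);
        unsigned long cp;    // code point being assembled
        std::size_t len;     // total length of this sequence in bytes

        if (b0 < 0x80)                     { cp = b0;        len = 1; }
        else if (b0 >= 0xC2 && b0 <= 0xDF) { cp = b0 & 0x1F; len = 2; }
        else if ((b0 & 0xF0) == 0xE0)      { cp = b0 & 0x0F; len = 3; }
        else if (b0 >= 0xF0 && b0 <= 0xF4) { cp = b0 & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + len > n)
            throw std::invalid_argument("truncated UTF-8 sequence");

        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(utf8[i + k]);
            if ((b & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (b & 0x3F);
        }

        // Reject overlong forms, surrogates, and values past U+10FFFF.
        if ((len == 3 && cp < 0x800) || (len == 4 && cp < 0x10000) ||
            (cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
            throw std::invalid_argument("invalid UTF-8 code point");

        if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
            // 16-bit wchar_t: emit a UTF-16 surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<wchar_t>(cp));
        }
        i += len;
    }
    return out;
}
```

Since path( Source ) also accepts wide strings, the result of something like
this could be handed to the first constructor with no narrow-to-wide
conversion left for the library to guess at.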

