From: Gavin Lambert (boost_at_[hidden])
Date: 2022-08-16 22:30:38


On 16/08/2022 19:44, Peter Dimov wrote:
> Gavin Lambert wrote:
>> Using wchar_t on Windows is actually the least painful option. (And you don't
>> have to worry about locales and imbuements etc if you never try to convert to
>> not-wchar_t.)
>
> That's only if your program never runs on anything else. For portable code,
> using char and UTF-8 is the least painful option. We have an entire library
> in Boost for this purpose, whose documentation does a reasonable job
> explaining that.

Currently, yes. In theory, though, you could adopt a TCHAR-like
approach where you use wchar_t on Windows and char/char8_t on
not-Windows, selected at compile time. This would avoid all conversions
and just use the native character type of the OS, which would be better.
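As a rough sketch of what such a TCHAR-like approach could look like (the names native_char, native_string, NATIVE_TEXT and with_extension are made up for illustration; nothing like this exists in Boost or the standard library as far as this post is concerned):

```cpp
#include <cassert>
#include <string>

// Hypothetical names for illustration only; not an existing Boost or
// standard library facility.
#if defined(_WIN32)
using native_char = wchar_t;      // Windows APIs natively take wchar_t
#define NATIVE_TEXT(s) L##s
#else
using native_char = char;         // POSIX APIs take narrow byte strings
#define NATIVE_TEXT(s) s
#endif

// A plain alias, not a distinct type, so existing std::string or
// std::wstring overloads keep working with it.
using native_string = std::basic_string<native_char>;

// Code written against native_string is spelled the same on every platform.
inline native_string with_extension(const native_string& stem)
{
    return stem + NATIVE_TEXT(".txt");
}

// Small usage example; no conversion between encodings ever happens.
inline void example_usage()
{
    assert(with_extension(NATIVE_TEXT("report")) == NATIVE_TEXT("report.txt"));
}
```

The catch, as described further down, is that every function this code calls also has to accept native_string, or be overloaded for both character types.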

(Windows has its own version of the invalid characters problem -- it's
legal to have mismatched surrogates in filenames, which work fine as
long as you keep everything as raw 16-bit wchar_t code units and never
convert it, but break if you convert to UTF-8 and back, because an
unpaired surrogate has no valid UTF-8 encoding. It's probably less
common than not-UTF-8 non-Windows filenames, though.)
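To make the surrogate problem concrete, here is a small sketch that detects whether a sequence of 16-bit code units contains a mismatched surrogate. It uses char16_t rather than wchar_t so that it compiles the same way off Windows; the function name has_unpaired_surrogate is just an illustrative choice, not an existing API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true if the sequence contains a surrogate code unit that is not
// part of a valid (high, low) pair. Such a sequence is not well-formed
// UTF-16 and cannot be converted to valid UTF-8 and back unchanged.
inline bool has_unpaired_surrogate(const std::u16string& s)
{
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char16_t c = s[i];
        if (c >= 0xD800 && c <= 0xDBFF) {
            // High surrogate: must be followed by a low surrogate.
            if (i + 1 < s.size() && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
                ++i;  // valid pair, skip the low half
            } else {
                return true;
            }
        } else if (c >= 0xDC00 && c <= 0xDFFF) {
            // Low surrogate with no preceding high surrogate.
            return true;
        }
    }
    return false;
}

// Small usage example.
inline void surrogate_example()
{
    assert(!has_unpaired_surrogate(u"plain.txt"));
    // A lone high surrogate in the middle of a name.
    assert(has_unpaired_surrogate(std::u16string{u'a', char16_t(0xD800), u'b'}));
}
```

A name for which this returns true can still be passed around as 16-bit units without trouble; it only becomes a problem once something insists on strict UTF-8.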

The downside is that you need every single bit of code to either use
this TCHAR type (which in turn means that you need to be able to
recompile everything), or (better) to provide overloads for all possible
underlying types (with the same name, so that the actual code is spelled
the same either way), and some usages may need macros or char_traits
etc. (But then that tends to lead to either code duplication or
over-templating, neither of which is good.)
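For what it's worth, the "overloads with the same name" option might look something like the sketch below. The function strip_trailing_separators and its detail template are hypothetical, picked only to show the shape; the point is that callers spell the call identically for either character type, at the cost of a template behind the scenes.

```cpp
#include <cassert>
#include <string>
#include <utility>

namespace detail {
// One shared implementation, templated on the character type. This is the
// "over-templating" cost: it generally has to live in a header.
template <class Char>
std::basic_string<Char> strip_trailing_separators(std::basic_string<Char> path,
                                                  Char sep)
{
    // Keep at least one character so that a root of "/" survives.
    while (path.size() > 1 && path.back() == sep) {
        path.pop_back();
    }
    return path;
}
} // namespace detail

// Two overloads with the same name, one per underlying character type.
inline std::string strip_trailing_separators(std::string path)
{
    return detail::strip_trailing_separators(std::move(path), '/');
}

inline std::wstring strip_trailing_separators(std::wstring path)
{
    return detail::strip_trailing_separators(std::move(path), L'/');
}

// Small usage example: the call is spelled the same either way.
inline void overload_example()
{
    assert(strip_trailing_separators(std::string("dir///")) == "dir");
    assert(strip_trailing_separators(std::wstring(L"dir///")) == L"dir");
}
```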

Ideally, the standard library would have defined such a
platform-specific type alias (notably, not actually a distinct type, so
that existing overloads work), which would have made it easier to build
up libraries around it, or at least encourage writing both overloads.
Or the language would define some kind of compile-time variant that
lets overloads sharing the "same" implementation be defined in a
separate translation unit, without resorting to header-only templates.
Sadly that hasn't happened yet.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk