From: Rainer Deyke (rainerd_at_[hidden])
Date: 2019-09-17 11:27:24

Next message: Peter Dimov: "Re: [program_options] Proposal: self-contained, header-only port of Boost Program Options library"
Previous message: Mateusz Loskot: "[GitHub] Checks not listed for Travis CI"
In reply to: Gavin Lambert: "Re: [program_options] Proposal: self-contained, header-only port of Boost Program Options library"
Next in thread: Peter Dimov: "Re: [program_options] Proposal: self-contained, header-only port of Boost Program Options library"
Reply: Peter Dimov: "Re: [program_options] Proposal: self-contained, header-only port of Boost Program Options library"

On 17.09.19 08:32, Gavin Lambert via Boost wrote:
> * On Unixes, argv contains whatever byte sequence the shell/caller put
> there. This might be the actual filename on disk (if they used tab
> completion) or it might be something subtly different (if they typed it
> themselves using some kind of IME), or even a binary blob. In the first
> two cases, while it is fairly *likely* to be UTF-8 (especially in modern
> systems), it is not guaranteed to be -- the user could be running a
> non-UTF-8 locale, or be accessing a filesystem created by someone who
> was.

Or the user could be running a non-UTF-8 locale, but accessing a
filesystem created by somebody who was using UTF-8 - in which case any
filenames should be in UTF-8, even if the user's locale disagrees.

It is because of this last possibility that I recommend treating all
command-line arguments as UTF-8 on Unix systems, even if running a
non-UTF-8 locale, for all cases where treating them as binary blobs is
impractical. Unix filenames are binary blobs, but the de facto standard
for interpreting these binary blobs as text is to use UTF-8. How can
two users, running two different locales, share a filesystem? By using
UTF-8 for all filenames, regardless of locale. How should a program
convert command-line arguments into UTF-8 filenames? By assuming that
they are already in UTF-8, because performing any kind of conversion
will cause more problems than it will fix.
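
As a rough sketch of what I mean (the names is_valid_utf8 and
argument_as_utf8 are made up for illustration and are not part of
Boost.Program_options or any proposed port): take the bytes from argv
exactly as they are, never consult the locale, and only check whether
they happen to be well-formed UTF-8. If they are not, the caller can
fall back to treating the argument as a binary blob.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: true if `bytes` is well-formed UTF-8.
// Rejects overlong encodings, surrogates, and code points above U+10FFFF.
bool is_valid_utf8(std::string_view bytes) {
    const std::size_t n = bytes.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(bytes[i]);
        if (b0 < 0x80) {  // ASCII, one byte
            ++i;
            continue;
        }
        std::size_t len = 0;
        unsigned long cp = 0;       // decoded code point
        unsigned long min_cp = 0;   // smallest code point allowed for this length
        if ((b0 & 0xE0) == 0xC0) {
            len = 2; cp = b0 & 0x1F; min_cp = 0x80;
        } else if ((b0 & 0xF0) == 0xE0) {
            len = 3; cp = b0 & 0x0F; min_cp = 0x800;
        } else if ((b0 & 0xF8) == 0xF0) {
            len = 4; cp = b0 & 0x07; min_cp = 0x10000;
        } else {
            return false;  // stray continuation byte or invalid lead byte
        }
        if (n - i < len) return false;  // sequence truncated at end of input
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(bytes[i + k]);
            if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        if (cp < min_cp) return false;                    // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // UTF-16 surrogate
        if (cp > 0x10FFFF) return false;                  // beyond Unicode range
        i += len;
    }
    return true;
}

// Hypothetical helper: interpret one argv entry as UTF-8 without any
// locale-dependent conversion. The bytes are returned unchanged if they are
// well-formed UTF-8; otherwise std::nullopt tells the caller to treat the
// argument as an opaque byte string instead.
std::optional<std::string> argument_as_utf8(const char* arg) {
    assert(arg != nullptr);
    const std::string_view bytes(arg);
    if (!is_valid_utf8(bytes)) return std::nullopt;
    return std::string(bytes);
}
```

Calling argument_as_utf8(argv[i]) for each argument gives either the
original bytes, usable directly as a UTF-8 filename, or nothing. At no
point is the argument transcoded, so a UTF-8 filename passed by a user
in a Latin-1 locale comes through intact.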

-- 
Rainer Deyke (rainerd_at_[hidden])

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk