Subject: [boost] Unicode characters in filenames
From: Tom Kent (lists_at_[hidden])
Date: 2015-08-14 18:47:06


Recently there was a thread that ended up changing the Boost guidelines so
that Unicode characters are now allowed in C++ source files:
http://lists.boost.org/Archives/boost/2015/06/223822.php

However, in the 1.59 release there was a filename with a non-ASCII Unicode
character in it: libs\preprocessor\doc\Appendix A An Introduction to
Preprocessor Metaprogramming.html. Percent-encoded, the name actually looks
like: Appendix%20A%20%C2%A0%20An%20Introduction. Note the %C2%A0 sequence
(hex C2 A0, octal 302 240, which is the UTF-8 encoding of U+00A0 NO-BREAK
SPACE; Windows displays:  ).
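
For anyone who wants to check what that sequence is, here is a minimal Python sketch that percent-decodes the fragment quoted above and names every non-ASCII character in it. It assumes the name is UTF-8 encoded (which is what urllib.parse.unquote assumes by default); the variable names are only illustrative.

```python
from urllib.parse import unquote
import unicodedata

# Fragment of the percent-encoded filename quoted above.
encoded = "Appendix%20A%20%C2%A0%20An%20Introduction"

# unquote() decodes %XX escapes and interprets the bytes as UTF-8 by default.
decoded = unquote(encoded)

# Collect (index, code point, Unicode name) for every non-ASCII character.
non_ascii = [
    (i, f"U+{ord(ch):04X}", unicodedata.name(ch))
    for i, ch in enumerate(decoded)
    if ord(ch) > 127
]

print(non_ascii)  # [(11, 'U+00A0', 'NO-BREAK SPACE')]
```

So the odd character is a no-break space sitting between two ordinary spaces, which is why it is so easy to miss in a directory listing.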

Since this seems like a mistake, I've created a pull request for this in
Preprocessor. However, it raises the question:

Should we support Unicode code points in filenames in the Boost
distribution?

I would like the answer to be 'no', as there are still lots of tools
out there that don't correctly handle Unicode filenames. However, it is
worth bringing up the discussion. Is there a reason we would want Unicode
filenames? I would guess that some tests use them (especially the filesystem
tests), but I would also expect those tests to generate the files on
the fly, so that they aren't part of what is distributed.
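
If the answer does turn out to be 'no', a check along these lines could catch such names before a release. This is only a rough sketch, not an existing Boost tool: find_non_ascii_names is a hypothetical helper, and it simply walks a directory tree and reports any file or directory name that contains a non-ASCII character (it needs Python 3.7+ for str.isascii).

```python
import os

def find_non_ascii_names(root):
    """Return sorted paths (relative to root) whose own name is not pure ASCII.

    Hypothetical helper for illustration; not part of any Boost tooling.
    """
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Check directory names as well as file names.
        for name in dirnames + filenames:
            if not name.isascii():
                full = os.path.join(dirpath, name)
                hits.append(os.path.relpath(full, root))
    return sorted(hits)

if __name__ == "__main__":
    import sys
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path in find_non_ascii_names(root):
        # ascii() escapes invisible characters such as U+00A0 as \xa0.
        print(ascii(path))
```

Printing with ascii() makes characters like the no-break space visible in the output instead of rendering as a plain blank.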

Thoughts?

Tom Kent


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk