Subject: Re: [boost] [review] Review of Nowide (Unicode) starts today
From: Artyom Beilis (artyom.beilis_at_[hidden])
Date: 2017-06-12 19:37:34


> But maybe I missed something here. If there really is a good reason for enforcing valid UTF-8 in some situation, please let me know :)
>

OK, I want to summarize my response regarding WTF-8, especially from
a security point of view.

There are a lot of dangers in this assumption: generating WTF-8 as
if it were UTF-8 can lead to serious issues.

It is almost never OK to generate invalid UTF-8, especially since
most likely 99% of users will not understand what we are talking
about with round-trip, WTF-8 and so on, and I'm talking from
experience.

Just to make clear how serious it can be, this is a CVE regarding one
bug in Boost.Locale:

    http://people.canonical.com/~ubuntu-security/cve/2013/CVE-2013-0252.html

Let's show a trivial example of what WTF-8 can lead to:

Tool:

    - A user monitoring system that monitors all files and sends a
report of all changes as XML to some central server

Denial of Service Attack Example:

    - User creates a file with an invalid UTF-16 name
    - System monitors the file system and adds the file to the XML
report in WTF-8 format
    - The central server rejects the XML since it fails UTF-8 validation
    - User does whatever he wants without being monitored
    - User removes the file
    - No reports were generated during the period the user needed: a DoS attack
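
To illustrate the validation step on the server side, here is a minimal
sketch of a strict UTF-8 check. It is not code from Nowide or
Boost.Locale; the function name is_valid_utf8 is made up for this
example. It only shows that the WTF-8 encoding of a lone surrogate
(for example ED A0 80 for U+D800) is rejected by a validator that
follows RFC 3629.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of any Boost library.
// Returns true only for well-formed UTF-8: no overlong forms, no code
// points above U+10FFFF and no encoded surrogates (U+D800..U+DFFF).
bool is_valid_utf8(const std::string& s)
{
    // Smallest code point that may legally use a sequence of each length.
    static const unsigned long min_cp[] = {0, 0, 0x80, 0x800, 0x10000};

    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t len;
        unsigned long cp;
        if (c < 0x80)                { len = 1; cp = c; }
        else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; }
        else return false;           // stray continuation or invalid lead byte

        if (n - i < len) return false;  // truncated sequence

        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char t = static_cast<unsigned char>(s[i + k]);
            if ((t & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (t & 0x3F);
        }

        if (cp < min_cp[len]) return false;               // overlong encoding
        if (cp > 0x10FFFF) return false;                  // outside Unicode
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // surrogate: WTF-8 only
        i += len;
    }
    return true;
}
```

A file name that went through a WTF-8 conversion and contains the bytes
ED A0 80 fails this check, so a consumer that validates its input drops
the whole report, which is exactly the step the attack relies on.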

Bottom line:

(a) Since 99% of programmers are barely aware of various Unicode
issues, it is dangerous to assume that offering such a round trip in
a trivial way is OK

(b) If you need to do complex manipulations of the file system, you
may consider Boost.Filesystem, which keeps its internal representation
in the native format and converts to UTF-8 for other uses (nowide
provides integration with Boost.Filesystem)

(c) It is possible to add WTF-8 support under the following restrictions:
(c.1) only by adding wtf8_to_wide and wide_to_wtf8, for when the user
wants it explicitly(!)
(c.2) No "C/C++"-like API should accept it, i.e.
nowide::fopen/nowide::fstream must not do it.
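
A rough sketch of what the split in (c) could look like. This is only
an illustration, not an existing Nowide API: the namespace sketch, the
use of std::u16string in place of the platform wide string, the
throwing behaviour of the strict function and the helper names are all
assumptions of this example. Only the name wide_to_wtf8 comes from the
proposal above.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

namespace sketch {  // hypothetical namespace, not boost::nowide

inline bool is_high(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
inline bool is_low(char16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

// Append one code point using the generic UTF-8 bit layout.
inline void append_cp(std::string& out, unsigned long cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

inline std::string convert(const std::u16string& in, bool allow_lone_surrogates)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        unsigned long cp = in[i];
        if (is_high(in[i]) && i + 1 < in.size() && is_low(in[i + 1])) {
            // Well-formed surrogate pair: combine into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (is_high(in[i]) || is_low(in[i])) {
            // Lone surrogate: invalid UTF-16.
            if (!allow_lone_surrogates)
                throw std::invalid_argument("invalid UTF-16: lone surrogate");
        }
        append_cp(out, cp);
    }
    return out;
}

// Default path: output is always valid UTF-8, bad input is an error.
inline std::string wide_to_utf8(const std::u16string& in) { return convert(in, false); }

// Explicit opt-in: output may be WTF-8, i.e. NOT valid UTF-8.
inline std::string wide_to_wtf8(const std::u16string& in) { return convert(in, true); }

}  // namespace sketch
```

The point of the design is that WTF-8 can only come out of a function
whose name says so, while everything behind a "C/C++"-like API keeps
going through the strict path.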

Artyom Beilis

