Subject: Re: [boost] [review] Review of Nowide (Unicode) starts today
From: Peter Dimov (lists_at_[hidden])
Date: 2017-06-12 18:51:55


Yakov Galka wrote:

> Yeah, so? I say that the library can provide a Windows -> std::string ->
> Windows roundtrip just as it does with any other platform. If FreeBSD ->
> std::string conversion can return invalid UTF-8, then so does Windows ->
> std::string conversion.

The security concern here is that under FreeBSD the file name is what it is,
and different byte sequences refer to different names, whereas under Windows,
if invalid UTF-8 is allowed, many different byte sequences may map to the
same file name.

This does not necessarily preclude handling unpaired surrogates, though. In
practice the main problem is probably with overlong encodings of '.', '/' and
'\'.

Last time this came up, I argued that if you rely on finding '.' as the
literal 8-bit '.', your input validation is wrong anyway, but requiring
strictly valid UTF-8 is a reasonable first line of defense.
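
To make the overlong case concrete, here is a minimal sketch of a strict
check. The names is_strict_utf8 and check_overlong_examples are made up for
illustration and are not part of Nowide; the point is only that a strict
validator rejects the two-byte forms C0 AE, C0 AF and C1 9C, which a lenient
decoder would turn into '.', '/' and '\'.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Nowide: returns true only for strictly
// valid UTF-8 (no overlong forms, no surrogates, nothing above U+10FFFF).
bool is_strict_utf8(const std::string& s)
{
    std::size_t i = 0;
    const std::size_t n = s.size();

    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;
        char32_t cp = 0;
        char32_t min = 0; // smallest code point that needs 'len' bytes

        if (b0 < 0x80) { ++i; continue; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; min = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; min = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; min = 0x10000; }
        else return false; // stray continuation byte or invalid lead byte

        if (n - i < len) return false; // truncated sequence

        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80) return false; // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }

        if (cp < min) return false;                      // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate
        if (cp > 0x10FFFF) return false;                 // out of range

        i += len;
    }
    return true;
}

// Usage sketch: the overlong forms of '.', '/' and '\' are all rejected.
void check_overlong_examples()
{
    assert(is_strict_utf8("a.txt"));
    assert(!is_strict_utf8("\xC0\xAE")); // overlong '.'
    assert(!is_strict_utf8("\xC0\xAF")); // overlong '/'
    assert(!is_strict_utf8("\xC1\x9C")); // overlong backslash
}
```

A check like this only guarantees one byte sequence per code point; it does
not replace OS-specific validation of the resulting name.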

And realistically, if you want to validate the input in the correct manner,
you have to #ifdef for each OS anyway, which kind of makes the library
redundant. So in the specific use case where you _do_ use the library to
avoid #ifdef'ing, it does make sense for it to protect you from invalid
UTF-8 on Windows.

With all that said, I don't quite see the concern with WTF-8. What's the
attack we're defending against by disallowing it?
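
For reference, a rough sketch of what WTF-8 encoding amounts to, assuming the
usual definition: UTF-8 generalized so that an unpaired surrogate is encoded
as three bytes, while a proper high/low pair is always combined into one
four-byte sequence. The names to_wtf8 and check_wtf8_examples are
hypothetical and not part of Nowide.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Nowide: encodes potentially ill-formed
// UTF-16 as WTF-8 (assumed definition, see the text above).
std::string to_wtf8(const std::u16string& in)
{
    std::string out;

    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];

        // A high surrogate followed by a low surrogate is one code point.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()) {
            const char32_t lo = in[i + 1];
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;
            }
        }

        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            // This branch also handles unpaired surrogates (U+D800..U+DFFF).
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Usage sketch: a lone surrogate becomes three bytes, a pair becomes four.
void check_wtf8_examples()
{
    const std::u16string lone{char16_t(0xD800)};
    const std::u16string pair{char16_t(0xD83D), char16_t(0xDE00)}; // U+1F600

    assert(to_wtf8(lone) == "\xED\xA0\x80");
    assert(to_wtf8(pair) == "\xF0\x9F\x98\x80");
}
```

Under this definition each UTF-16 sequence has exactly one encoding, which is
the property the overlong forms lack.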

