Re: [Boost-bugs] [Boost C++ Libraries] #13402: Log format JUNIT generates invalid XML files with incorrect encoding

From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2018-04-06 21:54:42


#13402: Log format JUNIT generates invalid XML files with incorrect encoding
-------------------------------+-------------------------------
  Reporter: gallien@… | Owner: Gennadiy Rozental
      Type: Bugs | Status: new
 Milestone: To Be Determined | Component: test
   Version: Boost 1.66.0 | Severity: Problem
Resolution: | Keywords:
-------------------------------+-------------------------------

Comment (by Raffi Enficiaud):

 I do not fully understand why you need to escape the Ö if your file is
 encoded in UTF-8, so I will ask dumb questions until I get it right.

 According to this table http://www.utf8-chartable.de/, the correct UTF-8
 encoding of Ö (U+00D6) is the byte sequence `0xC3 0x96`.
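
 To double-check the table entry, here is a minimal sketch of the two-byte
 UTF-8 encoding rule. `Utf8Pair` and `encode_two_byte` are hypothetical names
 made up for this illustration; they are not part of Boost.Test, and the
 helper only handles code points in the range U+0080..U+07FF.

```cpp
#include <cstdint>

// The two UTF-8 bytes of a code point in the range U+0080..U+07FF.
struct Utf8Pair {
    std::uint8_t lead;   // bit pattern 110xxxxx
    std::uint8_t trail;  // bit pattern 10xxxxxx
};

// Hypothetical helper: the upper bits of the code point go into the lead
// byte, the lower six bits go into the trail byte.
constexpr Utf8Pair encode_two_byte(std::uint32_t cp) {
    return Utf8Pair{
        static_cast<std::uint8_t>(0xC0 | (cp >> 6)),
        static_cast<std::uint8_t>(0x80 | (cp & 0x3F))
    };
}

// U+00D6 (Ö) should come out as 0xC3 0x96; checked at compile time.
static_assert(encode_two_byte(0x00D6).lead == 0xC3, "lead byte of U+00D6");
static_assert(encode_two_byte(0x00D6).trail == 0x96, "trail byte of U+00D6");
```

 If either `static_assert` were wrong, the file would simply not compile.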

 What about changing your string to either

 * `BOOST_TEST("Ölniveau" == oelniveau);` since you say your files are
 written in UTF-8
 * or `BOOST_TEST("\xc3\x96lniveau" == oelniveau);`
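
 For the second variant, a small sketch of what the compiler should produce.
 The constant name `kOelniveau` is made up for this example. The point is
 that the hex escapes fix the bytes independently of how the source file
 itself is saved.

```cpp
// The escape \x96 ends at 'l' because 'l' is not a hexadecimal digit, so the
// literal starts with exactly the two bytes 0xC3 0x96.
constexpr char kOelniveau[] = "\xc3\x96lniveau";

// 2 bytes for Ö + 7 ASCII letters + the terminating NUL.
static_assert(sizeof(kOelniveau) == 10, "unexpected literal length");
static_assert(static_cast<unsigned char>(kOelniveau[0]) == 0xC3, "lead byte");
static_assert(static_cast<unsigned char>(kOelniveau[1]) == 0x96, "trail byte");
static_assert(kOelniveau[2] == 'l', "first ASCII letter");
```

 With the first variant the bytes depend on the encoding of the source file
 and on how the compiler reads it, which is why the escaped form is a useful
 cross-check.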

 My gut feeling is that the preprocessor does something with the octal
 representation of Ö.
 `0xD6 0x6C 0x6E 0x69` seems to mean

 * `0xD6` missing the following `00` for the Ö
 * `0x6C` for the `l` of `Ölniveau`
 * `0x6E` for the `n` of `Ölniveau`
 * `0x69` for the `i` of `Ölniveau`
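
 One way to see why such a byte run would be rejected in a file declared as
 UTF-8: in UTF-8, `0xD6` announces a two-byte sequence, but `0x6C` is not a
 continuation byte. Below is a rough sketch of a structural check. The names
 `looks_like_utf8`, `kReported` and `kExpected` are invented for this
 example, and the check deliberately ignores overlong forms and surrogates.
 It needs C++14 because of the loop in a `constexpr` function.

```cpp
#include <cstddef>

// A continuation byte has the bit pattern 10xxxxxx.
constexpr bool is_continuation(unsigned char b) { return (b & 0xC0) == 0x80; }

// Hypothetical structural check: every lead byte must be followed by the
// number of continuation bytes it announces.
constexpr bool looks_like_utf8(const unsigned char* s, std::size_t n) {
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b = s[i];
        std::size_t extra = 0;
        if (b < 0x80) extra = 0;                  // ASCII
        else if ((b & 0xE0) == 0xC0) extra = 1;   // 110xxxxx
        else if ((b & 0xF0) == 0xE0) extra = 2;   // 1110xxxx
        else if ((b & 0xF8) == 0xF0) extra = 3;   // 11110xxx
        else return false;                        // stray continuation or invalid byte
        if (i + extra >= n) return false;         // sequence is truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            if (!is_continuation(s[i + k])) return false;
        }
        i += extra + 1;
    }
    return true;
}

// The bytes quoted above, and the bytes expected for "Ölni" in UTF-8.
constexpr unsigned char kReported[] = {0xD6, 0x6C, 0x6E, 0x69};
constexpr unsigned char kExpected[] = {0xC3, 0x96, 0x6C, 0x6E, 0x69};

static_assert(!looks_like_utf8(kReported, sizeof(kReported)), "0xD6 0x6C is not UTF-8");
static_assert(looks_like_utf8(kExpected, sizeof(kExpected)), "0xC3 0x96 is UTF-8");
```

 For what it is worth, `0xD6` on its own is Ö in ISO-8859-1, which would fit
 a single-byte encoding ending up in the output somewhere.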

 The other possibility is that the file opened for the JUNIT output
 interprets its content based on the locale. Would you mind also changing
 the locale like this

 {{{
 export LC_ALL=en_US.UTF-8
 export LANG=en_US.UTF-8
 export LANGUAGE=en_US.UTF-8
 }}}

 and rerunning the check?

 Thanks

-- 
Ticket URL: <https://svn.boost.org/trac10/ticket/13402#comment:11>
