Re: [Boost-bugs] [Boost C++ Libraries] #13402: Log format JUNIT generates invalid XML files with incorrect encoding

From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2018-04-06 17:56:11


#13402: Log format JUNIT generates invalid XML files with incorrect encoding
-------------------------------+-------------------------------
  Reporter: gallien@… | Owner: Gennadiy Rozental
      Type: Bugs | Status: new
 Milestone: To Be Determined | Component: test
   Version: Boost 1.66.0 | Severity: Problem
Resolution: | Keywords:
-------------------------------+-------------------------------

Comment (by sebastian.freitag@…):

 I just found this ticket after experiencing the same issue.

 tl;dr: Boost.Test writes single bytes into the JUnit XML output that are
 not valid UTF-8. For example, the German umlaut Ö is code point U+00D6,
 which UTF-8 encodes as the two bytes 0xC3 0x96, but it gets written into
 the file as the single byte 0xD6 (its Windows-1252 encoding). Only byte
 values < 128 are valid one-byte UTF-8 sequences.
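
 To make the byte arithmetic concrete, here is a minimal sketch of how a
 Latin-1 byte such as 0xD6 maps to its two-byte UTF-8 form. The helper name
 latin1_to_utf8 is hypothetical and is not part of Boost.Test. It assumes
 the input bytes are ISO-8859-1 code points; Windows-1252 agrees with that
 for 0xA0..0xFF (which covers Ö) but differs in 0x80..0x9F, and that range
 is not handled here.

```cpp
#include <string>

// Hypothetical helper, not part of Boost.Test.
// Re-encodes a string of ISO-8859-1 (Latin-1) bytes as UTF-8.
// Assumption: bytes 0x80..0x9F (where Windows-1252 differs from Latin-1)
// do not occur in the input.
std::string latin1_to_utf8(const std::string& in)
{
    std::string out;
    out.reserve(in.size() * 2);
    for (unsigned char c : in) {
        if (c < 0x80) {
            // ASCII is already a valid one-byte UTF-8 sequence.
            out.push_back(static_cast<char>(c));
        } else {
            // U+0080..U+00FF become two bytes: 110xxxxx 10xxxxxx.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

 With this sketch, latin1_to_utf8("\326lniveau") yields the bytes
 0xC3 0x96 followed by "lniveau", which is the form a UTF-8 XML file would
 need to contain.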

 How I found it:

 One of my tests is doing the following comparison:
 {{{
 // oelniveau is a std::string, previously read from a Windows-1252 encoded
 // text file.
 // Ö is escaped here as \326 because our source file is UTF-8: a literal Ö
 // would be stored as UTF-8, and comparing it with the variable would
 // fail even when it is supposed to pass.
 BOOST_TEST("\326lniveau" == oelniveau);
 }}}

 The JUNIT output then contains something like this (when I let it fail on
 purpose by putting "something" into the variable):
 {{{
 ASSERTION FAILURE:
 (...)
 - message: check "\326lniveau" == oelniveau has failed [?lniveau !=
 something]
 (...)
 }}}

 This is the file opened in an editor as UTF-8. The ? shows that the byte
 written for the Ö is not a valid UTF-8 sequence.
 xmllint also complains about the file:
 {{{
 result.xml: parser error : Input is not proper UTF-8, indicate encoding !
 Bytes: 0xD6 0x6C 0x6E 0x69
 }}}
 A typical JUnit plugin for Jenkins complains as well:
 {{{
 com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException:
 Invalid byte 1 of 1-byte UTF-8 sequence.
 }}}
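
 The check these parsers apply can be approximated with a small validator.
 The following is only a sketch of standard UTF-8 well-formedness rules, not
 the code xmllint or Xerces actually run; the function name
 first_invalid_utf8 is hypothetical. It returns the offset of the first
 offending sequence, or std::string::npos if the input is well-formed.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the offset of the first ill-formed UTF-8
// sequence in s, or std::string::npos if s is well-formed UTF-8.
std::size_t first_invalid_utf8(const std::string& s)
{
    // Smallest code point that may legally use a sequence of each length.
    static const unsigned long min_cp[5] = {0, 0, 0x80, 0x800, 0x10000};

    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;
        unsigned long cp = 0;
        if (b < 0x80)                { len = 1; cp = b; }
        else if ((b & 0xE0) == 0xC0) { len = 2; cp = b & 0x1F; }
        else if ((b & 0xF0) == 0xE0) { len = 3; cp = b & 0x0F; }
        else if ((b & 0xF8) == 0xF0) { len = 4; cp = b & 0x07; }
        else return i;  // stray continuation byte or invalid lead byte

        if (n - i < len) return i;  // sequence truncated at end of input

        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char t = static_cast<unsigned char>(s[i + k]);
            if ((t & 0xC0) != 0x80) return i;  // not a continuation byte
            cp = (cp << 6) | (t & 0x3F);
        }

        // Reject overlong forms, surrogates, and values above U+10FFFF.
        if (cp < min_cp[len] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF)) {
            return i;
        }
        i += len;
    }
    return std::string::npos;
}
```

 For the bytes from the xmllint message (0xD6 0x6C 0x6E 0x69 ...), 0xD6
 looks like the lead byte of a two-byte sequence, but 0x6C is not a
 continuation byte, so the sketch reports offset 0.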

-- 
Ticket URL: <https://svn.boost.org/trac10/ticket/13402#comment:9>
Boost C++ Libraries <http://www.boost.org/>
Boost provides free peer-reviewed portable C++ source libraries.
