Re: [Boost-bugs] [Boost C++ Libraries] #13402: Log format JUNIT generates invalid XML files with incorrect encoding

Subject: Re: [Boost-bugs] [Boost C++ Libraries] #13402: Log format JUNIT generates invalid XML files with incorrect encoding
From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2018-04-06 21:54:42

#13402: Log format JUNIT generates invalid XML files with incorrect encoding
-------------------------------+-------------------------------
  Reporter: gallien@… | Owner: Gennadiy Rozental
      Type: Bugs | Status: new
Milestone: To Be Determined | Component: test
   Version: Boost 1.66.0 | Severity: Problem
Resolution: | Keywords:
-------------------------------+-------------------------------

Comment (by Raffi Enficiaud):

I do not quite understand why you need to escape the Ö if your file is
encoded in UTF-8. So, I will ask dumb questions until I get it right.

According to this table, http://www.utf8-chartable.de/, the correct UTF-8
encoding of Ö / U+00D6 is the byte sequence "0xc3 0x96".

What about transforming your string to either

* `BOOST_TEST("Ölniveau" == oelniveau);` as you are saying your files are
written in UTF-8
* or `BOOST_TEST("\xc3\x96lniveau" == oelniveau);`
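
To double-check the byte sequence used in the second form, here is a minimal
sketch that encodes U+00D6 by hand and compares it with the escaped literal.
`encode_utf8_small` and `check_oe_literal` are hypothetical helpers written
for this illustration; they are not part of Boost.Test, and the encoder only
handles code points below U+0800.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: UTF-8 encoding for code points below U+0800 only.
// Returns an empty string for anything larger (not needed for U+00D6).
std::string encode_utf8_small(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {
        // 1-byte form: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // 2-byte form: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical check: the hand-encoded string matches the escaped literal.
bool check_oe_literal() {
    const std::string expected = "\xc3\x96lniveau";
    const std::string built = encode_utf8_small(0xD6) + "lniveau";
    assert(built.size() == 9);  // 2 bytes for the first letter + 7 ASCII bytes
    return built == expected;
}
```

The hex escapes are safe here only because the next character, `l`, is not a
hexadecimal digit; otherwise the escape would swallow the following characters.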

My gut feeling is that the preprocessor does something with the octal
representation of Ö.
`0xD6 0x6C 0x6E 0x69` seems to mean

* `0xD6` missing the following `00` for the Ö
* `0x6C` for the `l` of `Ölniveau`
* `0x6E` for the `n` of `Ölniveau`
* `0x69` for the `i` of `Ölniveau`
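
A rough way to see why those four bytes cannot be valid UTF-8: `0xD6` has the
bit pattern of a two-byte lead byte, so it must be followed by a continuation
byte of the form `10xxxxxx`, and `0x6C` is not one. The sketch below checks
only this lead/continuation structure. `looks_like_utf8` and
`ticket_bytes_are_invalid` are hypothetical names for illustration, and the
check deliberately ignores overlong forms and surrogates, so it is not a full
validator.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical, simplified structural check of UTF-8 byte sequences.
// It does NOT reject overlong encodings or surrogate code points.
bool looks_like_utf8(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;  // number of continuation bytes expected
        if (b < 0x80) extra = 0;                    // 0xxxxxxx
        else if ((b & 0xE0) == 0xC0) extra = 1;     // 110xxxxx
        else if ((b & 0xF0) == 0xE0) extra = 2;     // 1110xxxx
        else if ((b & 0xF8) == 0xF0) extra = 3;     // 11110xxx
        else return false;                          // stray continuation or invalid lead
        if (i + extra >= s.size()) return false;    // sequence is truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;   // not 10xxxxxx
        }
        i += extra + 1;
    }
    return true;
}

// Hypothetical check on the bytes quoted in the ticket.
bool ticket_bytes_are_invalid() {
    const std::string reported = "\xD6\x6C\x6E\x69";  // bytes seen in the log
    const std::string utf8 = "\xC3\x96\x6C\x6E\x69";  // UTF-8 for the same text
    assert(looks_like_utf8(utf8));
    return !looks_like_utf8(reported);
}
```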

The other possibility is that the file opened for the JUNIT output
interprets its content based on the locale. Would you mind also checking
this by changing the locale like this

{{{
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
export LANGUAGE=en_US.UTF-8
}}}

and rerun the check?

Thanks

-- 
Ticket URL: <https://svn.boost.org/trac10/ticket/13402#comment:11>
Boost C++ Libraries <http://www.boost.org/>
Boost provides free peer-reviewed portable C++ source libraries.
