Re: [Boost-bugs] [Boost C++ Libraries] #8883: property_tree JSON reader does not parse unicode characters properly

Subject: Re: [Boost-bugs] [Boost C++ Libraries] #8883: property_tree JSON reader does not parse unicode characters properly
From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2013-09-04 09:05:41


#8883: property_tree JSON reader does not parse unicode characters properly
----------------------------------+----------------------------------------
  Reporter: Ronny Krueger | Owner: cornedbee
  <rk@…> | Status: new
      Type: Bugs | Component: property_tree
 Milestone: To Be Determined | Severity: Problem
   Version: Boost 1.54.0 | Keywords: property_tree JSON unicode
Resolution: |
----------------------------------+----------------------------------------

Comment (by ecotax@…):

 @Lettort: There is a difference between Unicode, which specifies that 'ä'
 maps to code point U+00E4, and the various ways to encode this code point
 in bits or bytes. There is UTF-16, encoding this as 00E4 (16 bits, fits in
 a wide char), but also UTF-8, encoding this as two bytes, C3 A4.
 When parsing a \u00E4, the correct way to handle this depends on what
 encoding you want for your string.
 If you have a wide string and expect UTF-16, then yes, you'd expect the
 wide char 00E4.
 If you have a regular string and expect UTF-8, you'd expect the two bytes
 C3 A4.
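
 To make the two encodings concrete, here is a minimal sketch of encoding
 one code point either way. The function names encode_utf8 and
 encode_utf16 are made up for this illustration and are not part of
 property_tree; the input is assumed to be a valid code point (at most
 U+10FFFF, not a surrogate), and no validation is done.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: encode one code point as UTF-8 bytes.
// Assumes cp is a valid Unicode scalar value; no validation is done.
std::string encode_utf8(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: encode one code point as UTF-16 code units.
// Same assumption: cp is a valid Unicode scalar value.
std::u16string encode_utf16(std::uint32_t cp) {
    std::u16string out;
    if (cp < 0x10000) {                    // one 16-bit unit
        out += static_cast<char16_t>(cp);
    } else {                               // surrogate pair
        cp -= 0x10000;
        out += static_cast<char16_t>(0xD800 | (cp >> 10));
        out += static_cast<char16_t>(0xDC00 | (cp & 0x3FF));
    }
    return out;
}
```

 For U+00E4, encode_utf8 yields the two bytes C3 A4 and encode_utf16
 yields the single unit 00E4, matching the two cases above.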

 The original bug report states that when first writing and then reading
 'ä', the writer (defensively?) writes it as two \u escapes, one for each
 byte of the UTF-8 encoding. Regardless of whether this is the best
 choice, you'd want the reader to handle this in such a way that it
 'round-trips' as much as possible, which currently is not the case.
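
 As a rough illustration of what round-tripping would mean for a narrow
 string, here is a sketch of a writer that escapes each non-ASCII byte on
 its own, and a reader that turns such an escape back into the same byte.
 The names escape_bytes and unescape_bytes are hypothetical, this is not
 the property_tree code, and it ignores quotes, backslashes and control
 characters. Note the assumption built into the reader: it treats \u00C3
 as the raw byte C3, whereas in standard JSON it means code point U+00C3.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical writer: copy ASCII as-is, emit every byte >= 0x80 as its
// own \u00XX escape (the behaviour described in the bug report).
std::string escape_bytes(const std::string& in) {
    std::string out;
    for (unsigned char c : in) {
        if (c < 0x80) {
            out += static_cast<char>(c);
        } else {
            char buf[7];  // "\uXXXX" plus terminating NUL
            std::snprintf(buf, sizeof buf, "\\u%04X", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}

// Value of one hex digit, or -1 if c is not a hex digit.
int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical reader: map each \u0080..\u00FF escape back to one raw
// byte, so that output of escape_bytes round-trips. Anything else is
// copied through unchanged.
std::string unescape_bytes(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '\\' && i + 6 <= in.size() && in[i + 1] == 'u') {
            unsigned value = 0;
            bool ok = true;
            for (std::size_t k = i + 2; k < i + 6; ++k) {
                int h = hex_value(in[k]);
                if (h < 0) { ok = false; break; }
                value = value * 16 + static_cast<unsigned>(h);
            }
            if (ok && value >= 0x80 && value <= 0xFF) {
                out += static_cast<char>(value);
                i += 5;  // skip the rest of the escape
                continue;
            }
        }
        out += in[i];
    }
    return out;
}
```

 With these two, the UTF-8 bytes C3 A4 are written as \u00C3\u00A4 and
 read back as C3 A4 again.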

 BTW, for future questions/discussions, I guess a site like
 stackoverflow.com is more appropriate.

-- 
Ticket URL: <https://svn.boost.org/trac/boost/ticket/8883#comment:3>
Boost C++ Libraries <http://www.boost.org/>
Boost provides free peer-reviewed portable C++ source libraries.
