Subject: Re: [Boost-bugs] [Boost C++ Libraries] #1678: Boost.property_tree::read_xml does not parse UNICODE file with BOMs
From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2011-02-18 17:28:32
#1678: Boost.property_tree::read_xml does not parse UNICODE file with BOMs
--------------------------------------+-------------------------------------
Reporter: tom | Owner: cornedbee
Type: Patches | Status: assigned
Milestone: Boost 1.47.0 | Component: property_tree
Version: Boost Development Trunk | Severity: Showstopper
Resolution: | Keywords: property_tree UNICODE BOM read_xml
--------------------------------------+-------------------------------------
Comment (by marshall):
I don't think this is right, especially in the case of UTF-16 (and
UTF-32).
For UTF-16:
* If there is a BOM, and it is "FE FF", then the rest of the UTF-16 must
be interpreted as "big endian".
* If there is a BOM, and it is "FF FE", then the rest of the UTF-16 must
be interpreted as "little endian".
I see no code here to do that. It just notes "Hey, there's a BOM here"
(assuming that the BOM matches the endianness of the processor that is
consuming the XML), and continues on.
See http://www.opentag.com/xfaq_enc.htm for more info.
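To illustrate the kind of handling I mean, here is a minimal sketch. It is
not the code from the patch and not part of property_tree; the names
Utf16Order, detect_utf16_bom and decode_utf16 are made up for this example.
It assumes the Unicode convention that the bytes FE FF mark big-endian and
FF FE mark little-endian UTF-16, and it only assembles 16-bit code units
(no surrogate validation, no UTF-32 handling).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical names, for illustration only.
enum class Utf16Order { Unknown, BigEndian, LittleEndian };

// Look at the first two bytes for a UTF-16 byte order mark.
// Note: FF FE 00 00 is also the UTF-32 little-endian BOM; this sketch
// does not try to tell the two apart.
Utf16Order detect_utf16_bom(const std::vector<unsigned char>& bytes) {
    if (bytes.size() >= 2) {
        if (bytes[0] == 0xFE && bytes[1] == 0xFF) return Utf16Order::BigEndian;
        if (bytes[0] == 0xFF && bytes[1] == 0xFE) return Utf16Order::LittleEndian;
    }
    return Utf16Order::Unknown;
}

// Assemble UTF-16 code units from raw bytes, honoring the BOM if present
// and skipping it. Without a BOM, 'fallback' decides the byte order
// (anything other than LittleEndian is read as big-endian).
// A trailing odd byte is ignored.
std::u16string decode_utf16(const std::vector<unsigned char>& bytes,
                            Utf16Order fallback = Utf16Order::BigEndian) {
    Utf16Order order = detect_utf16_bom(bytes);
    std::size_t pos = 0;
    if (order == Utf16Order::Unknown) {
        order = fallback;   // no BOM: rely on the caller's assumption
    } else {
        pos = 2;            // BOM found: do not emit it as content
    }

    const bool little = (order == Utf16Order::LittleEndian);
    std::u16string out;
    for (; pos + 1 < bytes.size(); pos += 2) {
        const unsigned hi = little ? bytes[pos + 1] : bytes[pos];
        const unsigned lo = little ? bytes[pos] : bytes[pos + 1];
        out.push_back(static_cast<char16_t>((hi << 8) | lo));
    }
    return out;
}
```

The point is that the byte order comes from the BOM rather than from the
machine running the parser, so the same file decodes identically on
big-endian and little-endian hosts.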
--
Ticket URL: <https://svn.boost.org/trac/boost/ticket/1678#comment:5>
Boost C++ Libraries <http://www.boost.org/>
Boost provides free peer-reviewed portable C++ source libraries.