Subject: Re: [boost] An invalid XML character (Unicode: 0x8) problem because of property_tree::xml_parser::write_xml
From: Bjorn Reese (breese_at_[hidden])
Date: 2015-03-03 07:18:01


On 03/03/2015 04:11 AM, Rohan Shetty wrote:

> I was expecting write_xml (with "utf-8") to escape special characters (e.g. < replaced with &lt;) or strip any invalid characters (i.e. anything other than #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]).
> Is this part of write_xml()?

Please read the documentation:

  "RapidXML does not fully support the XML standard; it is not capable
   of parsing DTDs and therefore cannot do full entity substitution.

   [...]

   Please note that RapidXML does not understand the encoding
   specification. If you pass it a character buffer, it assumes the data
   is already correctly encoded; if you pass it a filename, it will read
   the file using the character conversion of the locale you give it (or
   the global locale if you give it none). This means that, in order to
   parse a UTF-8-encoded XML file into a wptree, you have to supply an
   alternate locale, either directly or by replacing the global one."

http://www.boost.org/doc/html/boost_propertytree/parsers.html
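
If characters such as 0x8 must not reach the output, one option is to filter strings yourself before storing them in the tree. The following is only a minimal sketch, not part of Boost.PropertyTree: the names is_xml10_char and strip_invalid_xml_bytes are hypothetical, and the byte-level helper assumes the input is already valid UTF-8 and only removes the disallowed single-byte control characters (it does not decode multi-byte sequences, so it will not catch U+FFFE, U+FFFF, or encoded surrogates).

```cpp
#include <string>

// True if cp is allowed by the character ranges quoted above:
// #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
inline bool is_xml10_char(char32_t cp) {
    return cp == 0x9 || cp == 0xA || cp == 0xD ||
           (cp >= 0x20 && cp <= 0xD7FF) ||
           (cp >= 0xE000 && cp <= 0xFFFD) ||
           (cp >= 0x10000 && cp <= 0x10FFFF);
}

// Hypothetical helper: drops disallowed control bytes (e.g. 0x8) from a
// string that is assumed to be valid UTF-8.
inline std::string strip_invalid_xml_bytes(const std::string& utf8) {
    std::string out;
    out.reserve(utf8.size());
    for (unsigned char b : utf8) {
        // Bytes >= 0x80 belong to multi-byte UTF-8 sequences; keep them as-is.
        if (b >= 0x80 || is_xml10_char(b)) {
            out.push_back(static_cast<char>(b));
        }
    }
    return out;
}
```

A value would then be cleaned before it goes into the tree, for example by passing strip_invalid_xml_bytes(value) to put() instead of the raw string.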