Subject: Re: [boost] An invalid XML character (Unicode: 0x8) problem because of property_tree::xml_parser::write_xml
From: Mathias Gaunard (mathias.gaunard_at_[hidden])
Date: 2015-03-03 06:58:08


This mailing-list uses bottom- and inline-posting, please lay out your
responses accordingly.

On 03/03/2015 04:11, Rohan Shetty wrote:
> Hi Mathias,
> Thanks for your response.
> I was expecting write_xml (with "utf-8") to do the escaping (e.g. <
> replaced with &lt;) or strip any invalid characters (e.g. anything other
> than #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] |
> [#x10000-#x10FFFF]).
> Is this part of write_xml()?
> Do let me know if this is not clear.
> Regards, Rohan

It is not reasonable to expect that the write_xml function would
silently drop data by default.
If you want invalid data to be removed, you'll have to do this yourself
prior to calling the function.
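
As a rough illustration of doing that clean-up yourself, here is a minimal
sketch. strip_xml_control_chars is a hypothetical helper, not something
Boost provides, and it assumes your strings are already UTF-8. It only
drops the control characters below 0x20 that XML 1.0 forbids (which covers
the 0x8 from the subject line); U+FFFE, U+FFFF and encoded surrogates are
not handled.

```cpp
#include <string>

// Hypothetical helper (not part of Boost): drops the C0 control characters
// that XML 1.0 forbids, i.e. everything below 0x20 except tab, LF and CR.
// Assumes `in` is UTF-8, where bytes below 0x80 never occur inside a
// multi-byte sequence, so a byte-wise filter is safe.
// Not handled here: U+FFFE, U+FFFF and encoded surrogates.
std::string strip_xml_control_chars(const std::string& in)
{
    std::string out;
    out.reserve(in.size());
    for (unsigned char c : in) {
        const bool forbidden =
            c < 0x20 && c != 0x09 && c != 0x0A && c != 0x0D;
        if (!forbidden)
            out.push_back(static_cast<char>(c));
    }
    return out;
}
```

You would apply it to each value before it goes into the tree, for
instance pt.put("key", strip_xml_control_chars(value)), and only then call
write_xml.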

This overload of write_xml does no encoding work at all: it outputs your
data as-is and merely labels it, in the XML declaration, with the encoding
you specified.

It might be more sensible to set up the encoding correctly though, or to
convert your data to the right encoding.
There is another overload of write_xml that can imbue a locale when
writing the data, which can be used for transparent transcoding.
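
For the "convert your data" route, a minimal sketch follows. It assumes,
purely for illustration, that the source data is ISO-8859-1 (Latin-1);
your actual encoding may differ, in which case a proper conversion library
is the better choice. latin1_to_utf8 is a hypothetical helper, not a Boost
function.

```cpp
#include <string>

// Hypothetical helper: converts ISO-8859-1 (Latin-1) bytes to UTF-8.
// Latin-1 is only an assumed source encoding for this example.
std::string latin1_to_utf8(const std::string& in)
{
    std::string out;
    out.reserve(in.size());
    for (unsigned char c : in) {
        if (c < 0x80) {
            // ASCII is identical in both encodings.
            out.push_back(static_cast<char>(c));
        } else {
            // Code points U+0080..U+00FF take two bytes in UTF-8.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

Converting each value this way before storing it in the tree means the
"utf-8" label that write_xml emits actually matches the bytes written.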

