Boost Users :

Subject: Re: [Boost-users] [Boost Serialization] Cannot handle UTF-8 BOM bytes in XML file
From: Robert Ramey (ramey_at_[hidden])
Date: 2011-04-13 01:44:46

The serialization library uses a codecvt facet to generate UTF-8 from wchar_t.

I don't know about the BOM bytes. Sounds like this would require
an enhancement to the xml_warchive and/or text_warchive implementation.
Feel free to submit a suggested patch to the Trac system.

Robert Ramey

Tijmen van Voorthuijsen wrote:
> Hi,
> I am using boost::archive::xml_woarchive to create XML files under
> Windows, Visual Studio 2008, and in wide character mode. The
> boost::archive::xml_woarchive does not write the three UTF-8 BOM
> bytes to the file and, from what I understand, this is all right
> since it is optional and even not recommended.
> Problems start when I want to edit the file in, for example, XML
> Notepad, which adds the three BOM bytes when saving. Under Windows
> this seems to be normal behavior. Parsing the XML file with
> boost::archive::xml_wiarchive then throws an exception.
> My question/recommendation:
> - Why can the boost::archive::xml_serialization library
> not cope with the UTF-8 BOM bytes?
> - I would recommend that the library handle UTF-8 XML
> files both with and without the three BOM bytes. Both are in fact
> valid UTF-8 XML files.
> I now check for the BOM bytes myself before I parse the ifstream in
> boost::xml_serialization and that works fine.
> Many thanks for your answer.
> Tijmen van Voorthuijsen
> _______________________________________________
> Boost-users mailing list
> Boost-users_at_[hidden]
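
The workaround mentioned in the quoted message (checking for the BOM before handing the stream to the archive) could look roughly like the sketch below. This is not code from the thread or from Boost: skip_utf8_bom is a hypothetical helper name, and the sketch assumes a seekable, byte-oriented std::istream (such as a std::ifstream opened in binary mode) whose first bytes have not yet been read.

```cpp
#include <istream>
#include <sstream>  // only needed for the usage shown in the comment below

// Hypothetical helper (not part of Boost): if the stream starts with the
// UTF-8 BOM (bytes EF BB BF), consume it and return true. Otherwise rewind
// to the starting position and return false.
// Assumption: the stream is seekable (file or string stream).
bool skip_utf8_bom(std::istream& in)
{
    const std::istream::pos_type start = in.tellg();

    char buf[3] = {0, 0, 0};
    in.read(buf, 3);

    const bool has_bom =
        in.gcount() == 3 &&
        static_cast<unsigned char>(buf[0]) == 0xEF &&
        static_cast<unsigned char>(buf[1]) == 0xBB &&
        static_cast<unsigned char>(buf[2]) == 0xBF;

    if (!has_bom) {
        in.clear();       // a short stream sets eofbit/failbit; reset them
        in.seekg(start);  // put back whatever was read
    }
    return has_bom;
}

// Usage sketch:
//   std::istringstream in("\xEF\xBB\xBF<root/>");
//   skip_utf8_bom(in);   // the stream is now positioned at "<root/>"
```

For a wide stream of the kind xml_wiarchive reads from, how the three BOM bytes appear depends on the codecvt facet imbued in the stream, so the check would have to be adapted accordingly; the byte-level version above only illustrates the idea.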
