Boost Users :

Subject: Re: [Boost-users] How does boost.serialization do with BOM in text/xml files
From: Robert Ramey (ramey_at_[hidden])
Date: 2008-09-05 02:46:03


This is news to me.

The wide character text/xml archives use UTF-8. They do this
by creating a stream with the utf8_codecvt_facet. I used
this facet, it worked great and I moved on. So you're way
ahead of me on this.

This would probably be easy to address in the xml_iarchive code
or perhaps the xml_grammar - but, as I said, I don't know
anything about it.

Robert Ramey

Tan, Tom (Shanghai) wrote:
>> what is BOM?
>
>> Probably "Byte Order Mark", see
> http://en.wikipedia.org/wiki/Byte-order_mark
>
> Yes, that's what I meant.
>
> I was testing the demo_xml_load.cpp and demo_xml_save.cpp available
> in the boost.serialization example.
> By simply opening demo_save.xml produced by demo_xml_save.exe with XML
> Copy Editor (http://xml-copy-editor.sourceforge.net/) and saving it
> back, demo_xml_load.exe would crash. I compared the two files with
> WinMerge. It said they were identical.
>
> By studying the hex view, I later found it's because the 3-byte UTF-8
> BOM (EF BB BF) was inserted at the beginning of the file. It does not
> change the data, and in many cases is ignored by text editors.
>
> I think that Boost.Serialization should also handle this for all
> text files, including XML.
>
> Tom

