From: Russell Hind (rh_gmane_at_[hidden])
Date: 2004-10-21 03:02:36
Robert Ramey wrote:
> It would seem that you're using the xml archive for purposes other than
> serialization. Of course I don't see any problem with this (until one
> decides to edit it and change its schema). But I am curious what use you've
> found for it. I originally did it only to satisfy boost nit-pickers as I
> felt it was an inefficient way to implement serialization. I've since found
> it useful for debugging archives. It seems to be compatible with xml viewers
> so it's useful for rendering archives in a visible way. So, after all I have
> to concede that the nit-pickers do have a point. I have a sneaking suspicion
> that it will turn up in all kinds of unexpected places and I'm wondering what
> those might be.
I've been using our in-house serialization code for a few
years, which offers similar functionality to yours. Unfortunately ours
was very geared towards quickly dealing with large (>1GB) files that have
100,000s of pointer-based objects stored in them, so it was tied to a specific
The system we are dealing with at the moment only generates smaller files
(20MB or so) so boost::serialization will hopefully support it nicely.
It also gives the advantage of XML/text archives as well as binary.
We have an R&D group who only use python for testing purposes and want
to read in our data files for extra processing and trying out new ideas.
Binary files are by far the most efficient, but describing the
structure of a binary archive to someone who only uses python isn't easy
at all. So XML seems like the way to go as they can visually look at it
and see the information they want to pick out easily.
Our data consists of many settings, 3D model information, user comments
etc., all of which are textual so XML/text supports them well, but three
quarters of the data is vectors of floating-point scan data. Writing these
textually would lead to an over-the-top archive. Complete binary would mean
passing it to python users would be a pain, so XML with the binary data
encoded as text seems like a good solution.
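As a rough sketch of what the python side of this could look like: the
snippet below packs a small vector of floats into binary, stores it as
base64 text inside an XML element, and reads it back. The element names
(scan, comment, points), the little-endian 32-bit float layout and the use
of base64 are all assumptions made for illustration; they are not the
layout that boost::serialization actually writes.

```python
import base64
import struct
import xml.etree.ElementTree as ET


def encode_floats(values):
    """Pack floats as little-endian 32-bit values and return base64 text."""
    raw = struct.pack("<%df" % len(values), *values)
    return base64.b64encode(raw).decode("ascii")


def decode_floats(text):
    """Inverse of encode_floats: base64 text back to a list of floats."""
    raw = base64.b64decode(text)
    return list(struct.unpack("<%df" % (len(raw) // 4), raw))


# Hypothetical document: textual settings stay readable, scan data is encoded.
root = ET.Element("scan")
ET.SubElement(root, "comment").text = "example scan"
ET.SubElement(root, "points").text = encode_floats([0.5, 1.25, -2.0])
xml_text = ET.tostring(root, encoding="unicode")

# Reading side: parse the XML and decode only the element of interest.
parsed = ET.fromstring(xml_text)
points = decode_floats(parsed.find("points").text)
```

The textual parts remain visible in any XML viewer, while the bulk numeric
data costs roughly four characters per three bytes instead of a full
decimal rendering of every value.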
When the files get bigger, we can put them through a zip because the
python lot could still handle un-zipping and then reading xml so that
isn't an issue.
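For what it's worth, un-zipping and parsing can be done with the python
standard library alone. The following is only a sketch: the member name
scan.xml and the tiny XML payload are made up, and the archive is built in
memory so the example is self-contained.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Hypothetical XML payload standing in for a real data file.
xml_bytes = b"<scan><comment>example scan</comment></scan>"

# Writing side: compress the XML into a zip archive held in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("scan.xml", xml_bytes)

# Reading side: open the archive and parse the member without extracting to disk.
buffer.seek(0)
with zipfile.ZipFile(buffer) as archive:
    with archive.open("scan.xml") as member:
        root = ET.parse(member).getroot()

comment = root.find("comment").text
```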
If it wasn't for the need to let our R&D group have access to data in
this way, then I would go for a binary format but I'm hoping that
ultimately zipped XML won't be a lot larger for our files (hoping to
test in the next few days).
The urgency of getting serialization up and running is that I've shied
away from introducing our serialization stuff into the project and
generating files in its format because I was hoping that boost
serialization would be out in time (we ship in December) and we could move
to that as it is a much more flexible system than our in-house one.
> you have a couple of options:
> a) Make your own derivation of xml_(i/o)archive which uses your own version
> of write/read_binary. Advantage - wouldn't touch the current archive
> classes. The manual describes how to do this.
> b) Just fix the current code that does the read/write_binary text data.
> You could roll this in to your own version of 1.32 and be on your way. This
> is implemented as part of the dataflow iterators and I don't think this is
> very difficult except that understanding my dataflow iterator idea
> would take some investment of effort that might not be worthwhile. There is
> already a test for serialization of binary data so even that is done. The
> reason I don't do it now is that it starts a whole chain reaction regarding
> testing on all the platforms that boost supports and it is a very
> inconvenient time to do this. Also no one raised the issue until now.
Fixing the current code would be my ideal solution; I'll just have to
see how much time I get to look into this. If not, for now, I'm sure
the python lot can handle adding the necessary padding characters in.
I take it the archive version will be increased for the next release if
something like this changes so current files will remain compatible?
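Adding the padding on the python side might look something like the
sketch below. It assumes the binary data is written as base64 text with
the trailing '=' characters left off; the helper name decode_unpadded is
made up for illustration.

```python
import base64


def decode_unpadded(text):
    """Decode base64 text whose trailing '=' padding may be missing."""
    # Drop any whitespace or line breaks the archive may have inserted.
    stripped = "".join(text.split())
    # Base64 text must be a multiple of 4 characters; work out the shortfall.
    missing = -len(stripped) % 4
    return base64.b64decode(stripped + "=" * missing)
```

Text that is already padded passes through unchanged, since the shortfall
is then zero. A length that leaves a remainder of one after dividing by
four is never valid base64, and b64decode will still raise an error for it.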