From: Robert Ramey (ramey_at_[hidden])
Date: 2005-03-01 11:45:43


troy d. straszheim wrote:
> Hey all, Robert specifically --
>
> I've got fascinated with all the uses for this serialize() method, and
> I'm kicking around ideas for a couple archive types. The use case is
> an "Event" that occurs in a neutrino detector. This event is very
> big, it contains containers of smart pointers to containers of maps
> of smart pointers to... you get the idea. 15k lines of just
> containers of data.

I love hearing this - I always wanted to be associated with particle
physics. LOL

> First notion is a pretty_oarchive. Currently I have people
> implementing operator<<() which calls a member function virtual
> ToStream() (so reference-to-base works) for debugging/hacking
> purposes, and if everything has a serialize method anyway, it would
> be a big savings to forget the ToStream() stuff and say
>
> pretty_oarchive(cout) << my_class;
>
> where pretty_oarchive is something like xml_oarchive, but with the
> formatting somehow factored out and modified. I admit that other than
> the header and start/end tags I haven't looked at this too closely
> because I assumed it had been discussed, thought I'd check for
> showstoppers first.

Of course you know that is straightforward - and you have the xml archives
that can be used as examples. If you just want to display the information
and not load it, a bunch of tags like object id, etc. can be suppressed.
Personally I would just use the xml_archive and concentrate my efforts on a
program that displays XML in a convenient and perhaps customizable way. I
suspect you could find or make a suitable program of that nature for free or
for low cost. To reiterate, I would factor the "pretty display" out of the
serialization and make it customizable according to the kind of display
required.
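For readers who still want the display-only archive discussed in the quoted
text, here is a minimal sketch of the idea. It is a toy and not a Boost
archive: pretty_oarchive, named, field and Hit are all hypothetical names
invented for this illustration, and only the "ar & member" calling shape
inside serialize() is borrowed from the library. No object ids or tracking
information are written, since the output is never loaded back.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical name/value wrapper so the archive can label each member.
template <class T>
struct named {
    const char* name;
    const T& value;
};

template <class T>
named<T> field(const char* name, const T& value) {
    return named<T>{name, value};
}

// Toy display-only archive (an assumption for illustration, not Boost code).
class pretty_oarchive {
public:
    explicit pretty_oarchive(std::ostream& os) : os_(os), depth_(0) {}

    // Called from serialize(): print the member name, then its value.
    template <class T>
    pretty_oarchive& operator&(const named<T>& f) {
        indent();
        os_ << f.name;
        save(f.value);
        return *this;
    }

    // Entry point: hand the object to its own serialize() member.
    template <class T>
    pretty_oarchive& operator<<(const T& t) {
        // serialize() is a non-const member, so constness is cast away here.
        const_cast<T&>(t).serialize(*this, 0u);
        return *this;
    }

private:
    void indent() { os_ << std::string(2 * depth_, ' '); }

    // Numbers print on the same line as their name.
    template <class T>
    typename std::enable_if<std::is_arithmetic<T>::value>::type
    save(const T& v) { os_ << " = " << v << '\n'; }

    void save(const std::string& v) { os_ << " = \"" << v << "\"\n"; }

    // Containers print their size, then one indexed element per line.
    template <class T>
    void save(const std::vector<T>& v) {
        os_ << " [" << v.size() << "]\n";
        ++depth_;
        for (std::size_t i = 0; i < v.size(); ++i) {
            indent();
            os_ << '[' << i << ']';
            save(v[i]);
        }
        --depth_;
    }

    std::ostream& os_;
    std::size_t depth_;
};

// Hypothetical detector class with the usual serialize() member.
struct Hit {
    int channel;
    double charge;
    std::vector<double> times;

    template <class Archive>
    void serialize(Archive& ar, const unsigned int /*version*/) {
        ar & field("channel", channel) & field("charge", charge)
           & field("times", times);
    }
};

// Renders one Hit to a string; pass std::cout instead to print directly.
std::string render_demo() {
    Hit h = {7, 1.5, {0.25, 0.5}};
    std::ostringstream out;
    pretty_oarchive ar(out);
    ar << h;
    return out.str();
}
```

Because the formatting lives entirely in the archive, the same serialize()
member serves both this display and a real archive, which is the factoring
suggested above.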

In fact, if I had nothing else to do, and had that much interest, I would
make an enhanced version of xml_archive that would output TWO files: a) the
xml_archive and b) an xml_schema which could be used by other programs to
parse the xml_archive. Just random thoughts.

> random_iarchive (written already):

> The other issue is that with such a large event, you'd want to be able
> to verify that it makes the round trip to/from the archive intact.

> Obviously boost::serialization has thorough test suites, but with
> people constantly tinkering around in these classes, I was hoping to
> get a test suite going that does exactly what we're doing. You'd
> like to be able to just
>
> Event written_e, read_e;
> binary_oarchive oa;
> binary_iarchive ia;
> random_iarchive ria;
> ria >> written_e;
> oa << written_e;
> // close, open as input
> ia >> read_e;
>
> assert(written_e == read_e);

> where random_iarchive sets fundamental types to random values, expands
> containers to some random length (random() % MAX_RANDOM_LENGTH), and
> creates a T when it sees shared_ptr<T>. I've written this already.
> Of course you have to write your operator==()'s so that they have
> value semantics, that is they compare what is on the ends of their
> component shared_ptrs, not whether these pointers point to the same
> object or not. The original idea was to set the serialize() method
> and the operator==() against one another for verification... but if
> people forget to modify both, there's no way to tell from the test
> suites, it seems. It kind of started as an experiment and now that
> it is written
> it looks like there's no way to keep users from shooting themselves in
> the foot like this.
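The random-fill idea and the value-semantic operator==() described in the
quoted text can be sketched roughly as follows. This is not troy's actual
random_iarchive, which is not shown in the thread; it is a small stand-in
written under several assumptions: std::shared_ptr and std::mt19937 are used
in place of whatever the original used, and Track, Event and the seed
parameter are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <random>
#include <type_traits>
#include <vector>

const std::size_t MAX_RANDOM_LENGTH = 8;

// Toy "input archive" that invents data instead of reading it (hypothetical).
class random_iarchive {
public:
    explicit random_iarchive(unsigned seed) : rng_(seed) {}

    template <class T>
    random_iarchive& operator&(T& t) { load(t); return *this; }

    template <class T>
    random_iarchive& operator>>(T& t) { load(t); return *this; }

private:
    // Fundamental types get random values.
    template <class T>
    typename std::enable_if<std::is_integral<T>::value>::type
    load(T& v) { v = static_cast<T>(rng_() % 1000); }

    template <class T>
    typename std::enable_if<std::is_floating_point<T>::value>::type
    load(T& v) { v = std::uniform_real_distribution<T>(T(0), T(1))(rng_); }

    // Containers are expanded to a random length, then filled element-wise.
    template <class T>
    void load(std::vector<T>& v) {
        v.resize(rng_() % MAX_RANDOM_LENGTH);
        for (std::size_t i = 0; i < v.size(); ++i) load(v[i]);
    }

    // A shared_ptr<T> gets a freshly created T, which is then filled.
    template <class T>
    void load(std::shared_ptr<T>& p) {
        p = std::make_shared<T>();
        load(*p);
    }

    // Any other class is assumed to provide a serialize() member.
    template <class T>
    typename std::enable_if<std::is_class<T>::value>::type
    load(T& v) { v.serialize(*this, 0u); }

    std::mt19937 rng_;
};

struct Track {
    int id;
    double energy;
    template <class Archive>
    void serialize(Archive& ar, const unsigned int) { ar & id & energy; }
};

struct Event {
    std::vector<std::shared_ptr<Track> > tracks;
    template <class Archive>
    void serialize(Archive& ar, const unsigned int) { ar & tracks; }
};

bool operator==(const Track& a, const Track& b) {
    return a.id == b.id && a.energy == b.energy;
}

// Value semantics: compare what the pointers point to, not the pointers.
bool operator==(const Event& a, const Event& b) {
    if (a.tracks.size() != b.tracks.size()) return false;
    for (std::size_t i = 0; i < a.tracks.size(); ++i) {
        const std::shared_ptr<Track>& p = a.tracks[i];
        const std::shared_ptr<Track>& q = b.tracks[i];
        if (!p || !q) {
            if (p != q) return false;  // exactly one of them is null
            continue;                  // both null counts as equal
        }
        if (!(*p == *q)) return false;
    }
    return true;
}
```

Two archives built with the same seed fill two Events with equal values held
in distinct objects, so the comparison only succeeds if operator==() really
looks through the pointers. Note that a member left out of both serialize()
and operator==() would still go unnoticed, which is the weakness raised in
the quoted text.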

I'm not sure I'm convinced of this.

I recommend the following when you make a new archive

a) run the code module for the new archive through Gimpel Lint and fix up the
obvious oversights.
b) make a file similar to text_archive.hpp in the test directory for your
new archive - new_archive.hpp
c) modify the Jamfile in the serialization test directory to include your
new archive
d) invoke the batch/script file run_archive_test <compiler>
<new_archive.hpp>

This will run all the serialization tests against your new archive. It
takes a while - but it's worth it.

I recommend the following when you make a new serializable class.

a) run the code module for the new serializable class through Gimpel Lint
and fix up the obvious oversights.
b) using the other tests as a basis, make a new test for your new
serializable class.
c) in the course of this you may have to make additions to your new class
such as operator== or you might not. Perhaps a global
operator==(const T &lhs, const T &rhs) might be added just to the test.
d) add test for your new class to the Jamfile in serialization/test
e) invoke batch/shell script runtest <compiler> to generate a table of all
tests including your new one. These tests will run your new class against
all currently defined archives. This is important as some archives are not
sensitive to some errors. For example, tagged XML can recover from some
errors whereas the more efficient native binary cannot.
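As an illustration of steps b) and c), here is a rough sketch of a round-trip
test in which operator==() is defined by the test rather than by the class.
To keep it self-contained it uses two toy archives over a std::stringstream
(toy_oarchive and toy_iarchive, both hypothetical) instead of the library's
archives; the real tests in serialization/test run each class against every
archive type, which this sketch does not attempt. Hit is a hypothetical class.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>
#include <vector>

// Toy text output archive: whitespace-separated integers (hypothetical).
class toy_oarchive {
public:
    explicit toy_oarchive(std::ostream& os) : os_(os) {}

    template <class T>
    typename std::enable_if<std::is_integral<T>::value, toy_oarchive&>::type
    operator&(T& v) { os_ << v << ' '; return *this; }

    // A container is written as its size followed by its elements.
    template <class T>
    toy_oarchive& operator&(std::vector<T>& v) {
        os_ << v.size() << ' ';
        for (std::size_t i = 0; i < v.size(); ++i) *this & v[i];
        return *this;
    }

    template <class T>
    toy_oarchive& operator<<(const T& t) {
        const_cast<T&>(t).serialize(*this, 0u);
        return *this;
    }

private:
    std::ostream& os_;
};

// Matching toy input archive: reads back exactly what toy_oarchive wrote.
class toy_iarchive {
public:
    explicit toy_iarchive(std::istream& is) : is_(is) {}

    template <class T>
    typename std::enable_if<std::is_integral<T>::value, toy_iarchive&>::type
    operator&(T& v) { is_ >> v; return *this; }

    template <class T>
    toy_iarchive& operator&(std::vector<T>& v) {
        std::size_t n = 0;
        is_ >> n;
        v.resize(n);
        for (std::size_t i = 0; i < v.size(); ++i) *this & v[i];
        return *this;
    }

    template <class T>
    toy_iarchive& operator>>(T& t) {
        t.serialize(*this, 0u);
        return *this;
    }

private:
    std::istream& is_;
};

// The class under test: one serialize() member serves both directions.
struct Hit {
    int channel;
    std::vector<int> samples;

    template <class Archive>
    void serialize(Archive& ar, const unsigned int /*version*/) {
        ar & channel & samples;
    }
};

// Defined only in the test, as suggested in step c).
bool operator==(const Hit& lhs, const Hit& rhs) {
    return lhs.channel == rhs.channel && lhs.samples == rhs.samples;
}

// Save, reload into a fresh object, and compare by value.
bool round_trip(const Hit& written) {
    std::stringstream ss;
    toy_oarchive oa(ss);
    oa << written;

    Hit read = Hit();
    toy_iarchive ia(ss);
    ia >> read;
    return written == read;
}
```

A test body then only needs to build a Hit, call round_trip() and check the
result.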

Even if you use only one particular compiler for the application you
ship, I would recommend building and running all tests on at least two
reasonably good, different compilers. For example, gcc 3.4? and VC 7.1 is a
good combination. This will often uncover subtle ambiguities that would
otherwise linger on for years inflicting programmer pain.

I have to say the single most important thing I've learned from boost is
that it's cheaper to maintain the test suite and build for several compilers
than it is to debug the application. bjam (which DOES drive me crazy) is a
godsend for doing this kind of thing.

> Another problem is that if a class contains vector<shared_ptr<Base> >,
> you'd like to be able to populate this with shared_ptr<Derived>,
> where Derived is randomly selected from the set of classes that
> inherit from Base. Since serialization requires these classes to be
> registered, it seemed to me there might be a way to do this. But
> maybe it's all overkill.

If you don't find the above sufficient, then it's not overkill. As I said,
the pain of writing the test is nothing compared to shipping a product with
a bug.
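One way the random Base/Derived selection could be approached is a small
factory registry kept by the test code itself. The sketch below assumes
exactly that: derived_registry, add() and make_random() are hypothetical
names, and nothing here hooks into the serialization library's own class
registration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <random>
#include <string>
#include <vector>

struct Base {
    virtual ~Base() {}
    virtual std::string name() const = 0;
};

struct DerivedA : Base {
    std::string name() const override { return "DerivedA"; }
};

struct DerivedB : Base {
    std::string name() const override { return "DerivedB"; }
};

// Hypothetical test-side registry of factories for classes derived from Base.
class derived_registry {
public:
    // Remember how to create a D; D must derive from Base.
    template <class D>
    void add() {
        factories_.push_back(
            []() -> std::shared_ptr<Base> { return std::make_shared<D>(); });
    }

    // Create an instance of a randomly chosen registered class.
    // Returns a null pointer if nothing has been registered.
    template <class Rng>
    std::shared_ptr<Base> make_random(Rng& rng) const {
        if (factories_.empty()) return std::shared_ptr<Base>();
        std::uniform_int_distribution<std::size_t> pick(
            0, factories_.size() - 1);
        return factories_[pick(rng)]();
    }

private:
    std::vector<std::function<std::shared_ptr<Base>()> > factories_;
};

// Fill a vector<shared_ptr<Base> > with n randomly chosen derived objects.
template <class Rng>
void fill_random(std::vector<std::shared_ptr<Base> >& v, std::size_t n,
                 const derived_registry& reg, Rng& rng) {
    v.clear();
    for (std::size_t i = 0; i < n; ++i) v.push_back(reg.make_random(rng));
}
```

The drawback is that each derived class has to be listed twice, once for the
archive and once for this registry, so the two lists can drift apart.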

> Anyhow, this random_iarchive exists (except for the Base/Derived
> thing, above), maybe it would make a good tutorial case for custom
> serialization archives, maybe people want to use it for something.
> I'd be more than glad to write up some tutorial material, I'm sure I'd
> get a lot out of it.

As I said, I'm not convinced that the random test data should be part of the
archive class. But I'm certainly pleased that someone finds the
serialization sufficiently useful and interesting to do stuff like this.
So if you want to polish this up and add it to the Files section on
SourceForge I think it would be great.

FYI, I'm trying to cut back on the time spent on boost in general, and
the serialization library in particular. However, I have to confess I'm
sort of a boost addict. (Is there a support group for this?) I've already
added and tested the following in the main CVS.

a) serialization library as a DLL
b) serialization of variant - hmmm - I think that's yours

I have no idea when boost 1.33 will come out - it's not up to me.

In the meantime I'm working on a couple of things at a leisurely pace.

a) more formal documentation.
b) test for serialization of classes implemented in DLLs. This is supported
in the current code but hasn't been tested - so it likely doesn't work.
c) demo for serialization of classes implemented in DLLs. This will likely
require an enhancement to extended_type_info in order to include a class
factory functionality similar to COM and CORBA. At this point just a small
enhancement will be required.
d) documentation for archive adaptor, including demo and test
e) memoization_archive - an archive adaptor which does a deep copy using the
serialize templates. This also requires some extra help from
extended_type_info.
f) demo for using serialization as a debug and/or database transaction
rollback/roll forward logger. This requires a small enhancement in the
basic archive classes to permit suppression of object tracking for an
archive.

So anyone who wants to take on one of these things would be welcome to
contact me. Note that the serialization library supports compilers going
back to borland 5.51, MSVC 6.? and gcc 2.95, and I want to maintain
that. So if anyone wants a piece of this, you'll have to sign up for that
too.

Robert Ramey

