From: troy d. straszheim (troy_at_[hidden])
Date: 2005-10-16 13:47:35
(brought over from fast array serialization thread)
Robert Ramey wrote:
> We will get to that. I'm interested in incorporating your improved testing.
> But I do have one concern. I test with windows platforms including borland
> and msvc. These can be quite different than just testing with gcc and can
> suck up a lot of time. It may not be a big issue here, but it means you'll
> have to be aware not to do anything toooo tricky.
Sure. I had this in mind. The changes involve only reducing
duplicated work. There aren't any tricks there that are platform
specific. Probably things will need some tweaking on platforms I
haven't tested the stuff on, and mileage may vary. BTW, the best I
can get out of it overall is a factor of two speedup (I got a factor
of ~4, but that gain is available in only about half the tests. So
you net about two.)
Of course this kind of reorganizing doesn't address the "real"
underlying MxNxK problem. I think going after that requires a better
understanding of the problem than I have at the moment. For instance,
with a one-line change to the Jamfile you could cut testing time in
half by running dll tests only, if you could establish that any given
dll test succeeds if and only if the corresponding static test
succeeds, which I only guess is the case. Anyhow such tweaking is
very easy to do.
> Since you're interested in this I would suggest making a few new directories
> in your personal boost/libs/serialization tree. I see each of these
> directories having its own Jamfile so we could just invoke runtest from any
> of the test suites just by locating to the desired directory.
> a) old_test - change the current test directory to this
> b) test - the current test with your changes to use the unit_test library.
> You might send me source to one of your changed test to see if I want to
> comment on it before too much effort is invested.
> c) test_compatibility. Included your back compatibility tests
My hope was to avoid fragmenting the testing like this and make the
testing "modes" switchable from the command line. a) - c) can be
accomplished pretty easily in one directory with one Jamfile.
One of the more important goals, it seems to me, is to leverage the
for-all-archives tests (test_array.cpp, test_set.cpp,
test_variant.cpp, etc.) as portability ("portability" as in
cross-platform portability for portable archives, and as in
backwards-compatibility) tests, and to easily reuse these tests for
> d) test_performance - I want to include a few tests to test times for things
> like time to serialize different primitives, opening/closing archives, etc.
> This would be similar to the current setup so I could sort of generate a
> table which shows which combinations of features and archives are
> bottlenecks. It's the hope that this would help detect really dumb
> oversights like recreating an xml character translation table for each xml
> character serialized !
I'd also like to see stress testing. As I mentioned in some previous
thread, we're going to be running terabytes of data through this
stuff, and I'm not going to sleep well until we've done it several
times successfully. This one does sound to me like a job for a
separate testing directory.
Anyhow, those changes. They're not polished up, but this will give an
idea of how things work. Download
http://www.resophonic.com/test.tar.gz, untar it in libs/serialization
(delete test/ first).
First, explanation of the changes w.r.t unit tests and how they make
speedups possible, followed by an explanation of the changes for
portability testing.
-- Look at test_simple_class.cpp. test_main() has been converted to
BOOST_AUTO_UNIT_TEST(unique_identifier), and a couple #includes
have been changed. There is a corresponding change of lib in the
Jamfile. That's it. If you look at test_map.cpp, you'll see that
many of these unit tests can go in the same translation unit.
-- Look at test_for_all_archives.cpp. This is where the testing
speedup is. test_for_all_archives.cpp gets built once per archive
type. This technique can bite you, of course, if your compiler
requires too much memory and goes to swap. My testing shows the
compiler topping out at about 460M for this test, which I would
think is still smaller than some other parts of boost. At any rate
the file could easily be broken into two. One consequence of
#including everything together was a lot of name collisions in
different test_*.cpp files, each of which I chased down and
resolved by changing names. This could probably have been fixed in
some cases more elegantly with namespaces. See classes
unregistered_polymorphic_base, null_ptr_polymorphic_base, SplitA,
SplitB, TestSharedPtrA, etc.
-- Look at the Jamfile, at test-suite "serialization". There you see
the test_for_all_archives.cpp and a test_for_one_archive.cpp. I
have not checked to see how nicely the testing framework displays
failures inside individual unit tests. I've assumed the
granularity is good. If it isn't, the test_for_all_archives.cpp
business can just be tossed out and the unit tests
compiled/linked/run one at a time, as in the current system.
Notice also the use of rule templates to provide the demo tests
with the exec monitor lib, and the unit tests with the unit test
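
To illustrate the namespace remark above: a minimal sketch, not taken from the tarball, of two tests that each define their own class A and still share one translation unit. The namespace names, the run() functions and run_all_tests() are hypothetical and stand in for whatever the real test files contain.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the contents of one test file.
// Its helper class is called A.
namespace test_map_detail {
    struct A { int key; };
    bool run() { A a = {7}; return a.key == 7; }
}

// Hypothetical stand-in for a second test file that also defines an A.
// Because each file's contents sit in their own namespace, the two A's
// do not collide even though both end up in the same translation unit.
namespace test_set_detail {
    struct A { std::string name; };
    bool run() { A a = {"seven"}; return a.name == "seven"; }
}

// Runs both tests and returns the number of failures.
int run_all_tests() {
    int failures = 0;
    if (!test_map_detail::run()) ++failures;
    if (!test_set_detail::run()) ++failures;
    return failures;
}
```

The trade-off is that anything registered by name (class export, for example) still sees the bare class name, so this only helps where the collision is purely a C++ scoping problem.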
Now the changes relating to portability testing:
-- Look at test_simple_class.cpp. A reseed() has been added at the
top of the test. tmpnam(NULL) has been changed to
TESTFILE("unique_identifier"), and remove(const char*) has been
changed to finish(const char *).
-- Now look at the top of the Jamfile. The switch --portability turns
on the #define BOOST_SERIALIZATION_TEST_PORTABILITY which affects
the behavior of TESTFILE() and finish(). This (almost) gets you
the ability to test portability in various ways. (There are a few
more changes required, I'll get to them.)
-- Looking at test_tools.cpp, if BOOST_SERIALIZATION_TEST_PORTABILITY
is defined, finish() is a no-op and
TESTFILE("something") returns a path get_tmpdir()/P/archive-type,
where P is a path that identifies the compiler, platform, and
boost version. TESTFILE("nvp1"), for example, could return
If --portability is not specified, TESTFILE() works like
tmpnam(NULL) and finish(filename) calls std::remove(filename),
which is the "old" functionality.
In this way, if each of your testing runs points to the same $TMP,
each platform/version/compiler's serialized testing data will be
"overlaid" in a directory structure in such a way that you can
easily walk the $TMP hierarchy comparing checksums of files with
the same name.
-- Look at A.hpp. There are now two A's, one portable, one
nonportable. In other places I've made similar changes to other
classes. The portable version contains only portable types and uses
boost random number generators (maybe we want to nix the
nonportable one completely and put serialization of nonportable
types into their own test somewhere.) std::rand() will of course
generate different numbers on one architecture than on others and
we need all platforms to generate archives containing A's with
exactly the same numbers. (I cannot begin to explain what a thrill
it was, as my testing strategy appeared to be on the rocks, to
discover that the problem was already solved right there in
boost::random.) The reseed() that appears at the top of
test_simple_class.cpp reseeds the boost random rngs.
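
The TESTFILE()/finish() switch described above could be sketched roughly as follows. This is not the code in test_tools.cpp. The helpers make_testfile(), finish_file(), platform_path() and archive_type(), the placeholder strings they return, and the choice to append the test name as the last path component are all assumptions made for illustration. A counter stands in for tmpnam(NULL).

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Directory shared by all test runs; falls back to /tmp if TMP is unset.
inline std::string get_tmpdir() {
    const char* t = std::getenv("TMP");
    return t ? std::string(t) : std::string("/tmp");
}

// Hypothetical placeholders for "P" and for the archive type.
inline std::string platform_path() { return "compiler/platform/boost-version"; }
inline std::string archive_type()  { return "text_archive"; }

// Portability mode: a stable, predictable path, so runs on different
// platforms write files with the same relative name.
// Otherwise: a fresh name on every call (stand-in for tmpnam(NULL)).
inline std::string make_testfile(const std::string& name, bool portability) {
    if (portability)
        return get_tmpdir() + "/" + platform_path() + "/" +
               archive_type() + "/" + name;
    static int counter = 0;
    return get_tmpdir() + "/" + name + "." + std::to_string(counter++) + ".tmp";
}

// Portability mode keeps the file for later comparison;
// otherwise the file is removed, which is the "old" behaviour.
inline void finish_file(const std::string& filename, bool portability) {
    if (!portability) std::remove(filename.c_str());
}

#ifdef BOOST_SERIALIZATION_TEST_PORTABILITY
const bool portability_mode = true;
#else
const bool portability_mode = false;
#endif

#define TESTFILE(name) make_testfile((name), portability_mode)
inline void finish(const std::string& filename) {
    finish_file(filename, portability_mode);
}
```

A test would then call TESTFILE("nvp1") where it used to call tmpnam(NULL), and finish(filename) where it used to call remove(filename).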
So those changes get you switchable portability testing. You just
need a utility that walks the hierarchy at $TMP and compares files.
I've been using a Perl script; you could pretty easily code one
up with boost::crc and boost::filesystem. There's a
filesystem-walking routine hanging around in test_tools.cpp.
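
A rough sketch of such a comparison utility, written against C++17 std::filesystem and a hand-rolled CRC-32 rather than boost::filesystem and boost::crc. It assumes a simplified layout in which each immediate subdirectory of the root holds one platform's output (in the scheme above P may span several directory levels). The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <map>
#include <set>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Plain bitwise CRC-32 (reflected polynomial 0xEDB88320).
std::uint32_t crc32_of(const std::string& bytes) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (unsigned char c : bytes) {
        crc ^= c;
        for (int i = 0; i < 8; ++i)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

// Reads a whole file into a string, in binary mode.
std::string read_file(const fs::path& p) {
    std::ifstream in(p, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Treats each immediate subdirectory of root as one platform's output and
// returns the relative file names whose checksums differ between platforms.
std::vector<std::string> find_mismatches(const fs::path& root) {
    std::map<std::string, std::set<std::uint32_t>> sums;
    for (const auto& platform : fs::directory_iterator(root)) {
        if (!platform.is_directory()) continue;
        for (const auto& entry :
             fs::recursive_directory_iterator(platform.path())) {
            if (!entry.is_regular_file()) continue;
            const std::string rel =
                fs::relative(entry.path(), platform.path()).generic_string();
            sums[rel].insert(crc32_of(read_file(entry.path())));
        }
    }
    std::vector<std::string> mismatched;
    for (const auto& kv : sums)
        if (kv.second.size() > 1) mismatched.push_back(kv.first);
    return mismatched;
}
```

Calling find_mismatches() on the shared $TMP directory after all platforms have run would list the archives that are not byte-identical. Note that a file present on only one platform is not reported by this sketch.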
Some minor stuff that I stumbled across and had to resolve in the
process, and the open issues that come to mind:
-- For platform portability testing, one also has to be careful about
containers on some platforms making more temporary copies of A than
on others. You can't just call e.g. mymap.insert(A()); multiple
times, as you don't know how many times A::A() will get called
inside that call to insert(). That gets you serialized maps, for
instance, where only the first-inserted entry matches. Instead,
create as many A's as you're going to insert into your container,
and then insert them one at a time. Took a while to track down, but
they're all fixed.
-- Jamfile is revamped per Rene's suggestions using rule templates.
I'm sure there are a couple of toolset requirements that I've
managed to drop, but fixing that should just mean putting them back
in a few places. Overall I think it's more flexible/maintainable, but of
course it isn't finished.
-- test_class_info_save and test_class_info_load always write their
data to one of these platform/version/compiler directories suitable
for portability testing. Need to do a little housekeeping, or
maybe the whole $TMP/platform/version/compiler stuff is OK for
general use, your call.
-- These changes are all against boost release 1.33.0. Dunno if
things are broken w.r.t the trunk.
-- the test_tools.hpp and test_tools.cpp stuff is messy at the moment.
This should probably just be broken out into a separate lib.
-- I'm not 100% clear on my use of rule templates in the Jamfile.
Somebody might want to take a look at this. Specifically, it isn't
clear to me between which of the three colons <define>WHATEVER
should go, and where toolset::required-something-or-other should
go. I can verify that things work OK for gcc, but I don't have a
Windows machine here to test with.
-- The DEPENDS $(saving-tests) : $(loading-tests) business is still
there. I don't recall if this was deprecated or not.
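
Going back to the item above about temporary copies in containers, the create-first-then-insert pattern together with reseed() might look like this. It is a sketch only: std::mt19937 stands in for the boost::random generator, and the struct A, the seed value and the helper names other than reseed() are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <random>
#include <utility>
#include <vector>

// Stand-in for the boost::random generator used by the portable A.
// The output sequence of std::mt19937 for a given seed is fixed by the
// C++ standard, unlike std::rand().
inline std::mt19937& test_rng() {
    static std::mt19937 rng(42u);
    return rng;
}

// Puts the generator back into a known state at the top of a test.
inline void reseed() { test_rng().seed(42u); }

// Hypothetical, much reduced version of the portable A:
// its default constructor draws one number from the shared generator.
struct A {
    std::uint32_t x;
    A() : x(test_rng()()) {}
    bool operator==(const A& other) const { return x == other.x; }
};

// Creates all the A's up front, so A::A() runs exactly once per element
// and in a known order, whatever the container does internally later.
inline std::vector<A> make_values(std::size_t n) {
    reseed();
    std::vector<A> values;
    values.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        values.emplace_back();
    return values;
}

// Inserts the already-constructed A's one at a time; any copies the map
// makes here no longer consume random numbers.
inline std::map<int, A> make_test_map(std::size_t n) {
    const std::vector<A> values = make_values(n);
    std::map<int, A> m;
    for (std::size_t i = 0; i < n; ++i)
        m.insert(std::make_pair(static_cast<int>(i), values[i]));
    return m;
}
```

Two calls to make_test_map() with the same n then yield equal maps, which is the property the serialized archives need across platforms.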
Well let me know what you think.