From: Robert Ramey (ramey_at_[hidden])
Date: 2005-09-20 00:26:57
Basically you're correct on all of this.
Rene Rivera wrote:
> After having run two cycles of tests for Boost with the
> mingw-3_4_2-stlport-5_0 configuration, and having it take more than 14
> hours on a 2.2GHz+1GB machine, most of that in the Boost.Serialization
> library[*], and after reading some of the recent discussion about the desire
> to expand testing to include cross-version compatibility and
> cross-compiler compatibility, and hence having the number of tests
> multiply possibly exponentially, I am seriously concerned that we are
> going in the wrong direction when it comes to structuring tests.
This was the basis of my suggestion that we run a complete set only very
infrequently.
> From looking at the tests for serialization I think we are
> over-testing, and we are past the point of exhausting testing
> resources. Currently this library takes the approach of carpet
> bombing the testing space. The current tests follow this overall structure:
> [feature tests] x [archive types] x [char/wchar] x [DLL/not-DLL]
> Obviously this will never scale.
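To make the scaling concrete, here is a rough sketch of how the cross-product
grows. The axis sizes are assumptions for illustration, not the real suite's
counts:

```python
from itertools import product

# Hypothetical axis sizes, for illustration only; the real
# Boost.Serialization suite had on the order of dozens of feature tests.
feature_tests = [f"test_{i}" for i in range(50)]   # assumed count
archive_types = ["text", "binary", "xml"]
char_widths   = ["char", "wchar"]
linkage       = ["static", "dll"]

# Carpet bombing: every feature test against every value of every axis.
full_matrix = list(product(feature_tests, archive_types, char_widths, linkage))
print(len(full_matrix))  # 50 * 3 * 2 * 2 = 600 test runs
```

Every new axis multiplies the total, which is exactly the "possibly
exponential" growth Rene is worried about.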
Carpet bombing the test space? - I like the imagery. When I started, this
was not a problem. I was happy to beat it to death, as I could (and still
do) just run the whole suite on my machine overnight whenever I make a
change. However, I agree that we're about at the limit without making some
changes.
> My first observation is that it doesn't seem that those axis look like
> independent features to me. That is, for example, the char/wchar
> functionality doesn't depend on the feature getting tested, or at
> least it shouldn't. And I can't imagine the library is structured
> internally in that way. To me it doesn't make sense to test "array"
> saving with each of the 3 archive types since the code for
> serialization of the "array" is the same in all situations. Hence it
> would make more sense to me to structure the tests as:
> [feature test] x [xml archive type] x [char] x [not-DLL]
> [text archive tests] x [char] x [non-DLL]
> [binary archive tests] x [non-DLL]
> [wchar tests] x [non-DLL]
> [DLL tests]
> Basically it's structured to test specific aspects of the library not
> to test each aspect against each other aspect. Some benefits as I see it:
This makes a lot of sense - except that in the past, some features that
should be independent have turned out to be accidentally connected. Also,
sometimes compiler quirks show up only in particular combinations.
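For comparison, the reduced structure Rene proposes grows additively rather
than multiplicatively. A sketch with assumed counts (50 feature tests, 10
tests per dedicated group - both hypothetical numbers for illustration):

```python
# Assumed counts, for illustration only.
n_features     = 50   # feature tests, run against one baseline config
n_text_tests   = 10   # text-archive-specific tests
n_binary_tests = 10   # binary-archive-specific tests
n_wchar_tests  = 10   # wchar-specific tests
n_dll_tests    = 10   # DLL-vs-static-specific tests

# [feature test] x [xml archive] x [char] x [not-DLL] -> n_features runs,
# plus one dedicated group per remaining axis.
reduced_total = (n_features + n_text_tests + n_binary_tests
                 + n_wchar_tests + n_dll_tests)
print(reduced_total)  # 90 runs, versus 600 for the full cross-product
```

Adding a new axis then costs one new group of tests instead of doubling or
tripling the whole suite.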
> * Reduced number of tests means faster turnaround on testing.
> * It's much easier to add tests for other aspects as one only has to
> concentrate on a few tests instead of many likely unrelated aspects.
> * The tests can be expanded to test the aspects more critically. For
> example the DLL tests can be very specific as to what aspect of DLL vs
> non-DLL they test.
Note the DLL version should function identically to the static library
version - so this is an exhaustive test of that fact.
> * It is easier to tell what parts of the library are breaking when the
> tests are specific.
Hmm - that sort of presumes we know what's going to fail ahead of time.
There is another related issue. It seems that the tests are run every
night - even though no changes have been made at all to the serialization
library. In effect, we're using the serialization library to test other
changes in Boost. The argument you make above can just as well be used to
argue that serialization is on a different dimension than other libraries, so
serialization tests shouldn't be re-run just because some other library
has changed.
So there are a number of things that might be looked into:
a) Reduce the combinations of the serialization tests.
b) Don't use libraries to test other libraries. That is, don't re-test one
library (e.g. serialization) just because some other library that it
depends upon (e.g. mpl) has changed.
c) Define two separate test Jamfiles:
   i) normal test mode
   ii) carpet-bombing mode
d) Maybe the normal mode could be altered on a frequent basis when I just
want to test a new feature, or just one test.
e) Include, as part of the installation instructions for users, an exhaustive
test mode. That is, a user who downloads and installs the package would have
the option of producing the whole set of test results on his own platform and
sending in his results. This would have a couple of advantages:
   i) It would ensure that all new platforms get tested
   ii) It would ensure that the user has everything installed correctly
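Point b) amounts to dependency-aware test selection: decide what to re-run
from what actually changed. A minimal sketch contrasting the two policies,
using a hypothetical dependency table (the library names and dependencies
here are illustrative, not Boost's real dependency graph):

```python
# Hypothetical dependency table: library -> libraries it depends on.
DEPENDS = {
    "serialization": {"mpl", "config"},
    "spirit":        {"mpl"},
    "filesystem":    {"config"},
}

def retest_current(changed):
    """Current nightly policy: a change anywhere in a library's
    dependencies triggers a full re-test of that library."""
    return {lib for lib, deps in DEPENDS.items()
            if lib in changed or deps & changed}

def retest_proposed(changed):
    """Proposal b): re-test only libraries whose own code changed."""
    return {lib for lib in DEPENDS if lib in changed}

changed = {"mpl"}  # only mpl changed tonight
print(sorted(retest_current(changed)))   # ['serialization', 'spirit']
print(sorted(retest_proposed(changed)))  # []
```

Under the current policy, an mpl change burns hours re-running the
serialization suite; under the proposed one, mpl is expected to carry its
own tests.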
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk