Boost Testing :
Subject: Re: [Boost-testing] Comparing different runs of regression tests
From: Marshall Clow (mclow.lists_at_[hidden])
Date: 2013-07-02 14:59:11
On Jul 2, 2013, at 7:14 AM, Ben Pope <benpope81_at_[hidden]> wrote:
> On 27/06/13 03:13, Marshall Clow wrote:
>> Over the last few weeks, I've been collecting the logs of my regression tester (the XML files that get uploaded).
>> They include several slightly different configurations:
>>
>> darwin gcc 4.2.1 + libstdc++ compiling for C++03
>> clang-darwin Apple-released clang + libstdc++ compiling for C++03
>> clang-darwin-tot Current "tip-of-tree" clang + libstdc++ compiling for C++03
>> clang-darwin-11 Apple-released clang + libc++ compiling for C++11
>> clang-darwin-tot11 Current "tip-of-tree" clang + libc++ compiling for C++11
>> clang-darwin-asan Current "tip-of-tree" clang + libstdc++ compiling for C++03 using Address Sanitizer
>> clang-darwin-asan11 Current "tip-of-tree" clang + libc++ compiling for C++11 using Address Sanitizer
>>
>> Having this data, I started to wonder.
>> * What are the differences in the results for two different days?
>> Example: What changed in the test results between Tuesday and Wednesday?
>> * What are the differences in the results between two different configurations?
>> Example: What differences are there between using gcc and clang?
>> Example: What differences are there between C++03 vs. C++11?
>> Example: What differences are there between "released clang" and "tot-clang"?
>> Example: What differences are there when you turn on Address Sanitizer?
>>
>> I've written some python scripts to help answer these questions.
>>
>> Is this kind of information interesting to anyone besides me?
>
> Yes this is interesting.
>
> I'm just about to define BOOST_THREAD_VERSION=4 for my test runners and it would be interesting to see if there is a difference in test results.
Ok. I have two tools. One compares two runs (all toolsets).
Example output:
$ ~/bin/boostLog.py marshall-mac-trunk-0701.xml marshall-mac-trunk-0702.xml
Source: trunk (revision 84896)
Date: 2013-06-24T05:05:35Z
171 tests have empty toolset names
554 tests have empty test names
Test count by toolset (30680 total tests)
171 (19 failed)
6100 (276 failed) clang-darwin-11
6102 (137 failed) clang-darwin-4.2.1
6105 (144 failed) darwin-4.2.1
6100 (417 failed) clang-darwin-asan
6102 (138 failed) clang-darwin-tot
Source: trunk (revision 84896)
Date: 2013-06-24T05:05:35Z
171 tests have empty toolset names
554 tests have empty test names
Test count by toolset (30680 total tests)
171 (19 failed)
6100 (276 failed) clang-darwin-11
6102 (137 failed) clang-darwin-4.2.1
6105 (144 failed) darwin-4.2.1
6100 (417 failed) clang-darwin-asan
6102 (138 failed) clang-darwin-tot
New failing tests (0)
New Passing tests (0)
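For anyone curious how the "New failing" / "New Passing" lists could be computed, here is a minimal sketch. It is not the actual boostLog.py. It assumes each run has already been parsed out of the XML into a dict keyed by (toolset, test name) with True meaning the test passed; the function name diff_runs and the sample data are made up for illustration, and only tests present in both runs are compared.

```python
def diff_runs(old, new):
    """Return (new_failing, new_passing) between two runs.

    Each run is assumed to be a dict mapping (toolset, test_name) to
    True if the test passed and False if it failed. Tests that appear
    in only one of the two runs are ignored.
    """
    common = old.keys() & new.keys()
    new_failing = sorted(k for k in common if old[k] and not new[k])
    new_passing = sorted(k for k in common if not old[k] and new[k])
    return new_failing, new_passing


# Made-up sample data, not taken from the logs above.
monday = {
    ("darwin-4.2.1", "math:test_erf"): True,
    ("darwin-4.2.1", "thread:test_latch"): False,
    ("clang-darwin-tot", "math:test_erf"): True,
}
tuesday = {
    ("darwin-4.2.1", "math:test_erf"): False,
    ("darwin-4.2.1", "thread:test_latch"): True,
    ("clang-darwin-tot", "math:test_erf"): True,
}

new_failing, new_passing = diff_runs(monday, tuesday)
print("New failing tests (%d)" % len(new_failing))
for toolset, name in new_failing:
    print("  %s %s" % (toolset, name))
print("New Passing tests (%d)" % len(new_passing))
for toolset, name in new_passing:
    print("  %s %s" % (toolset, name))
```

Keying on (toolset, test name) keeps a regression in one toolset from being masked by the same test passing in another.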
The other one compares two configs within the same run:
mclow$ ~/bin/b_logcompare.py marshall-mac-trunk-0702.xml darwin-4.2.1 clang-darwin-4.2.1
darwin-4.2.1 has 6105 tests
clang-darwin-4.2.1 has 6102 tests
darwin-4.2.1 has 5990 tests with unique names
clang-darwin-4.2.1 has 5990 tests with unique names
Tests that failed in clang-darwin-4.2.1 but passed in darwin-4.2.1 (55)
build:prebuilt
math:hypot_test
math:log1p_expm1_test
math:powm1_sqrtp1m1_test
math:test_airy
math:test_bessel_i
math:test_bessel_j
math:test_bessel_k
math:test_bessel_y
math:test_beta
math:test_carlson
math:test_cbrt
math:test_digamma
math:test_ellint_1
math:test_ellint_2
math:test_ellint_3
math:test_erf
math:test_expint
math:test_gamma
math:test_hermite
math:test_ibeta_double
math:test_ibeta_float
math:test_ibeta_inv_ab_double
math:test_ibeta_inv_ab_float
math:test_ibeta_inv_ab_long_double
math:test_ibeta_inv_ab_real_concept1
math:test_ibeta_inv_ab_real_concept2
math:test_ibeta_inv_ab_real_concept3
math:test_ibeta_inv_double
math:test_ibeta_inv_float
math:test_ibeta_inv_long_double
math:test_ibeta_inv_real_concept1
math:test_ibeta_inv_real_concept2
math:test_ibeta_inv_real_concept3
math:test_ibeta_inv_real_concept4
math:test_ibeta_long_double
math:test_ibeta_real_concept1
math:test_ibeta_real_concept2
math:test_ibeta_real_concept3
math:test_ibeta_real_concept4
math:test_igamma
math:test_igamma_inv_double
math:test_igamma_inv_float
math:test_igamma_inv_long_double
math:test_igamma_inv_real_concept
math:test_igamma_inva_double
math:test_igamma_inva_float
math:test_igamma_inva_long_double
math:test_igamma_inva_real_concept
math:test_laguerre
math:test_legendre
math:test_spherical_harmonic
math:test_zeta
math:tools_roots_inc_test
spirit/test:karma_repeat1-p3
Tests that failed in darwin-4.2.1 but passed in clang-darwin-4.2.1 (60)
gil:gil_tests
log:core
log:filt_attr
log:filt_has_attr
log:form_attr
log:form_date_time
log:form_format
log:form_if
log:form_message
log:form_named_scope
math:multiprc_concept_check_1
math:multiprc_concept_check_3
math:multiprc_concept_check_4
multiprecision:test_rational_io_cpp_int
numeric/ublas:concepts
optional:optional_test_ref
spirit/classic:scanner_value_type_tests
spirit/classic:scanner_value_type_tests_debug
test:basic_cstring_test
thread:test_latch
tr1:tr1_has_nothrow_assign_test
tr1:tr1_has_nothrow_constr_test
tr1:tr1_has_nothrow_copy_test
tr1:tr1_has_trivial_destr_test
tr1:tr1_has_virtual_destr_test
tr1:tr1_is_class_test
tr1:tr1_is_union_test
type_erasure:test_add
type_erasure:test_add_assign
type_erasure:test_assign
type_erasure:test_callable
type_erasure:test_construct
type_erasure:test_construct_cref
type_erasure:test_construct_ref
type_erasure:test_deduced
type_erasure:test_dereference
type_erasure:test_equal
type_erasure:test_forward_iterator
type_erasure:test_free
type_erasure:test_increment
type_erasure:test_less
type_erasure:test_member
type_erasure:test_negate
type_erasure:test_nested
type_erasure:test_reference
type_erasure:test_relaxed
type_erasure:test_same_type
type_erasure:test_stream
type_erasure:test_subscript
type_erasure:test_tuple
type_traits:has_nothrow_assign_test
type_traits:has_nothrow_constr_test
type_traits:has_nothrow_copy_test
type_traits:has_trivial_destructor_test
type_traits:has_virtual_destructor_test
type_traits:is_class_test
type_traits:is_nothrow_move_assignable_test
type_traits:is_nothrow_move_constructible_test
type_traits:is_stateless_test
type_traits:is_union_test
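The comparison of two configs within one run can be sketched the same way. Again, this is not the actual b_logcompare.py, just an illustration under the same assumption: the run is already a dict keyed by (toolset, test name) with True meaning passed. The name compare_toolsets and the sample data are hypothetical. Note that a dict collapses tests with duplicate names, which is roughly what the "tests with unique names" counts above reflect.

```python
def compare_toolsets(run, a, b):
    """Compare two toolsets within a single run.

    `run` is assumed to map (toolset, test_name) to True (passed) or
    False (failed). Returns two sorted lists of test names:
    tests that failed in b but passed in a, and
    tests that failed in a but passed in b.
    Tests missing from either toolset are ignored.
    """
    results_a = {name: ok for (ts, name), ok in run.items() if ts == a}
    results_b = {name: ok for (ts, name), ok in run.items() if ts == b}
    common = results_a.keys() & results_b.keys()
    failed_in_b = sorted(n for n in common if results_a[n] and not results_b[n])
    failed_in_a = sorted(n for n in common if results_b[n] and not results_a[n])
    return failed_in_b, failed_in_a


# Made-up sample data, not taken from the logs above.
run = {
    ("darwin-4.2.1", "math:test_erf"): True,
    ("clang-darwin-4.2.1", "math:test_erf"): False,
    ("darwin-4.2.1", "gil:gil_tests"): False,
    ("clang-darwin-4.2.1", "gil:gil_tests"): True,
    ("darwin-4.2.1", "thread:test_latch"): False,
    ("clang-darwin-4.2.1", "thread:test_latch"): False,
}

a, b = "darwin-4.2.1", "clang-darwin-4.2.1"
failed_in_b, failed_in_a = compare_toolsets(run, a, b)
print("Tests that failed in %s but passed in %s (%d)" % (b, a, len(failed_in_b)))
for name in failed_in_b:
    print(name)
print("Tests that failed in %s but passed in %s (%d)" % (a, b, len(failed_in_a)))
for name in failed_in_a:
    print(name)
```

Tests that fail (or pass) under both toolsets drop out, so the lists show only where the two configurations disagree.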
-- Marshall
Marshall Clow Idio Software <mailto:mclow.lists_at_[hidden]>
A.D. 1517: Martin Luther nails his 95 Theses to the church door and is promptly moderated down to (-1, Flamebait).
-- Yu Suzuki