Subject: Re: [Boost-testing] Comparing different runs of regression tests
From: Marshall Clow (mclow.lists_at_[hidden])
Date: 2013-07-02 14:59:11


On Jul 2, 2013, at 7:14 AM, Ben Pope <benpope81_at_[hidden]> wrote:

> On 27/06/13 03:13, Marshall Clow wrote:
>> Over the last few weeks, I've been collecting the logs of my regression tester (the XML files that get uploaded).
>> They include several slightly different configurations:
>>
>> darwin gcc 4.2.1 + libstdc++ compiling for C++03
>> clang-darwin Apple-released clang + libstdc++ compiling for C++03
>> clang-darwin-tot Current "tip-of-tree" clang + libstdc++ compiling for C++03
>> clang-darwin-11 Apple-released clang + libc++ compiling for C++11
>> clang-darwin-tot11 Current "tip-of-tree" clang + libc++ compiling for C++11
>> clang-darwin-asan Current "tip-of-tree" clang + libstdc++ compiling for C++03 using Address Sanitizer
>> clang-darwin-asan11 Current "tip-of-tree" clang + libc++ compiling for C++11 using Address Sanitizer
>>
>> Having this data, I started to wonder.
>> * What are the differences in the results for two different days?
>> Example: What changed in the test results between Tuesday and Wednesday?
>> * What are the differences in the results between two different configurations?
>> Example: What differences are there between using gcc and clang?
>> Example: What differences are there between C++03 vs. C++11?
>> Example: What differences are there between "released clang" and "tot-clang"?
>> Example: What differences are there when you turn on Address Sanitizer?
>>
>> I've written some python scripts to help answer these questions.
>>
>> Is this kind of information interesting to anyone besides me?
>
> Yes this is interesting.
>
> I'm just about to define BOOST_THREAD_VERSION=4 for my test runners and it would be interesting to see if there is a difference in test results.

OK. I have two tools. One compares two runs (all toolsets).
Example output:

        $ ~/bin/boostLog.py marshall-mac-trunk-0701.xml marshall-mac-trunk-0702.xml
        Source: trunk (revision 84896)
        Date: 2013-06-24T05:05:35Z
          171 tests have empty toolset names
          554 tests have empty test names
        Test count by toolset (30680 total tests)
           171 (19 failed)
          6100 (276 failed) clang-darwin-11
          6102 (137 failed) clang-darwin-4.2.1
          6105 (144 failed) darwin-4.2.1
          6100 (417 failed) clang-darwin-asan
          6102 (138 failed) clang-darwin-tot

        Source: trunk (revision 84896)
        Date: 2013-06-24T05:05:35Z
          171 tests have empty toolset names
          554 tests have empty test names
        Test count by toolset (30680 total tests)
           171 (19 failed)
          6100 (276 failed) clang-darwin-11
          6102 (137 failed) clang-darwin-4.2.1
          6105 (144 failed) darwin-4.2.1
          6100 (417 failed) clang-darwin-asan
          6102 (138 failed) clang-darwin-tot

        New failing tests (0)

        New Passing tests (0)
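
For anyone who wants to experiment with the same idea, here is a minimal sketch of the run-to-run comparison. It is not the actual boostLog.py. It assumes each run has already been loaded into a dict mapping a (toolset, test name) pair to True when the test passed; the function name `diff_runs` and that data layout are hypothetical, and parsing the uploaded XML is left out.

```python
from collections import Counter


def diff_runs(old, new):
    """Compare two runs.

    old, new: dict mapping (toolset, test_name) -> True if the test passed.
    (Hypothetical layout; the real XML logs would need to be parsed into it.)
    Returns (new_failing, new_passing), each a sorted list of keys.
    """
    # Only tests present in both runs can change state.
    common = old.keys() & new.keys()
    new_failing = sorted(k for k in common if old[k] and not new[k])
    new_passing = sorted(k for k in common if not old[k] and new[k])
    return new_failing, new_passing


def failures_by_toolset(run):
    """Return (total, failed) Counters keyed by toolset for one run."""
    total = Counter(toolset for toolset, _ in run)
    failed = Counter(toolset for (toolset, _), ok in run.items() if not ok)
    return total, failed


if __name__ == "__main__":
    monday = {("darwin-4.2.1", "math:test_erf"): True,
              ("darwin-4.2.1", "log:core"): False}
    tuesday = {("darwin-4.2.1", "math:test_erf"): False,
               ("darwin-4.2.1", "log:core"): True}
    failing, passing = diff_runs(monday, tuesday)
    print("New failing tests (%d)" % len(failing))
    print("New Passing tests (%d)" % len(passing))
```

Tests that appear in only one of the two runs are ignored here; reporting them separately would be a reasonable extension.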

The other one compares two configs within the same run:

        mclow$ ~/bin/b_logcompare.py marshall-mac-trunk-0702.xml darwin-4.2.1 clang-darwin-4.2.1
        darwin-4.2.1 has 6105 tests
        clang-darwin-4.2.1 has 6102 tests
        darwin-4.2.1 has 5990 tests with unique names
        clang-darwin-4.2.1 has 5990 tests with unique names

        Tests that failed in clang-darwin-4.2.1 but passed in darwin-4.2.1 (55)
          build:prebuilt
          math:hypot_test
          math:log1p_expm1_test
          math:powm1_sqrtp1m1_test
          math:test_airy
          math:test_bessel_i
          math:test_bessel_j
          math:test_bessel_k
          math:test_bessel_y
          math:test_beta
          math:test_carlson
          math:test_cbrt
          math:test_digamma
          math:test_ellint_1
          math:test_ellint_2
          math:test_ellint_3
          math:test_erf
          math:test_expint
          math:test_gamma
          math:test_hermite
          math:test_ibeta_double
          math:test_ibeta_float
          math:test_ibeta_inv_ab_double
          math:test_ibeta_inv_ab_float
          math:test_ibeta_inv_ab_long_double
          math:test_ibeta_inv_ab_real_concept1
          math:test_ibeta_inv_ab_real_concept2
          math:test_ibeta_inv_ab_real_concept3
          math:test_ibeta_inv_double
          math:test_ibeta_inv_float
          math:test_ibeta_inv_long_double
          math:test_ibeta_inv_real_concept1
          math:test_ibeta_inv_real_concept2
          math:test_ibeta_inv_real_concept3
          math:test_ibeta_inv_real_concept4
          math:test_ibeta_long_double
          math:test_ibeta_real_concept1
          math:test_ibeta_real_concept2
          math:test_ibeta_real_concept3
          math:test_ibeta_real_concept4
          math:test_igamma
          math:test_igamma_inv_double
          math:test_igamma_inv_float
          math:test_igamma_inv_long_double
          math:test_igamma_inv_real_concept
          math:test_igamma_inva_double
          math:test_igamma_inva_float
          math:test_igamma_inva_long_double
          math:test_igamma_inva_real_concept
          math:test_laguerre
          math:test_legendre
          math:test_spherical_harmonic
          math:test_zeta
          math:tools_roots_inc_test
          spirit/test:karma_repeat1-p3

        Tests that failed in darwin-4.2.1 but passed in clang-darwin-4.2.1 (60)
          gil:gil_tests
          log:core
          log:filt_attr
          log:filt_has_attr
          log:form_attr
          log:form_date_time
          log:form_format
          log:form_if
          log:form_message
          log:form_named_scope
          math:multiprc_concept_check_1
          math:multiprc_concept_check_3
          math:multiprc_concept_check_4
          multiprecision:test_rational_io_cpp_int
          numeric/ublas:concepts
          optional:optional_test_ref
          spirit/classic:scanner_value_type_tests
          spirit/classic:scanner_value_type_tests_debug
          test:basic_cstring_test
          thread:test_latch
          tr1:tr1_has_nothrow_assign_test
          tr1:tr1_has_nothrow_constr_test
          tr1:tr1_has_nothrow_copy_test
          tr1:tr1_has_trivial_destr_test
          tr1:tr1_has_virtual_destr_test
          tr1:tr1_is_class_test
          tr1:tr1_is_union_test
          type_erasure:test_add
          type_erasure:test_add_assign
          type_erasure:test_assign
          type_erasure:test_callable
          type_erasure:test_construct
          type_erasure:test_construct_cref
          type_erasure:test_construct_ref
          type_erasure:test_deduced
          type_erasure:test_dereference
          type_erasure:test_equal
          type_erasure:test_forward_iterator
          type_erasure:test_free
          type_erasure:test_increment
          type_erasure:test_less
          type_erasure:test_member
          type_erasure:test_negate
          type_erasure:test_nested
          type_erasure:test_reference
          type_erasure:test_relaxed
          type_erasure:test_same_type
          type_erasure:test_stream
          type_erasure:test_subscript
          type_erasure:test_tuple
          type_traits:has_nothrow_assign_test
          type_traits:has_nothrow_constr_test
          type_traits:has_nothrow_copy_test
          type_traits:has_trivial_destructor_test
          type_traits:has_virtual_destructor_test
          type_traits:is_class_test
          type_traits:is_nothrow_move_assignable_test
          type_traits:is_nothrow_move_constructible_test
          type_traits:is_stateless_test
          type_traits:is_union_test
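
The config-to-config comparison can be sketched the same way. This is not the actual b_logcompare.py; it assumes the run has been flattened into (toolset, "library:test", passed) tuples, and `compare_toolsets` is a hypothetical name. Duplicate test names within a toolset collapse to one entry, which is one plausible reason the "unique names" counts above are lower than the raw test counts; here the last result seen wins, which is also an assumption.

```python
def compare_toolsets(records, a, b):
    """Compare two toolsets within a single run.

    records: iterable of (toolset, name, passed) tuples (hypothetical layout).
    Returns (failed_in_b_only, failed_in_a_only) as sorted lists of names.
    """
    by_toolset = {a: {}, b: {}}
    for toolset, name, passed in records:
        if toolset in by_toolset:
            # Duplicate names collapse; the last result seen wins (assumption).
            by_toolset[toolset][name] = passed

    res_a, res_b = by_toolset[a], by_toolset[b]
    # Only names reported by both toolsets are comparable.
    common = res_a.keys() & res_b.keys()
    failed_in_b_only = sorted(n for n in common if res_a[n] and not res_b[n])
    failed_in_a_only = sorted(n for n in common if res_b[n] and not res_a[n])
    return failed_in_b_only, failed_in_a_only


if __name__ == "__main__":
    run = [
        ("darwin-4.2.1", "math:test_erf", True),
        ("clang-darwin-4.2.1", "math:test_erf", False),
        ("darwin-4.2.1", "log:core", False),
        ("clang-darwin-4.2.1", "log:core", True),
    ]
    only_b, only_a = compare_toolsets(run, "darwin-4.2.1", "clang-darwin-4.2.1")
    print("Failed in clang-darwin-4.2.1 only:", only_b)
    print("Failed in darwin-4.2.1 only:", only_a)
```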

-- Marshall

Marshall Clow Idio Software <mailto:mclow.lists_at_[hidden]>

A.D. 1517: Martin Luther nails his 95 Theses to the church door and is promptly moderated down to (-1, Flamebait).
        -- Yu Suzuki

