
Subject: Re: [Boost-testing] Testing direction (was: Request for funding - Test Assets)
From: Rene Rivera (grafikrobot_at_[hidden])
Date: 2015-12-13 11:47:54


On Sun, Dec 13, 2015 at 8:59 AM, Tom Kent <lists_at_[hidden]> wrote:

> On Sat, Dec 12, 2015 at 8:35 PM, Rene Rivera <grafikrobot_at_[hidden]>
> wrote:
>
>> Not going to comment on the aspect of purchasing a machine. But will
>> point out that the real benefit to having dedicated machines is that of
>> having non-traditional setups (OS+toolset). I.e. dedicated machines give
>> you coverage.
>>
>> On Sat, Dec 12, 2015 at 8:08 PM, 'Tom Kent' via Boost Steering Committee
>> <boost-steering_at_[hidden]> wrote:
>>
>>>
>>> I also think that, like Niall said, we should move towards CI style
>>> testing where every commit is tested, but that is going to be a *huge*
>>> transition.
>>>
>>
>> I wouldn't say huge.. Maybe "big".
>>
>>
>>> I would love to see direction on this in general from the steering
>>> committee, and am encouraged that almost all new libraries already have
>>> this.
>>>
>>
>> I can't speak for the committee. But as testing manager I can say moving
>> Boost to CI is certainly something I work on a fair amount.
>>
>> Retrofitting it onto all the existing libraries will be an undertaking.
>>>
>>
>> Working on that. Getting closer and closer.
>>
>
> I didn't realize this was being actively pursued. How many of the
> existing libraries have been set up for this? Is there a broader strategy
> for getting the individual maintainers to take these changes? Any simple
> tasks I could help with in my (very limited) spare time?
>

The only library I have set up so far is my own (Predef, but that's an easy
one). There are a lot of small changes needed to make this work. You can look at
the current functionality for this CI testing here <
https://github.com/boostorg/regression/tree/develop/ci/src> (plus the
.travis.yml and appveyor.yml in Predef).

One change in particular I did as a PR against Boost.Build, since it was a
functionally "radical" change (see <https://github.com/boostorg/build/pull/83>).
But I will likely move on without that change anyway. My plan was to start on
the "Robert" version of isolated testing (checking out a library at a
particular commit, but checking out the monolithic Boost at a release commit).
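
To make that scheme concrete, here's a rough sketch (in Python, purely
illustrative; the library name, commit, and release tag are made up, and the
real scripts are the ones in the regression repo linked above) of what the
checkout sequence amounts to:

    # Illustrative sketch only -- not the actual ci/src scripts. Assumes the
    # git superproject layout (boostorg/boost with submodules); the library,
    # commit, and release tag below are hypothetical examples.
    import subprocess

    def run(*cmd, cwd=None):
        # Run a git command and fail loudly so the CI job reports the error.
        subprocess.check_call(list(cmd), cwd=cwd)

    def checkout_isolated(library, library_commit, boost_release_ref):
        # 1. Clone the monolithic Boost superproject at a known release ref.
        run("git", "clone", "--branch", boost_release_ref,
            "https://github.com/boostorg/boost.git", "boost-root")
        run("git", "submodule", "update", "--init", cwd="boost-root")
        # 2. Re-point only the library under test at the commit being tested.
        lib_dir = "boost-root/libs/" + library
        run("git", "fetch", "origin", cwd=lib_dir)
        run("git", "checkout", library_commit, cwd=lib_dir)

    if __name__ == "__main__":
        # e.g. test a Predef commit against a released Boost tree.
        checkout_isolated("predef", "abc1234", "boost-1.59.0")

The point being that everything except the one library stays pinned to the
release, so a failure can be attributed to the library commit under test.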

My next step was to move on to testing another, more complex library with the
CI script (extending the script as needed).

As for broader strategy.. At some point, when I have reasonably complete CI
support (Travis and AppVeyor plus a complex library), I'll just start making
the changes to all the libraries myself, since asking authors to do this work
will likely not work. I.e. I'll take my usual "just do it" approach :-) As for
resources.. My goal is to move the testing of the common toolsets/platforms
entirely to cloud-based services, freeing our dedicated testers to concentrate
on the less common and bleeding-edge toolsets (such as Android, IBM, Intel,
and BSD configurations).

> I would suggest that as an interim step, we update our existing regression
>>> facility so that the runners just specify what their configuration is
>>> (msvc-12.0, gcc-4.9-cpp14, clang-3.3-arm64-linux, etc) and we have a
>>> centralized server that gives them a commit to test (and possibly a
>>> specific test to run).
>>>
>>
>> Not sure what you mean by that.
>>
>>
>>> They would also send their results back to this server (via http post,
>>> no more ftp!) in a format (json) that can be immediately displayed on the
>>> web without interim processing.
>>>
>>
>> It's not actually possible to eliminate the processing. Although it's
>> possible to reduce it to a much shorter time span than what it is now. That
>> processing is what adds structure and statistics that we see now in
>> results. Without it there's considerably less utility in the results. And I
>> can say that because..
>>
>
> Here's the idea I've been pondering for a while...curious what you (and
> others) think of it....
>
> Currently when a user starts the regression tests with run.py, they specify
> the branch that they want to run (master or develop) and then get the
> latest commit from that branch. I would like to remove this from the
> user's control. When they call run.py, they just pass in their
> configuration and their id string and run.py goes out to a server to see
> what needs to be run. By default this server could just alternate between
> giving back the latest master/develop (or maybe only run master 1 in 3
> times). That would give us uniform coverage of master and develop branches.
>

Interesting. I'll have to think about that some.
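
If I follow you, the dispatch service would be something along these lines (a
throwaway sketch in Python; the endpoint name, the JSON fields, and the 1-in-3
ratio are made up, not anything that exists today):

    # Hypothetical dispatch service: runners ask it what to test next instead
    # of picking a branch themselves.
    import itertools
    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    # Roughly one master run for every two develop runs.
    _branch_cycle = itertools.cycle(["develop", "develop", "master"])

    class DispatchHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # A runner would call e.g. GET /next-run?config=msvc-12.0&id=some-runner
            # and get back the branch (and eventually a specific commit) to test.
            branch = next(_branch_cycle)
            body = json.dumps({"branch": branch, "commit": "HEAD"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("", 8080), DispatchHandler).serve_forever()

A real version would obviously need to know about specific commits, release
candidates, and per-runner configuration.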

This would also enable us to have a bit more control around release time.
> Once an RC is created, we could give each runner that commit to test
> (even as master's latest continues to change), then we could get tests of what
> is proposed for the release (something that is a bit lacking right now,
> although the fact that we freeze the master branch gets close to this).
> After a release, we could save the snapshot of tests and archive that so
> that future users of that release could have something documenting its
> state.
>
> As far as processing the output, what I was envisioning was moving a lot
> more of it to each test runner and the rest to the client side with some
> javascript. To re-create the summary page, each runner could upload a json
> file with all the data for their column: pass/fail, percent failed,
> metadata. Then we could run a very lightweight php (or other) script on the
> server that keeps track of which json files are available (i.e. all of the
> ones uploaded, except those not white-listed on master) and whenever a user
> opens that page, their browser is given that list of json files which the
> browser then downloads, renders and displays. There would be a similar
> pattern for each of the libraries' individual result summaries, which could
> link to separately uploaded results for each failure.
>

That's not far from what I plan to do, and have partly working. Except for
the aspect of doing as much on the client side as you describe. I attempted to
do that early on in my work and found that it just didn't work. First, there
isn't enough computation that can be done at testing time to relieve the
server/client side, as much of the computation cuts across the results of
various testers. Second, it conflicted with one of my goals of making the
tester side simpler in order to increase the number of testers (the complexity
of the current setup being a common complaint).

Right now what I have is:

1. Testers upload results as they happen (each test does a post to the Google
cloud).
2. When a test run is done, the data is aggregated (again in the Google cloud)
to generate the collective stats & structure (it's this part that I'm
optimizing at the moment).
3. When a person browses the results, the web client downloads json describing
that page of results and renders a table with client-side C++ (Emscripten
currently).

Note that I try to generate only the minimum stats information on the server,
to reduce that processing time, and shift as much as possible to the web
client.
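
For a sense of how small the tester side stays, the per-test upload is not much
more than this (illustrative only; the endpoint URL and the exact JSON fields
here are placeholders, not the real Google cloud API):

    # Sketch of the "upload results as they happen" step on the tester side.
    # Endpoint and field names are placeholders.
    import json
    import urllib.request

    def post_test_result(runner_id, library, test_name, status, log_text):
        payload = {
            "runner": runner_id,     # e.g. "some-runner-01"
            "library": library,      # e.g. "predef"
            "test": test_name,
            "status": status,        # "pass" or "fail"
            "log": log_text,
        }
        req = urllib.request.Request(
            "https://example-test-results.appspot.com/api/result",
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(req) as resp:
            return resp.status  # 200 on success

    # Each completed test triggers one such post; the cross-tester statistics
    # are then the only thing the server-side aggregation has to compute.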

> I'm not an expert on what the report runner actually does, but I think
> that is the majority of it, right?
>
>
>>
>>> Even this kind of intermediate step would be a lot of development for
>>> someone....and I don't have time to volunteer for it.
>>>
>>
>> I've been working on such a testing reporting system for more than 5
>> years now. This past year I've been working on it daily. Mind you, only
>> being able to devote a very limited amount of time each day. Recently I've
>> been working on processing performance, trying to get processing of 100
>> test results to happen in under 3 seconds (running on the Google cloud
>> infrastructure).
>>
>
> Thanks for all the amazing work you've done with the testing
> infrastructure, you definitely don't get enough recognition for it!
>
> Tom
>
>
> _______________________________________________
> Boost-Testing mailing list
> Boost-Testing_at_[hidden]
> http://lists.boost.org/mailman/listinfo.cgi/boost-testing
>

-- 
-- Rene Rivera
-- Grafik - Don't Assume Anything
-- Robot Dreams - http://robot-dreams.net
-- rrivera/acm.org (msn) - grafikrobot/aim,yahoo,skype,efnet,gmail

