On Wed, May 21, 2008 at 8:49 AM, Doug Gregor <doug.gregor@gmail.com> wrote:
On Wed, May 21, 2008 at 7:49 AM, Beman Dawes <bdawes@acm.org> wrote:
> How does the work on CMake that Troy and Doug are doing relate to the work
> on Bitten that Dave is doing?

Like BBv2, CMake has its own way to do regression testing as part of
the build. With CMake, it's through a separate (but bundled) program
called CTest that is configured by CMake. CTest is responsible for
running tests locally and producing XML results, which it can upload
to a server through XML-RPC.
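
For illustration, here is a minimal sketch of how a CMakeLists.txt describes
tests to CTest; the project, target, and file names are all hypothetical:

  # Minimal sketch -- hypothetical names throughout.
  cmake_minimum_required(VERSION 2.6)
  project(boost_example CXX)

  enable_testing()                          # tells CMake to emit CTest input

  add_executable(test_shared_ptr test_shared_ptr.cpp)
  add_test(test_shared_ptr test_shared_ptr) # classic (name, command) form

Running ctest in the build tree executes the tests; in dashboard mode (e.g.
ctest -D Experimental) it also records the results as XML under Testing/ and
submits them.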

FWIW, one of my problems with the current system is that the volume of XML results chews up too much bandwidth (DSL, 3.0 Mb/s down / 0.7 Mb/s up) for me to contribute as much testing as I'd like. Moving the servers to OSL will help, since I won't have to run the release server here, and Rene won't have to run the trunk server on his machine.
 
CTest is meant to work with Dart, Dart2,
and CDash; part of what Dave and Troy are doing is to make CTest work
with Bitten as well, because Bitten will integrate with Trac and
handle our regression reporting.  Dave will also be working to give
Bitten a more Boost-friendly user interface, categorizing sets of
tests by library and so on: this is the major feature that nearly
every regression-reporting system is missing.
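
As a point of reference, CTest finds its submission target in a
CTestConfig.cmake at the top of the source tree; the values below are
hypothetical placeholders (a Bitten back end would presumably substitute its
own drop site):

  set(CTEST_PROJECT_NAME "Boost")
  set(CTEST_NIGHTLY_START_TIME "00:00:00 UTC")
  set(CTEST_DROP_METHOD "http")
  set(CTEST_DROP_SITE "dashboard.example.org")          # hypothetical server
  set(CTEST_DROP_LOCATION "/submit.php?project=Boost")  # hypothetical path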

Sounds good.

Dave has also been talking about the need for overall measures of what he calls the health of Boost. To me, that should include QA testing, such as the Boost inspect program, as part of the regular testing and reporting system. So I'd like to see a library marked as failing if it is getting inspect failures. That would be a big incentive to expand the inspect program to cover more areas, such as documentation probes.
 
> Who will be responsible for the scripts that testers rely on? Rene?

I'm guessing Troy or Dave. Troy wrote the original tester scripts for
the CMake-based build system.
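
For context, the usual shape of such a script is a CTest "dashboard script"
run as ctest -S <script>; everything below is an illustrative sketch with
hypothetical paths and names, not the actual Boost script:

  # boost_nightly.cmake -- hypothetical; run via: ctest -S boost_nightly.cmake
  set(CTEST_SITE "mymachine.example.org")
  set(CTEST_BUILD_NAME "linux-gcc")
  set(CTEST_SOURCE_DIRECTORY "$ENV{HOME}/boost/src")
  set(CTEST_BINARY_DIRECTORY "$ENV{HOME}/boost/build")
  set(CTEST_CMAKE_GENERATOR "Unix Makefiles")
  set(CTEST_UPDATE_COMMAND "svn")   # assumes the sources are an svn checkout

  ctest_start(Nightly)     # tag the run and prepare the Testing/ directory
  ctest_update()           # pull the latest sources
  ctest_configure()        # run CMake
  ctest_build()            # compile the libraries and tests
  ctest_test()             # run the tests, recording results as XML
  ctest_submit()           # upload the XML to the dashboard server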

My concerns about scripts are ease of use and robustness. Before Rene started working on the scripts last fall, it took several (and sometimes many more) emails back and forth to get a new tester up and running. Even then, testing often stalled for long periods, and more emails went back and forth, because something broke. Changes to the testing setup or configuration meant still more emails and more delays.

As Rene added ease-of-use and robustness features like better command help, self-bootstrapping, and defaults chosen with reliability in mind, the number of emails and testing stalls dropped dramatically. The Linux tests I run used to require my intervention perhaps every couple of days; now they run for many months without intervention. Other testers have had similar success.

Rene has also been able to respond quickly and keep an eye on the testing process.

I want to be sure we don't regress in these areas.

> Is the plan to start small with a few libraries and testers, and then
> gradually roll-out to all libraries and all testers?

All libraries need to build and test so that we can verify that the
build system is working properly.  Plus, we would like to be able to
build binary installers with CMake sooner, and not wait for all of the
regression-testing work to be completed.

Yes. Building daily release snapshots has been very helpful, and the logical conclusion would be to both build and test at least one binary installer as part of each snapshot.

Does CPack also build 7zip archives? They are becoming more popular, and Boost has been providing them for several releases.
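
For reference, CPack chooses its output formats from the CPACK_GENERATOR
list, so a snapshot build could produce an installer and archives in one
pass. A minimal sketch (the values are illustrative; whether a 7Z generator
is available depends on the CPack release in use):

  # Appended to the top-level CMakeLists.txt -- illustrative values only.
  set(CPACK_PACKAGE_NAME "boost")
  set(CPACK_PACKAGE_VERSION "1.35.0")   # hypothetical version string
  set(CPACK_GENERATOR "TGZ;ZIP;NSIS")   # NSIS builds a Windows installer
  include(CPack)                        # adds the 'package' build target

Running cpack (or building the 'package' target) in the build tree then
produces each requested format; a single format can be forced with, e.g.,
cpack -G ZIP.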

We'll start with a few testers, and widen the testing base when we
feel that the testing system is relatively robust. We don't want to
take resources away from the current testing system too early.

Makes sense.

I'd like to start testing with the new system (does it have a name?) as soon as a script becomes available.
 
> At what point do we
> turn off the current build/test/report system?

I don't know. I think it will have to coincide with a switch to CMake,
unless someone goes back and makes bjam/BBv2 support Bitten.

OK. Before we switch, I'd like to be sure that a significant number of testers find the test-runner side acceptable, and that a significant number of developers find the reporting side acceptable. It also wouldn't hurt to make sure user reporting is meeting user needs.
 
> Are any metrics being collected? I'm interested in time and bandwidth
> utilization for both test clients and the central server, since these are
> limiting bottlenecks with the current mechanism.

CTest collects some statistics for compile-time (in aggregate) and
run-time of individual tests. It's in the XML, and we could certainly
report that back to Bitten.

What I'm interested in (as a release manager concerned with having enough testers for good coverage) is a few overall measures of the demands we are placing on testers and servers. For example, how much time (wall clock? CPU?) is a tester using per day? How many bytes received? Transmitted? Similarly for the servers. No need to beat these to death; three or four overall indications of resources consumed would do, or useful surrogates if those particular numbers aren't available.
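
As one possible way to collect such numbers, a small wrapper could time the
run and total the size of the XML that CTest produces, which is a rough lower
bound on bytes transmitted. This is only a sketch: it assumes a recent CMake
(for string(TIMESTAMP) and file(SIZE)), and the paths are hypothetical:

  # metrics.cmake -- hypothetical wrapper; run via: cmake -P metrics.cmake
  string(TIMESTAMP start "%s")
  execute_process(COMMAND ctest -S boost_nightly.cmake RESULT_VARIABLE rv)
  string(TIMESTAMP finish "%s")
  math(EXPR wall "${finish} - ${start}")

  # Sum the generated result XML as a proxy for upload volume.
  file(GLOB_RECURSE xml "$ENV{HOME}/boost/build/Testing/*.xml")
  set(bytes 0)
  foreach(f IN LISTS xml)
    file(SIZE "${f}" size)
    math(EXPR bytes "${bytes} + ${size}")
  endforeach()
  message(STATUS "wall clock: ${wall}s, result XML: ${bytes} bytes")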

> Who will host the central server or test results repository?

Since Bitten integrates with Trac, we'll need to host this at OSL
where the Trac lives.

Excellent! OSL has been very reliable, and a key element in the success of Boost!
 

> Which is the best mailing list for keeping the entire set of
> build/test/report needs coordinated?

CMake-related build/test issues should go to this list. Once the
testing system is being used by others, boost-testing will still be
the coordination point for regression testers.

Thanks for the responses. Very helpful.

--Beman