From: Martin Wille (mw8329_at_[hidden])
Date: 2007-08-03 03:12:29
Vladimir Prus wrote:
> Robert Ramey wrote:
>> It's A LOT LESS work for the developer. Under the current (old) system
>> every time a test failed I would have to investigate whether it was due
>> to a new error in my library or some change/error in something
>> that the library depended upon. It consumed waaaaay too much time. I gave
>> up committing changes except on a very infrequent basis. Turns out
>> that the failures still occurred but I knew they weren't mine so I could
>> ignore them. Bottom line - testing was a huge waste of time providing
>> no value to a library developer.
> A situation possible under the proposed system is:
> - You develop things on your branch. When your
> feature is ready, you merge from trunk. Suddenly
> half of the tests in your library fail. The merge brought
> changes in about 100 different files, and you have
> to figure out what's up.
> With the current system, you'd get a failure whenever the problematic
> change is checked in. So, you'll know that some commit between 1000 and
> 1010 broke your library and it's easy to find out the offending commit
> from that.
> In other words, in the current system, if some other library breaks yours,
> you find out about that immediately, and can take action.
"can" is the key word in the last sentence. Often enough, that didn't
happen. Some change to some library broke some other library; the author
of the change didn't notice, and the maintainer of the victim library
didn't feel responsible. The regression then persisted until release
preparation. Or some change caused regressions in the library being
changed, and the author assumed the regression was an artifact of the
test harness or was caused by a change to a different library. By the
time release preparation starts, we have accumulated dozens or even
hundreds of problems. IMO, this is exactly the reason for the "wild west"
impression we have of the way we used CVS.
This is certainly a matter of testing resources. If tests could be run
frequently and reliably enough, then we could automatically blame the
author of the offending check-in.
We don't have the resources. I don't expect we'll have them soon, unless
there's a massive donation.
Unlike many others, I don't believe we have a fundamental problem with
our testing harness. A lot of tweaks are needed, definitely, but the
overall system looks OK to me. We do detect regressions and other test
failures. We just don't handle them in a timely manner.
To cope with the lack of resources, a more organized way of checking in
changes to the tested branch is needed. If only one maintainer at a time
checks stuff in, then that would compensate for our slow
testing. We'd be able to see which library's changes caused the
regressions. Admittedly, we wouldn't be able to blame an individual code
change in case of bundled updates, but we would at least know which
library to blame and who would be responsible for looking into the problems
(with the help of the maintainers of the victim libraries, I suppose).
That's more than we have now.
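To illustrate the attribution idea, here is a minimal Python sketch. It assumes each test run can be tagged with the one library whose maintainer held the commit slot since the previous run. The function name, the data layout, and the library and test names are all hypothetical; nothing like this exists in our test harness today.

```python
def blame_new_failures(runs, baseline=frozenset()):
    """Attribute newly failing tests to the library that held the commit slot.

    runs: chronological list of (library, failing_tests) pairs, where
          `library` is the only library committed to since the previous
          run (the one-committer-at-a-time assumption) and
          `failing_tests` is the set of tests failing in that run.
    baseline: tests already failing before the first run.
    """
    blamed = {}
    previous = set(baseline)
    for library, failing in runs:
        # Only failures that were absent from the previous run are new.
        new = set(failing) - previous
        if new:
            blamed.setdefault(library, []).extend(sorted(new))
        previous = set(failing)
    return blamed


# Hypothetical data: three test runs, one committing library per run.
runs = [
    ("parser", {"archive/xml_test"}),
    ("core", {"archive/xml_test", "cli/parse_test", "core/ref_test"}),
    ("cli", {"cli/parse_test"}),
]
print(blame_new_failures(runs))
```

With bundled updates this only narrows a regression down to a library, not to an individual change, which matches the limitation admitted above.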
I believe such a change to our procedures (from having no procedures
to having only one committer at a time) would be
uncomfortable and would slow things down for maintainers queuing for a
chance to commit their stuff, but, overall, we would get shorter
development and release times. Releasing 1.34 took more than a year.
1.33 took similarly long. A more organized way of working would have
reduced that time to a month per release or even less. That's almost two
years we could have spent on development instead of on finding out which
change caused which regression and how to fix it.
If we identify leaf (of the dependency tree) libraries, which shouldn't
be hard to do, then changes to multiple leaf libraries can be done in
parallel. This reduces the time spent waiting in the commit queue.
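As a rough sketch of how the leaf libraries could be identified, assuming we had a map from each library to the libraries it depends on, and taking "leaf" to mean a library that no other library depends on: the map below uses made-up library names and is not actual Boost dependency data.

```python
def leaf_libraries(depends_on):
    """Return the libraries that no other library depends on.

    depends_on: dict mapping a library name to the list of libraries
                it depends on directly.
    """
    all_libs = set(depends_on)
    for deps in depends_on.values():
        all_libs.update(deps)
    # Anything that appears as a dependency of some library is not a leaf.
    depended_upon = {dep for deps in depends_on.values() for dep in deps}
    return sorted(all_libs - depended_upon)


# Hypothetical dependency map, for illustration only.
depends_on = {
    "archive": ["parser", "core"],
    "cli": ["core"],
    "parser": ["core"],
    "core": [],
}
print(leaf_libraries(depends_on))  # ['archive', 'cli']
```

A change to "archive" or "cli" in this example can't break any other library, so their maintainers wouldn't need to wait for each other in the commit queue.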
A harder problem is adding a new toolset. In that case, hundreds of
test failures may pop up and nobody really feels responsible for looking
into them, effectively leaving that work to the release manager, unless
he decides the toolset isn't relevant for the release (in
which case the testing effort is wasted).
We need a way to organize the addition of toolsets. The test runner alone
can't be made responsible for fixing all the problems that get reported.
Neither should the release manager be responsible for driving the
process at release preparation time.