
Boost Users :

From: Jeff Garland (jeff_at_[hidden])
Date: 2006-09-14 13:57:36


Scott Meyers wrote:
> Jeff Garland wrote:
>> Testing in total isolation is a myth. To be a little softer -- it's really
>> going to depend on the type of class you are testing whether or not it can be
>> rationally tested in isolation. If you haven't lately, you should re-read
>> Lakos's treatment of this subject in Large Scale C++ Software Design. This
>> book is 10 years old, but he breaks down testability in a way I've not seen
>> anyone else do since doing testing became all the rage. Most of the 'test
>> first' stuff I've seen ignores the inherent untestability of some software.
>
> That's been my impression. One of the things I've been trying to figure
> out wrt the whole testing hoopla is how well it translates to large
> projects and how it has to be adjusted when things move beyond toy
> examples. And yes, I probably should go back and reread Lakos.

Well, the testing hoopla 'applies' to the extent that in my experience big
systems that *don't* have significant testing discipline never see the light
of day. That is, they fail under an avalanche of integration and basic
execution problems before ever being fielded. As an aside, I always get a
good laugh out of all the agonizing by various folks over how this and that
testing technique that they've *recently discovered* on a 15-person project
applies to large systems. Big systems have been using these approaches for
years...or they failed. Now, that's not to say that the level of rigor advised
by many of the test-first proponents really happens on big projects either.
Is it economical to spend time writing code to check a 'getter'/'setter'
interface that will just obviously work? The answer is no. In fact, the
testing you can avoid, just like the coding you can avoid, is really a big
part of successful big system development.

From my experience, the best practice for testing depends on what the code is
used for and what else depends on it. If it's a widely used library (say
date-time to pick one :) you want it to be very well unit tested because
thousands of LOC will depend on it. Every time you modify it you have to
retest a large amount of code. It also turns out to be easy to unit test
because it doesn't depend on much. On the other hand, take the case of a user
interface which has no other code that depends on it -- my advice is to skip
most of the unit and automated tests. For one thing, it's very hard to write
useful test code. For another, a human can see in 1 second what a machine can
never see (ugly layout, poor interaction usability, etc). Since testing at
the 'top level' of the architecture depends on basically all the other
software in the system, it tends to change rapidly -- people can quickly adjust
to the fact that the widgets moved around on the screen, but test programs tend
to be fragile to these sorts of changes. And finally, since no other code
depends on this code it isn't worth the time -- you can change it at will.
Bottom line is that not all code is created equal w.r.t. the need or ease of
testing.
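
To make the library case concrete, the kind of low-level check I mean looks
roughly like this -- just a sketch using the Boost.Test minimal facility, not
code taken from the actual date-time regression suite:

#include <boost/test/minimal.hpp>
#include <boost/date_time/gregorian/gregorian.hpp>

int test_main(int, char*[])
{
    using namespace boost::gregorian;
    date d(2002, Feb, 1);
    BOOST_CHECK(d.year()  == 2002);
    BOOST_CHECK(d.month() == Feb);
    BOOST_CHECK(d.day()   == 1);
    BOOST_CHECK(d + days(28) == date(2002, Mar, 1)); //2002 is not a leap year
    return 0;
}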

Of course the landscape isn't static either -- some good things have happened.
One thing that's really changed is that the test first/XP/Agile folks have
managed to convince developers that they actually need to execute their code
before they deliver -- a good thing. This often wasn't common practice 10
years ago. Also, developers have more and more pre-tested code to pull off the
shelf -- better libraries and less low level code to write and test.

Even with all that, I still say testing isn't enough because I know that even
the stuff that's *easy* to test will have gaps. There are literally thousands
of Boost date-time tests (2319 'asserts' to be exact) that run in the
regression every day, but I don't believe for a minute that the library is
bug-free or can't be the source of bugs in other code.

As an example of the latter, initially the date class had no default
constructor and it is built to guarantee that you can't construct an invalid
date. It's also an immutable type, so you can't set parts of a date to make
an invalid one (you can assign, but you have to go thru checks to do that). I
wanted these properties so that I could pass dates around in interfaces and
wouldn't have to 'check' the precondition that a date is valid when I go to
use it. All good, except that dates also allowed 'not_a_date_time',
+infinity, and -infinity as valid values. So if you call date::year() on
something that's set to not_a_date_time the results are undefined.
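
For anyone who hasn't used the library, the special values look roughly like
this -- just a sketch, not lifted from the docs:

#include <boost/date_time/gregorian/gregorian.hpp>
using namespace boost::gregorian;

void construct_examples()
{
    date d1(2006, Sep, 14);    //ordinary date -- construction checks validity
    date d2(not_a_date_time);  //special value: no meaningful year/month/day
    date d3(pos_infin);        //+infinity
    date d4(neg_infin);        //-infinity
    //d1.year() is fine; d2.year() is exactly the undefined case above
}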

Now it's trivial to write some 'incorrect' code and a bunch of tests that
will always work:

void f(const date& d)
{
    int year = d.year(); //oops....fails in some cases
}

should really always be:

void f(const date& d)
{
    if (d.is_special()) {
        //handle not_a_date_time, +infinity, -infinity here
    }
    else {
        int year = d.year(); //safe -- d is a normal date
    }
}

So going back to the default constructor, I eventually added one that
constructs to not_a_date_time after many users requested it, mostly for use
in collections that need a default-constructible type. A very logical choice
for the default, but my worry
all along was that people would make the mistake above. That is, now instead
of being forced to think about putting some sort of correct date value or
using not_a_date_time explicitly:

    date d(not_a_date_time);

they can just say

    date d;

Aside from the obvious loss of readability, I worried that with just these few
lines of code the correctness of a larger program could be undermined by failing
to check the special states.
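
The collection case that drove the request looks something like this -- a
sketch, with the map and function names made up for illustration:

#include <map>
#include <string>
#include <boost/date_time/gregorian/gregorian.hpp>
using namespace boost::gregorian;

std::map<std::string, date> due_dates; //needs a default constructible date

date lookup(const std::string& task)
{
    //operator[] default constructs the date if 'task' isn't there yet,
    //so the caller silently gets not_a_date_time with no check forced on it
    return due_dates[task];
}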

So far, I'm not aware of anyone having an issue with this in a large program,
but I'd be shocked if someone didn't create a bug this way eventually. It's
trivial to write and test code that always uses 'valid dates', ship it, and
everything will work fine. Then one day someone else will unknowingly make a
call using a default-constructed date and 'boom' -- a function that's been
working fine and is fully 'tested' will blow up with unexpected results.
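
In code it's as simple as this -- again just a sketch, reusing f() from the
snippet above; the test passes forever, while the later call is the one that
blows up:

void test_f() //only ever exercises valid dates, so it always passes
{
    f(date(2006, Sep, 14));
    f(date(2006, Jan, 1));
}

void someone_elses_new_code()
{
    date d;  //default constructs to not_a_date_time
    f(d);    //'boom' -- d.year() inside f() is the undefined case
}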

So, is it the right set of design decisions? I don't know, but there's
clearly a tension between correctness, 'ease of use', and overall
applicability. My take on the EventLogger example is that it's the wrong set
of choices. There's very little valid use of the object without the stream.
The stream is a low-level stable library that all programmers should know
anyway. It's wide open to creating runtime errors that are not localized, and
it's a low-level library that I would expect to use all over in a program. So
I'd want the number of error modes to be as small as possible, because I'm
certain they won't be writing code to test all the cases....

Jeff

