Boost :

From: John Maddock (john_at_[hidden])
Date: 2007-01-30 11:02:17

John Maddock wrote:
>> John Maddock wrote:
>>> Which leads on to a quick comparison I ran against the "known good"
>>> data here:
>>> The test program is attached, and outputs the relative error in the
>>> statistics calculated.
>> Oops, forgot the attachment, here it is.

After some more investigation:

1) I fell into the N vs N-1 standard deviation trap myself; a corrected test
file is attached, hopefully right this time! The output is now:

PI data:
Error in mean is: 0
Error in SD is: 3.09757e-016

Lottery data:
Error in mean is: 6.57202e-016
Error in SD is: 0

Accumulator 2 data:
Error in mean is: 9.25186e-015
Error in SD is: 0.000499625

Accumulator 3 data:
Error in mean is: 5.82076e-016
Error in SD is: 0.071196

Accumulator 4 data:
Error in mean is: 9.87202e-015
Error in SD is: -1.#IND

2) I re-ran the last calculation using NTL::RR at 1000-bit precision, and the
final test case now gives a sensible answer rather than a NaN. But...

3) The results for standard deviation (taken as the square root of the
variance) are still off. In the last "torture test" set of data I see:

Test       | Result        | Rel Error
-----------|---------------|----------------
Your code  | 0.1046776164  | 0.04677616448
Naive RMSD | 0.09995003803 | 0.0004996197271
True Value | 0.1           | 0

The "Naive RMSD" just does a very naive "root mean square deviation from the
mean" calculation.

I believe (but haven't checked) that the remaining difference between this
"naive" calculation and the true value results from the inputs having
inexact binary representations: to verify, I would need to lexical_cast
everything from a string representation to an NTL::RR rather than storing as
an intermediate double. I can't be bothered to test this at present, I'm
afraid :-(

It's still rather alarming, though, that the "naive" method appears to be
100 times more accurate than the accumulator.

Hoping I'm doing something wrong, yours,


Boost list run by bdawes at, gregod at, cpdaniel at, john at