From: John Maddock (john_at_[hidden])
Date: 2007-01-30 11:02:17
John Maddock wrote:
>> John Maddock wrote:
>>> Which leads on to a quick comparison I ran against the "known good"
>>> data here: http://www.itl.nist.gov/div898/strd/univ/homepage.html
>>> The test program is attached, and outputs the relative error in the
>>> statistics calculated.
>>
>> Oops, forgot the attachment, here it is.
After some more investigation:
1) I fell into the N vs N-1 standard deviation trap myself; corrected test
file attached, hopefully right this time! The output is now:
PI data:
Error in mean is: 0
Error in SD is: 3.09757e-016
Lottery data:
Error in mean is: 6.57202e-016
Error in SD is: 0
Accumulator 2 data:
Error in mean is: 9.25186e-015
Error in SD is: 0.000499625
Accumulator 3 data:
Error in mean is: 5.82076e-016
Error in SD is: 0.071196
Accumulator 4 data:
Error in mean is: 9.87202e-015
Error in SD is: 1.#IND
2) I reran the last calculation using NTL::RR at 1000-bit precision; the
final test case now gives a sensible answer rather than a NaN. But...
3) The results for standard deviation (taken as the square root of the
variance) are still off. In the last "torture test" set of data, from
http://www.itl.nist.gov/div898/strd/univ/data/NumAcc4.dat, I see:
Test          Result          Rel Error
Your code:    0.1046776164    0.04677616448
Naive RMSD:   0.09995003803   0.0004996197271
True Value:   0.1             0
The "Naive RMSD" just does a very naive "root mean square deviation from the
mean" calculation.
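For anyone following along, here is a minimal sketch of what I mean by a naive two-pass "root mean square deviation from the mean", with the N vs N-1 divisor made explicit, plus the relative error measure used in the tables. The names rmsd and relative_error are hypothetical, chosen for illustration; this is not the attached test program, nor the accumulator's algorithm.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Naive two-pass root mean square deviation from the mean.
// sample == true divides by N-1 (sample SD); false divides by N.
// Hypothetical helper for illustration only; assumes at least 2 values.
double rmsd(const std::vector<double>& data, bool sample)
{
    const std::size_t n = data.size();

    // First pass: plain arithmetic mean.
    double sum = 0.0;
    for (double x : data)
        sum += x;
    const double mean = sum / static_cast<double>(n);

    // Second pass: sum of squared deviations from that mean.
    double sq = 0.0;
    for (double x : data)
        sq += (x - mean) * (x - mean);

    const double divisor = static_cast<double>(sample ? n - 1 : n);
    return std::sqrt(sq / divisor);
}

// Relative error of a computed value against a known good one.
// Assumes expected is non-zero.
double relative_error(double computed, double expected)
{
    return std::fabs((computed - expected) / expected);
}
```

With the made-up data {2, 4, 4, 4, 5, 5, 7, 9}, the mean is 5 and the squared deviations sum to 32, so rmsd gives sqrt(32/8) = 2 with the N divisor and sqrt(32/7) with the N-1 divisor. The two differ by a factor of sqrt(N/(N-1)), which is the trap mentioned in point 1.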
I believe (but haven't checked) that the remaining difference between this
"naive" calculation and the true value results from the inputs having
inexact binary representations. To verify this I would need to lexical_cast
everything from a string representation to an NTL::RR, rather than storing
it as an intermediate double. Can't be bothered to test this at present,
I'm afraid :(
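Short of the full NTL::RR check, a rough way to gauge how much an input loses when stored as a double is to subtract off an exactly representable base and look at what remains. The sketch below is only an illustration under assumptions: stored_fraction is a hypothetical helper, and the value 10000000000.1 is made up for the example rather than taken from the test program.

```cpp
#include <cmath>

// Returns what is left of `value` after subtracting `base`.
// Assumption: base is exactly representable as a double and close enough
// to value that the subtraction itself is exact, so any difference from
// the intended decimal fraction comes from rounding when value was stored.
// Hypothetical helper for illustration only.
double stored_fraction(double value, double base)
{
    return value - base;
}
```

For example, stored_fraction(10000000000.1, 10000000000.0) does not give back 0.1: doubles near 1e10 are spaced 2^-19 apart, so the stored fraction is off by a few parts in 10^7.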
It's still rather alarming, though, that the "naive" method appears to be
roughly 100 times more accurate than the accumulator.
Hoping I'm doing something wrong yours,
John.
