Subject: Re: [boost] accurate sum accumulator (kahan)
From: Topher Cooper (topher_at_[hidden])
Date: 2010-07-26 15:14:04
First off, note that in the test case part of the inaccuracy is due to the fact that 1F-6 cannot be represented exactly in floating point. Obviously, however you do it, if you add 1000000 numbers that are not precisely equal to 1/1000000 together then the correct answer will not be 1.0.
As for naming -- I'm always uncomfortable using a keyword like "accurate" -- since there are many ways to improve accuracy, with Kahan's algorithm being only one.
For example: for some situations presorting the numbers in ascending order can improve accuracy.
A two pass algorithm, when possible, will probably give you the same result as Kahan, with, ignoring paging, similar costs. First sum the numbers. Divide the result by the number of values. Sum the differences between each value and that mean value. Subtract the result from the original sum. Kahan requires 4 add/sub ops per value, the two-pass 3.
Another technique: recursively sum the lower half and the upper half and add the results. Using an explicit stack and doing this iteratively would probably improve on Kahan in both accuracy and time. This should not be combined with an ascending sort (or if you do, use interleaved splits rather than lower/upper). Ignoring the bookkeeping, this requires a single addition per value.
I'm not suggesting that any of these be added, but if they were -- what would they be called?
On Jul 26, 2010, at 11:58 AM, Paul A. Bristow wrote:
>> -----Original Message-----
>> From: boost-bounces_at_[hidden] [mailto:boost-bounces_at_[hidden]] On Behalf Of David Abrahams
>> Sent: Monday, July 26, 2010 4:51 PM
>> To: boost_at_[hidden]
>> Subject: Re: [boost] accurate sum accumulator (kahan)
>> At Mon, 26 Jul 2010 16:20:16 +0100,
>> Paul A. Bristow wrote:
>>>> At Mon, 26 Jul 2010 14:59:29 +0100,
>>>> Paul A. Bristow wrote:
>>>>> PS As for naming, I'd vote for sum and accurate_sum (to keep backward compatibility).
>>>> It's possible I'm completely off the mark here, but I think maybe we
>>>> want sum and quick_sum.
>>> This would be ideal - but doesn't accumulator already use 'sum' so it would change behaviour of existing programs?
>> Technically, of course it would, but aren't they getting the wrong answer currently?
> Yes (a bit, sometimes, probably) - but quicker ;-)
> (Seriously - a change in accuracy might mean that tests just fail and this might cause more trouble than it was worth.
> This being floating point, even the Kahan method doesn't guarantee a *exact* answer, just a more accurate one).
> Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost