From: Paul A. Bristow (boost_at_[hidden])
Date: 2002-02-05 10:23:44
I finally found the full answer to this question in the two references below.
The bottom line is that, using the formulae below, you need MORE than
digits10 (which is the lower number, 15 for double) on input: for double the
input figure works out to 17, two more than digits10, which is the number of
digits guaranteed to be correct on output (15). (Incidentally, it would
avoid a lot of confusion if this value for INPUT (17) were part of
numeric_limits, alongside the digits10 provided for OUTPUT. Perhaps
numeric_limits<FPType>::in_digits10??)
W Kahan, in his 1996 report on the status of IEEE 754, gives two
significant-digit values for each format:
"
The lower value is at least floor((n - 1) * log10(2)) significant digits,
and the upper is ceil(1 + n * log10(2)) significant digits,
where n is the number of significand bits.
For float, floor((24 - 1) * 0.301) = 6 (digits10, correct for output)
and ceil(1 + 24 * 0.301) = 9 (needed for input);
for double, 15 and 17; for extended (long double?), 18 and 21; for
quadruple, 33 and 36.
For example, for float:
If a decimal string with at most the lower number of significant digits (6)
is converted to float precision and then converted back to the same number
of significant digits (6), the resulting string should match the original
6 digits.
If a float value is converted to a decimal string with at least the upper
number of significant digits (9), and then converted back to float, the
final number must match the original float value.
Note the asymmetry: the upper value is the number of significant digits
required for input, the lower is the number guaranteed on output.
"
extracted from
http://http.cs.berkley.edu/~wkahan/ieee754status/ieee754.ps (page 4, which
gives the significant digits for the real formats).
The following is also relevant.
ftp://ftp.ccs.neu.edu/pub/people/will/howtoread.ps
William D Clinger, "How to Read Floating Point Numbers Accurately", in
Proceedings of the 1990 ACM SIGPLAN Conference on Programming Language
Design and Implementation, pages 92-101.
Abstract: Consider the problem of converting decimal scientific notation for
a number into the best binary floating-point approximation to that number,
for some fixed precision. This problem cannot be solved using arithmetic of
any fixed precision. Hence the IEEE Standard for Binary Floating-Point
Arithmetic does not require the result of such a conversion to be the best
approximation.
This paper presents an efficient algorithm that always finds the best
approximation. The algorithm uses a few extra bits of precision to compute
an IEEE-conforming approximation while testing an intermediate result to
determine whether the approximation could be other than the best. If the
approximation might not be the best, then the best approximation is
determined by a few simple operations on multiple-precision integers, where
the precision is determined by the input. When using 64 bits of precision to
compute IEEE double precision results, the algorithm avoids higher-precision
arithmetic over 99% of the time.
> Actually I believe not even that is sufficient. Due to roundoff errors
> the
> last bit can still be incorrect. To be really accurate we might need
> a non-decimal format.
>
> Matthias
The above assumes that the compiler can read a decimal string correctly. I
believe that this is now true for most compilers (though I don't have any
firm evidence - perhaps boosters should carry out some tests?). The method
for the math constants (awaiting a new format suggestion from Michael
Kenniston - it looks excellent to me - to be posted by him) relies on this
assumption for portability. Any compiler which gets the wrong answer is
non-compliant.
This is MUCH simpler than trying to do things in "a non-decimal format".
Dr Paul A Bristow, hetp Chromatography
Prizet Farmhouse
Kendal, Cumbria
LA8 8AB UK
+44 1539 561830
Mobile +44 7714 33 02 04
mailto:pbristow_at_[hidden]
>
> -----Original Message-----
> From: Matthias Troyer [mailto:troyer_at_[hidden]]
> Sent: Tuesday, February 05, 2002 7:07 AM
> To: boost_at_[hidden]
> Subject: Re: [boost] a precision problem of lexical_cast
> >
> On Tuesday, February 5, 2002, at 05:10 AM, Jon Wang wrote:
>
> >
> > Maybe I've made a mistake. For double, the proper precision should be
> > 15 rather than 16.
> >
> > Mr Andy Koenig said "15 is not portable across all floating-point
> > implementations. However, it is portable across all implementations
> > that support IEEE floating-point arithmetic, which is most computers
> > that are in common use today. If you want to do better than that, you
> > might consider using numeric_limits<double>::digits10, which is the
> > number of significant base-10 digits that can be accurately represented
> > in a double." I really appreciate his help.
> >
> > So maybe we can make such improvements:
> >
> > #include <boost/limits.hpp>
> > //...
> > template<typename Target, typename Source>
> > Target lexical_cast(Source arg) {
> > //...
> > Target result;
> >
> > interpreter.precision(std::numeric_limits<Source>::digits10);
> > if( !(interpreter << arg) ||
> > !(interpreter >> result) ||
> > !(interpreter >> std::ws).eof())
> > //...
> > }
> >
> > And we can get the right result.
> >
>
> Actually I believe not even that is sufficient. Due to roundoff errors
> the
> last bit can still be incorrect. To be really accurate we might need
> a non-decimal format.
>
> Matthias
>
>
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk