 # Boost :

From: Guillaume Melquiond (gmelquio_at_[hidden])
Date: 2003-06-20 15:04:53

On Fri, 20 Jun 2003, Gennaro Prota wrote:

> >> | [*] It is not even true. Due to "double rounding" troubles,
> >> | using a higher precision can lead to a value that is not the
> >> | nearest number.
> >>
> >> Is this true even when you have a few more digits than necessary?
> >> Kahan's article suggested to me that adding two guard decimal digits
> >> avoids this problem. This is why 40 was chosen.
> >
> >I don't know if we are speaking about the same thing.
>
> I don't know either. What I know is the way floating literals should
> work:
>
> A floating literal consists of an integer part, a decimal point,
> a fraction part, an e or E, an optionally signed integer exponent,
> and an optional type suffix. [...]
> If the scaled value is in the range of representable values for its
> type, the result is the scaled value if representable, else the
> larger or smaller representable value nearest the scaled value,
> chosen in an implementation-defined manner.

I know this part of the standard. But it doesn't apply in the situation I
was describing. I was describing the case of a constant whose decimal (and
consequently binary) representation is not finite. It can be an irrational
number like pi; but it can also simply be a rational like 1/3.

When manipulating such a number, you can only give a finite number of
decimal digits. So the double rounding I was describing will occur: you
first round the value to a finite number of digits, and then the compiler
does another rounding (the one described by the part of the standard you
are quoting) to fit the constant into the floating-point format.
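
To make the two roundings concrete, here is a toy sketch. It is not the
example from the previous mail. It assumes an invented target format that
only holds multiples of 1/8 and an invented constant 0.18749; the names
round_half_even, direct_eighths and twice_eighths are hypothetical. Rounding
the constant once gives 1/8, but rounding it to four decimal digits first
lands exactly on the midpoint 0.1875, and the second rounding then goes to
2/8.

```cpp
#include <cstdint>

// Round the non-negative rational n/d to the nearest integer, ties to even.
constexpr std::int64_t round_half_even(std::int64_t n, std::int64_t d) {
    std::int64_t q = n / d;
    std::int64_t r = n % d;
    if (2 * r > d || (2 * r == d && (q % 2) != 0)) ++q;
    return q;
}

// Toy constant (an assumption for illustration): 0.18749 = 18749 / 100000.
constexpr std::int64_t kNum = 18749;
constexpr std::int64_t kDen = 100000;

// One rounding: the exact value goes straight to the grid of eighths.
constexpr std::int64_t direct_eighths() {
    return round_half_even(kNum * 8, kDen);          // 1.49992 -> 1
}

// Two roundings: first to 4 decimal digits, then to the grid of eighths.
constexpr std::int64_t twice_eighths() {
    std::int64_t four_digits = round_half_even(kNum, 10);  // 1874.9 -> 1875
    return round_half_even(four_digits * 8, 10000);        // 1.5 -> 2 (tie)
}

static_assert(direct_eighths() == 1, "single rounding gives 1/8");
static_assert(twice_eighths() == 2, "double rounding gives 2/8");
```

The nearest grid point to 0.18749 is 1/8, so the doubly rounded result 2/8
is not the nearest representable value, even though each individual
rounding was done correctly.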

Just look one more time at the example I was giving in the previous mail.

> Of course "the nearest" means nearest to what you've actually written.
> Also, AFAICS, there's no requirement that any representable value can
> be written as a (decimal) string literal. And, theoretically, the

I have never seen a computer with unrepresentable values. It would require
the machine to manipulate numbers in a radix that is not of the form
2^p*5^q (not many computers use 3).

> "chosen in an implementation-defined manner" above could simply mean
> "randomly" as long as the fact is documented.

The fact that it does it randomly is another problem. Even if it was not
random but perfectly known (for example round-to-nearest-even like in the
IEEE-754 standard), it wouldn't change anything: it would still be a
second rounding. As I said, it is more of an arithmetic problem than of a
compilation problem.

> Now, I don't even get you when you say "more digits than necessary".

What I wanted to say is that writing too many decimal digits of a number
doesn't improve the precision of the constant. It can degrade it due to
the double rounding. In conclusion, when you have a constant, it is better
to give an exact representation of the nearest floating-point number
rather than writing it with 40 decimal digits. By doing that, the compiler
cannot do the second rounding: there is only one rounding (the one you
did) and you are safe.
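
One way to write such an exact representation today is a hexadecimal
floating literal (available in C99 and, for C++, since C++17, so not in the
C++ of this thread's time). The sketch below assumes float and double are
IEEE-754 binary32 and binary64; the names third_f and third_d are
hypothetical. Each literal spells out every bit of the nearest value to 1/3,
so the compiler has nothing left to round.

```cpp
// Nearest binary32 value to 1/3, written exactly (assumes IEEE-754 float).
constexpr float third_f = 0x1.555556p-2f;

// Nearest binary64 value to 1/3, written exactly (assumes IEEE-754 double).
constexpr double third_d = 0x1.5555555555555p-2;

// Sanity checks: correctly rounded division yields the same values.
static_assert(third_f == 1.0f / 3.0f, "float literal is the nearest float to 1/3");
static_assert(third_d == 1.0 / 3.0, "double literal is the nearest double to 1/3");
```

The rounding from 1/3 to the target format was done once, by whoever
computed the hexadecimal digits, which is the single rounding described
above.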

> One thing is the number of digits you provide in the literal, one
> other thing is what can effectively be stored in an object. I think
> you all know that something as simple as
>
> float x = 1.2f;
> assert(x == 1.2);
>
> fails on most machines.

Yes, but that's not what I was talking about. I hope it's a bit clearer
now.

Guillaume