# Boost :

From: Gennaro Prota (gennaro_prota_at_[hidden])
Date: 2003-06-21 13:44:44

On Fri, 20 Jun 2003 22:04:53 +0200 (CEST), Guillaume Melquiond
<gmelquio_at_[hidden]> wrote:

[...]
>I know this part of the standard. But it doesn't apply in the situation I
>was describing. I was describing the case of a constant whose decimal (and
>consequently binary) representation is not finite. It can be an irrational
>number like pi; but it can also simply be a rational like 1/3.

I was just trying to start from something certain, such as the standard.

The way I read the paragraph quoted above, "the scaled value" is the
value intended in the mathematical sense, not its truncated internal
representation. So the compiler must behave *as if* it considered all
the digits you provide and chose the nearest element (smaller or
larger - BTW C99 is different in this regard). In your example:

the constant is 1.00050001236454786005785305678....
you write it with seven digits: 1.000500
the floating-point format only uses 4 digits

you wouldn't write 1.000500 but, for instance, 1.00050001, and the
hypothetical base-10 implementation should then consider all the
digits even if it can store only 4 of them. Thus, if it picks the
nearest *larger* representable value, it chooses 1.001. Of course if it
picks the smaller one..., but that's another story.
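
To make the base-10 example above concrete, here is a minimal sketch that rounds the constant once (all digits considered) and twice (first to seven digits, then to four). It assumes round-to-nearest with ties going to the even neighbour, which is only one of the choices an implementation could make; `round_off_digits`, `rounded_once` and `rounded_twice` are hypothetical names introduced just for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: drop the last `drop` decimal digits of `n`,
// rounding to nearest, ties to the even neighbour (an assumption;
// the standard leaves the choice to the implementation).
std::uint64_t round_off_digits(std::uint64_t n, unsigned drop)
{
    if (drop == 0) return n;
    std::uint64_t scale = 1;
    for (unsigned i = 0; i < drop; ++i) scale *= 10;
    const std::uint64_t quotient = n / scale;
    const std::uint64_t remainder = n % scale;
    const std::uint64_t half = scale / 2;
    const bool round_up =
        remainder > half || (remainder == half && quotient % 2 != 0);
    return round_up ? quotient + 1 : quotient;
}

// First 12 significant digits of 1.00050001236..., stored as an integer.
const std::uint64_t constant_digits = 100050001236ULL;

// One rounding: 12 digits straight down to the 4 digits of the format.
std::uint64_t rounded_once()
{
    return round_off_digits(constant_digits, 8);
}

// Two roundings: 12 -> 7 digits (what gets typed), then 7 -> 4 digits.
std::uint64_t rounded_twice()
{
    return round_off_digits(round_off_digits(constant_digits, 5), 3);
}

void check_example()
{
    assert(rounded_once() == 1001);   // i.e. 1.001
    assert(rounded_twice() == 1000);  // i.e. 1.000
}
```

Under these assumptions the single rounding gives 1.001, while typing only 1.000500 leaves an exact tie that goes to 1.000.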

Just to understand each other: suppose I write

double d =
2348542582773833227889480596789337027375682548908319870707290971532209025114608443463698998384768703031934976.0;
// 2**360

The value is 2**360. On my implementation, where DBL_MAX_EXP is 1024
and FLT_RADIX is 2, isn't the compiler required to accept and
represent it exactly?
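
A quick way to check this on a given implementation could look like the sketch below. It assumes a radix-2 `double` with enough exponent range (as in IEEE 754 binary64) and a compiler that converts the literal as the paragraph above is read; `literal_is_exact_power_of_two` is a hypothetical name, and `std::ldexp(1.0, 360)` is used as the reference because it builds 2**360 without going through decimal at all.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Assumptions of this sketch: binary radix, and 2**360 within range.
// 2**360 == 2**(361 - 1), so DBL_MAX_EXP must be at least 361.
static_assert(FLT_RADIX == 2, "sketch assumes a radix-2 floating-point format");
static_assert(DBL_MAX_EXP >= 361, "2**360 must be within the range of double");

// Returns true if the decimal literal was converted to exactly 2**360.
bool literal_is_exact_power_of_two()
{
    const double d =
2348542582773833227889480596789337027375682548908319870707290971532209025114608443463698998384768703031934976.0;
    return d == std::ldexp(1.0, 360);
}

void check_literal()
{
    assert(literal_is_exact_power_of_two());
}
```

If the comparison ever failed, the implementation would not be storing the representable scaled value itself.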

>
>When manipulating such a number, you can only give a finite number of
>decimal digit. And so the phenomenon of double rounding I was describing
>will occur since you first do a rounding to have a finite number of
>digits and then the compiler do another rounding (which is described by
>the part of the standard you are quoting) to fit the constant in
>the floating-point format.

I'm not much of an expert in this area. Can you give a real example
(constant given in decimal and internal representation non-decimal)?

[...]
>
>> "chosen in an implementation-defined manner" above could simply mean
>> "randomly" as long as the fact is documented.
>
>The fact that it does it randomly is another problem. Even if it was not
>random but perfectly known (for example round-to-nearest-even like in the
>IEEE-754 standard), it wouldn't change anything: it would still be a
>second rounding. As I said, it is more of an arithmetic problem than of a
>compilation problem.

But can you give a concrete example?

[...]
>> float x = 1.2f;
>> assert(x == 1.2);
>>
>> fails on most machines.
>
>Yes, but it's not what I was talking about. I hope it's a bit more clear
>now.

This was the result of some sloppy editing :-) Actually it was (with
an f suffix in both occurrences of the literal) in a little digression
about the fact that floating literals can be evaluated with more
precision than their corresponding type. But the last time I read about
this it was in the C99 standard, and I don't remember whether a similar
freedom is explicitly granted by C++ too. In short, ignore the
example; I should have erased it together with the digression.

Genny.