Subject: [Boost-bugs] [Boost C++ Libraries] #12527: cpp_bin_float: Anal fixation. Part 3. Double rounding when result of convert_to<double>() is a subnormal
From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2016-10-13 23:46:52
#12527: cpp_bin_float: Anal fixation. Part 3. Double rounding when result of
convert_to<double>() is a subnormal
------------------------------+----------------------------
Reporter: Michael Shatz | Owner: johnmaddock
Type: Bugs | Status: new
Milestone: To Be Determined | Component: multiprecision
Version: Boost 1.62.0 | Severity: Problem
Keywords: |
------------------------------+----------------------------
When convert_to<double>() is applied to numbers in the subnormal range, the
value is first rounded to 53-bit precision and then rounded again, by the
ldexp() routine, to the target precision, which is lower than 53 bits.
The problem is exactly the same as the well-known problem that makes it
virtually impossible to produce 100% IEEE-754 compliant results with Intel
x87 FPU.
Because of double rounding, resulting subnormals are not always the
closest representable double-precision numbers to the original value
or, in case of tie, they are not always even.
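The effect can be reproduced without Boost. The following is a minimal
sketch, not the code from the attached files: it models a wide value as a
64-bit integer mantissa m times 2^e, and compares a two-step conversion
(round to 53 bits, then ldexp()) with a single rounding step. The names
naive_convert and subnormal_convert are hypothetical, and IEEE-754 doubles
with the default round-to-nearest mode are assumed.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Naive conversion of m * 2^e: round m to 53 bits, then scale.
// In the subnormal range std::ldexp() has to round a second time.
inline double naive_convert(std::uint64_t m, int e) {
    const double d = static_cast<double>(m);  // first rounding: 64 -> 53 bits
    return std::ldexp(d, e);                  // second rounding: 53 -> fewer bits
}

// Hypothetical helper: converts m * 2^e with a single round-to-nearest-even
// step. Precondition (assumed, only partly checked): the exact value is
// below 2^-1022, so the result is zero, subnormal, or the smallest normal.
inline double subnormal_convert(std::uint64_t m, int e) {
    // Weight of the last bit of a subnormal double: -1021 - 53 = -1074.
    const int lsb_exp = std::numeric_limits<double>::min_exponent
                      - std::numeric_limits<double>::digits;
    const int shift = lsb_exp - e;    // number of low bits of m that do not fit
    if (shift <= 0)                   // nothing to discard; exact under the precondition
        return std::ldexp(static_cast<double>(m), e);
    if (shift > 64)                   // value < 2^-1075, below half of denorm_min
        return 0.0;

    const std::uint64_t half = std::uint64_t(1) << (shift - 1);
    std::uint64_t q, rem;
    if (shift == 64) {                // avoid an undefined 64-bit shift
        q = 0;
        rem = m;
    } else {
        q = m >> shift;
        rem = m & ((std::uint64_t(1) << shift) - 1);
    }
    if (rem > half || (rem == half && (q & 1)))  // round to nearest, ties to even
        ++q;
    assert(q <= (std::uint64_t(1) << 52));       // result must not exceed 2^-1022
    return std::ldexp(static_cast<double>(q), lsb_exp);  // exact scaling
}
```

For m = 2^60 + 1 and e = -1135 the exact value is slightly above half of the
smallest subnormal. naive_convert first rounds m to 2^60, which turns the
value into an exact tie that ldexp() then rounds to the even neighbour, 0.
subnormal_convert sees the discarded low bit and returns the smallest
subnormal, which is the closest representable double.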
Exactly the same problem applies to convert_to<float>().
The attached files demonstrate the problem and one possible workaround,
not necessarily the fastest, but likely the simplest.
--
Ticket URL: <https://svn.boost.org/trac/boost/ticket/12527>
Boost C++ Libraries <http://www.boost.org/>
Boost provides free peer-reviewed portable C++ source libraries.