[Boost-bugs] [Boost C++ Libraries] #12527: cpp_bin_float: Anal fixation. Part 3. Double rounding when result of convert_to<double>() is a subnormal

From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2016-10-13 23:46:52


#12527: cpp_bin_float: Anal fixation. Part 3. Double rounding when result of
convert_to<double>() is a subnormal
------------------------------+----------------------------
 Reporter: Michael Shatz | Owner: johnmaddock
     Type: Bugs | Status: new
Milestone: To Be Determined | Component: multiprecision
  Version: Boost 1.62.0 | Severity: Problem
 Keywords: |
------------------------------+----------------------------
 When convert_to<double>() is applied to a number in the subnormal range,
 the value is first rounded to 53-bit precision and then rounded a second
 time, by the ldexp() routine, to the target precision, which is lower
 than 53 bits for subnormals.
 This is exactly the well-known double-rounding problem that makes it
 virtually impossible to produce 100% IEEE-754 compliant results with the
 Intel x87 FPU, where extended-precision results are rounded again when
 they are stored as double.

 Because of the double rounding, the resulting subnormal is not always the
 representable double-precision number closest to the original value, and
 in the case of a tie it is not always the even one.

 Exactly the same problem applies to convert_to<float>().

 The attached files demonstrate the problem and one possible workaround,
 not necessarily the fastest, but likely the simplest.
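
 The attachments are not reproduced in this archive. Purely as an
 illustration of what rounding once means here, and not as the reporter's
 workaround, the sketch below rounds an exact value mant * 2^exp2 directly
 to a multiple of 2^-1074 with round-half-to-even. The function name and
 the integer mantissa/exponent representation are assumptions made for
 the example; IEEE-754 binary64 doubles are assumed as well.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Round the exact value mant * 2^exp2 once, straight to the subnormal grid
// (multiples of 2^-1074), using round-half-to-even.
// Hypothetical helper for illustration only.
inline double round_once_to_subnormal(std::uint64_t mant, int exp2) {
    const int shift = -1074 - exp2;        // bits lying below one quantum
    assert(shift >= 11 && shift <= 63);    // keeps the quotient below 2^53

    std::uint64_t units = mant >> shift;   // whole quanta, truncated
    const std::uint64_t rem  = mant & ((std::uint64_t(1) << shift) - 1);
    const std::uint64_t half = std::uint64_t(1) << (shift - 1);

    // Round up when above the midpoint, or exactly on it with an odd quotient.
    if (rem > half || (rem == half && (units & 1u) != 0)) {
        ++units;
    }
    // units is exactly representable, so this scaling introduces no rounding.
    return std::ldexp(static_cast<double>(units), -1074);
}
```

 For the value (2.5 + 2^-61) * 2^-1074, passed as mant = 5 * 2^60 + 1 and
 exp2 = -1135, the remainder is above the midpoint, so the result is
 3 * 2^-1074. The sticky low bit is seen by the one and only rounding
 step instead of being discarded beforehand.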

-- 
Ticket URL: <https://svn.boost.org/trac/boost/ticket/12527>
