Re: [Boost-bugs] [Boost C++ Libraries] #12527: cpp_bin_float: Anal fixation. Part 3. Double rounding when result of convert_to<double>() is a subnormal

Date view	Thread view	Subject view	Author view

Subject: Re: [Boost-bugs] [Boost C++ Libraries] #12527: cpp_bin_float: Anal fixation. Part 3. Double rounding when result of convert_to<double>() is a subnormal
From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2016-11-14 01:07:19

Next message: Boost C++ Libraries: "[Boost-bugs] [Boost C++ Libraries] #12606: Boost.ASIO not throwing EOF exception when using coroutine reading from pipe"
Previous message: Boost C++ Libraries: "[Boost-bugs] [Boost C++ Libraries] #12605: libraries/fc/src/thread/contet.hpp:215:5 error: no type named 'fcontext_t' in namespace 'boost::c'; did you mean 'context'?"
In reply to: Boost C++ Libraries: "[Boost-bugs] [Boost C++ Libraries] #12527: cpp_bin_float: Anal fixation. Part 3. Double rounding when result of convert_to<double>() is a subnormal"
Next in thread: Boost C++ Libraries: "Re: [Boost-bugs] [Boost C++ Libraries] #12527: cpp_bin_float: Anal fixation. Part 3. Double rounding when result of convert_to<double>() is a subnormal"

#12527: cpp_bin_float: Anal fixation. Part 3. Double rounding when result of
convert_to<double>() is a subnormal
-------------------------------+----------------------------
  Reporter: Michael Shatz | Owner: johnmaddock
      Type: Bugs | Status: reopened
Milestone: To Be Determined | Component: multiprecision
   Version: Boost 1.62.0 | Severity: Problem
Resolution: | Keywords:
-------------------------------+----------------------------

Comment (by Michael Shatz):

I did a little more research and now I know why your new version is slow.
That's because on both of platforms that I did the tests (x64/gcc 5.3.0
on MINGW and x64/msvc 19.00.24210) library function std::ldexp() is much
slower than can be reasonably expected.
So, for not to wide cpp_bin_float types, execution time of convert_to<>()
is almost directly proportional to number of ldexp() calls.

There are circumstances when ldexp() is inlined by compiler and then it
runs several times faster, but so far I was unable to figure out what they
are.

Below is a variant of eval_convert_to() that is ideologically similar to
your code - all rounding is done with integer arithmetic, but unlike you
code it calls ldexp() exactly once. On my test platforms it is ~4 times
faster than yours.
Unfortunately, it is not complete: it only handles cases in which
std::numeric_limits<Float>::digits <= 62. Thus, works very well for IEEE
double and float, but even x87 'long double' is too much, as well as
anything wider than that.

Nevertheless, look at it. Even if you can't extend it to wider types,
IMHO, IEEE double and float are sufficiently important to be special-
cased.

{{{
template <class Float, unsigned Digits, digit_base_type DigitBase, class
Allocator, class Exponent, Exponent MinE, Exponent MaxE>
inline typename boost::enable_if_c<boost::is_float<Float>::value>::type
eval_convert_to(Float *res, const cpp_bin_float<Digits, DigitBase,
Allocator, Exponent, MinE, MaxE> &original_arg)
{
   Float ret = 0;
   switch (eval_fpclassify(original_arg)) {
     case FP_INFINITE:
       ret = std::numeric_limits<Float>::infinity();
       break;

     case FP_NAN:
       *res = std::numeric_limits<Float>::quiet_NaN();
       return;

     case FP_ZERO:
       break;

     default:
     { // == normal, because cpp_bin_float does not support subnormal
       if (original_arg.exponent() >=
std::numeric_limits<Float>::min_exponent -
std::numeric_limits<Float>::digits - 1) {
         if (original_arg.exponent() <=
std::numeric_limits<Float>::max_exponent-1) {
           int e = original_arg.exponent();
           cpp_bin_float<Digits, DigitBase, Allocator, Exponent, MinE,
MaxE> y = original_arg;
           y.exponent() = 63;
           y.sign() = 0;

           uint64_t bits;
           eval_convert_to(&bits, y);

           int nhbits = std::numeric_limits<Float>::digits;
           if (e < std::numeric_limits<Float>::min_exponent-1)
             nhbits = e - std::numeric_limits<Float>::min_exponent + 1 +
std::numeric_limits<Float>::digits;
           // round
           uint64_t lbits = bits << nhbits;
           uint64_t hbits = (bits >> 1) >> (63-nhbits); // shift by
(64-nhbits) that works for nhbits in range [0..63]
           lbits |= (hbits & 1); // assure that tie
rounded to even
           const uint64_t BIT63 = uint64_t(1) << 63;
           if (lbits == BIT63) {
             cpp_bin_float<Digits, DigitBase, Allocator, Exponent, MinE,
MaxE> truncY;
             truncY = bits;
             if (!eval_eq(truncY, y))
               lbits |= 1;
           }
           hbits += (lbits > BIT63);
           ret =
std::ldexp(static_cast<Float>(static_cast<int64_t>(hbits)), e + 1 -
nhbits);
         } else {
           // overflow
           ret = std::numeric_limits<Float>::infinity();
         }
       }
     }
       break;
   }
   *res = original_arg.sign() ? -ret : ret;
}

}}}

-- 
Ticket URL: <https://svn.boost.org/trac/boost/ticket/12527#comment:15>
Boost C++ Libraries <http://www.boost.org/>
Boost provides free peer-reviewed portable C++ source libraries.

Date view	Thread view	Subject view	Author view

This archive was generated by hypermail 2.1.7 : 2017-02-16 18:50:20 UTC