Boost logo

Boost Users :

Subject: Re: [Boost-users] converting to float128 from double
From: John Maddock (jz.maddock_at_[hidden])
Date: 2016-05-04 11:49:07


On 04/05/2016 13:47, Cooper, Bridgette R D wrote:
> Hi John,
>
> I tried again. I compile pretty much exactly the function you wrote:

Bridgette, without the input values I still can't go about reproducing
here - I need a *complete* test case plus the matching FORTRAN code.

John.

>
> #include <boost/multiprecision/float128.hpp>
> #include <boost/math/cstdfloat/cstdfloat_types.hpp>
> #include <boost/math/cstdfloat/cstdfloat_limits.hpp>
> #include <boost/math/cstdfloat/cstdfloat_cmath.hpp>
> #include <boost/math/cstdfloat/cstdfloat_iostream.hpp>
> extern "C"{
> #include <quadmath.h>
> }
> #include <iostream>
> #include <fstream>
> #include <sstream>
> #include <vector>
> #include <iterator>
>
> using namespace std;
> using namespace boost::multiprecision;
> double stieltjes_gamma(const std::vector<double> & e8, const std::vector<double> & g8, int printlevel)
> {
> assert(e8.size() == g8.size()); // precondition?
> std::vector<float128> epoint(e8.begin(), e8.end()),
> gpoint(g8.begin(), g8.end()), bcoef(e8.size());
>
> bcoef[0] = 0.0Q;
> for(int i = 0; i < e8.size(); i++)
> {
> bcoef[0] += gpoint[i];
> }
> std::cout <<
> std::setprecision(std::numeric_limits<float128>::max_digits10) <<
> bcoef.at(0) << std::endl;
> return (0.0);
> }
>
> The answer is still:
>
> 0.0345217724606016853085032291117978629
>
> Whereas the fortran gives the answer to be:
>
> 3.452175962187407162907246287483402E-0002
>
> I really appreciate you taking the time to respond to my emails. I really don't understand the discrepancy.
>
> I have also tried things like:
>
> epoint.at(i)=static_cast<float128>(e8.at(i));
>
> But still cannot reproduce the fortran value.
>
> I also have code that can reproduce the quad precision sqrt to match fortran code. Firstly, I have a c++ routine that calculates the sqrt (a^2+b^2) in quad precision without using the library. This reproduces the quad value from fortran, and the quad value of sqrt function using boost library. However, if the explicit type of a or b is not appended with Q I don't reproduce the correct answer, which I can understand. But what if I wanted to start with a double value of a and b, cast them up to new variables c d in float128, and want the square root of the quad numbers. How do I do this casting so that I don't get just the result casted up to a quad precision number, the sqrtq function is called?
>
> Thanks again for all your help.
> ________________________________________
> From: Boost-users <boost-users-bounces_at_[hidden]> on behalf of John Maddock <jz.maddock_at_[hidden]>
> Sent: Wednesday, May 4, 2016 12:29:31 PM
> To: boost-users_at_[hidden]
> Subject: Re: [Boost-users] converting to float128 from double
>
> On 03/05/2016 20:58, Cooper, Bridgette R D wrote:
>> Hi John,
>>
>> No, my code is several hundreds of lines long. I extracted what I thought would be quite self contained, but yeah you are right about the return type mis-match.
>>
>> I am not sure that this helps. What I really want is to do a lot of computation in float128 precision, and for some reason when I cast up a double vector to a float128 vector all the operations on the float128 vector are happening a double precision, and the cast up only happens at the final stage.
>>
>> I've changed the initiation of epoint and gpoint vectors to the same as your code snippet and still the same problem.
>>
>> So for example, if I accumulate the original double vector without casting the values to float128 type and accumulate into a float128 variable, the cast up only happens to the result. This is the same number that I see as the result of accumulating the float128vector values.
>>
>> Hope this clarifies things a bit
> Not really :(
>
> I think you're going to have to try to produce a self contained test
> case - my guess is you'll spot the error in the process of doing that?
>
> Best, John.
>
>> ________________________________________
>> From: Boost-users <boost-users-bounces_at_[hidden]> on behalf of John Maddock <jz.maddock_at_[hidden]>
>> Sent: Tuesday, May 3, 2016 6:27:39 PM
>> To: boost-users_at_[hidden]
>> Subject: Re: [Boost-users] converting to float128 from double
>>
>> On 03/05/2016 17:35, Cooper, Bridgette R D wrote:
>>> Hi John,
>>>
>>> Thanks for the response.
>>>
>>> Here are some more snippets from my code.
>> I don't see any obvious errors, but in any case you must have extracted
>> that from something else but it doesn't compile!
>>
>> Not least you can't implicitly return from a float128 to a double in
>>
>> return(bcoef.at(0))
>>
>> So if that's compiling for you, then I suspect your declaration of bcoef.
>>
>> If I re-write and simplify in a slightly more C++ idiom how does it
>> compare to your fortran code now:
>>
>> double stieltjes_gamma(const std::vector<double> & e8, const
>> std::vector<double> & g8, int printlevel)
>> {
>> assert(e8.size() == g8.size()); // precondition?
>> std::vector<float128> epoint(e8.begin(), e8.end()),
>> gpoint(g8.begin(), g8.end()), bcoef(e8.size());
>>
>> bcoef[0] = 0.0Q;
>> for(int i = 0; i < e8.size(); i++)
>> {
>> bcoef[0] += gpoint[i];
>> }
>> std::cout <<
>> std::setprecision(std::numeric_limits<float128>::max_digits10) <<
>> bcoef.at(0) << std::endl;
>> return static_cast<double>(bcoef[0]);
>> }
>>
>> And finally... note that summing at higher precision can not actually
>> protect you from cancellation error (in any language) since the error is
>> inherent in the input values.
>>
>> HTH, John.
>>
>>
>>> #include <boost/multiprecision/float128.hpp>
>>> #include <boost/math/cstdfloat/cstdfloat_types.hpp>
>>> #include <boost/math/cstdfloat/cstdfloat_limits.hpp>
>>> #include <boost/math/cstdfloat/cstdfloat_cmath.hpp>
>>> #include <boost/math/cstdfloat/cstdfloat_iostream.hpp>
>>> #include <boost/multiprecision/detail/float_string_cvt.hpp>
>>> #include <boost/multiprecision/detail/generic_interconvert.hpp>
>>> extern "C"{
>>> #include <quadmath.h>
>>> }
>>> #include <iostream>
>>> #include <sstream>
>>> #include <vector>
>>> #include <iterator>
>>>
>>> using namespace boost::multi precision;
>>>
>>>
>>> double stieltjes_gamma(int num, const std::vector<double> & e8, const std::vector<double> & g8, int printlevel )
>>> {
>>>
>>> std::vector<float128> epoint(num), gpoint(num), bcoef(num);
>>>
>>>
>>>
>>> for (int i=0; i < num; i++)
>>> {
>>> epoint.at(i)=(float128(e8.at(i)));
>>> gpoint.at(i)=(float128(g8.at(i)));
>>> }
>>>
>>> bcoef.at(0)=0.0Q;
>>> for (int i = 0; i < num; i++)
>>> {
>>> bcoef.at(0)+=gpoint.at(i)
>>> }
>>> std::cout << std::setprecision(std::numeric_limits<float128>::max_digits10) << bcoef.at(0)<< std::endl;
>>> return(bcoef.at(0))
>>> }
>>>
>>> When I compare this to fortran code that does the same I don't see the same value for bcoef.at(0).
>>>
>>> 0.0345217724606016853085032291117978629 <- from above c++ code
>>> 3.452175962187407162907246287483402E-0002 <- Fortran correctly utilising quad precision on same input vector
>>>
>>> The input vectors in this case are 7976 so errors in precision are really quickly apparrant. If I print the gpoint vector, it gets filled with the same garbage for additional precision from double as is for the fortran.
>>>
>>> Thanks,
>>> ________________________________________
>>> From: Boost-users <boost-users-bounces_at_[hidden]> on behalf of John Maddock <jz.maddock_at_[hidden]>
>>> Sent: Tuesday, May 3, 2016 5:06:47 PM
>>> To: boost-users_at_[hidden]
>>> Subject: Re: [Boost-users] converting to float128 from double
>>>
>>> On 03/05/2016 13:44, Cooper, Bridgette R D wrote:
>>>> Hi,
>>>>
>>>>
>>>> I have a function that takes as an argument a vector of doubles and
>>>> tries to convert it to a vector of float128.
>>>>
>>>>
>>>> When I accumulate the values in the float128 vector it looks like it
>>>> does double additions and only casts up at the last step.
>>>>
>>>>
>>>> The initial vector is defined as conststd::vector<double>
>>>>
>>>> The float128 vector is defined as std::vector<float128> epoint(num)
>>>>
>>>>
>>>> And I try to do the casting (inside a loop) via:
>>>> epoint.at(i)=(float128(e8.at(i)));
>>>>
>>>>
>>>> If I do the accumulation on the original vector with no casting into a
>>>> float128, I get the same as accumulating the vector that should be of
>>>> float128 type.
>>>>
>>>>
>>>> How should I be doing the casting?
>>>>
>>> Sorry but you'll have to post way more code than that to see where the
>>> mistake is.
>>>
>>> John.
>>>
>>>> Thanks,
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Boost-users mailing list
>>>> Boost-users_at_[hidden]
>>>> http://lists.boost.org/mailman/listinfo.cgi/boost-users
>>> _______________________________________________
>>> Boost-users mailing list
>>> Boost-users_at_[hidden]
>>> http://lists.boost.org/mailman/listinfo.cgi/boost-users
>>> _______________________________________________
>>> Boost-users mailing list
>>> Boost-users_at_[hidden]
>>> http://lists.boost.org/mailman/listinfo.cgi/boost-users
>>>
>> _______________________________________________
>> Boost-users mailing list
>> Boost-users_at_[hidden]
>> http://lists.boost.org/mailman/listinfo.cgi/boost-users
>> _______________________________________________
>> Boost-users mailing list
>> Boost-users_at_[hidden]
>> http://lists.boost.org/mailman/listinfo.cgi/boost-users
>>
> _______________________________________________
> Boost-users mailing list
> Boost-users_at_[hidden]
> http://lists.boost.org/mailman/listinfo.cgi/boost-users
> _______________________________________________
> Boost-users mailing list
> Boost-users_at_[hidden]
> http://lists.boost.org/mailman/listinfo.cgi/boost-users
>


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net