Re: [Boost-bugs] [Boost C++ Libraries] #3408: gamma distribution's quantile performance

Date view	Thread view	Subject view	Author view

Subject: Re: [Boost-bugs] [Boost C++ Libraries] #3408: gamma distribution's quantile performance
From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2009-09-24 11:53:11

Next message: Boost C++ Libraries: "Re: [Boost-bugs] [Boost C++ Libraries] #3408: gamma distribution's quantile performance"
Previous message: Boost C++ Libraries: "Re: [Boost-bugs] [Boost C++ Libraries] #2889: Unit Test framework does things that OS X's malloc dislikes"
In reply to: Boost C++ Libraries: "[Boost-bugs] [Boost C++ Libraries] #3408: gamma distribution's quantile performance"
Next in thread: Boost C++ Libraries: "Re: [Boost-bugs] [Boost C++ Libraries] #3408: gamma distribution's quantile performance"

#3408: gamma distribution's quantile performance
---------------------------------+------------------------------------------
Reporter: chhenning@â€¦ | Owner: johnmaddock
Type: Tasks | Status: assigned
Milestone: Boost 1.41.0 | Component: math
Version: Boost 1.40.0 | Severity: Optimization
Keywords: |
---------------------------------+------------------------------------------

Comment(by johnmaddock):

I thought I'd add a short update: I'm concentrating on the igamma and
it's inverse functions at this stage, as ultimately these are what the
stats functions of both libraries depend upon.

Accuracy wise the two libries are broadly similar:

Tests run with Microsoft Visual C++ version 9.0, Dinkumware standard
library version 505, Win32
Testing tgamma(a, z) medium values with type double
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
boost::math::gamma_q<double> Max = 23.69 RMS Mean=3.955
     worst case at row: 600
     { 91.615684509277344, 183.23136901855469, 6.0094745549395681e+125,
2.5194958887453263e-014, 2.3851892681325433e+139, 0.9999999999999748 }

other::gamma_q(double) Max = 125.1 RMS Mean=9.671
     worst case at row: 27
     { 4.053, 405.3, 8.563e-169, 1.334e-169, 6.418, 1 }
boost::math::gamma_p<double> Max = 35.09 RMS Mean=6.961
     worst case at row: 651
     { 96.78564453125, 0.96785646677017212, 3.7245981312945362e+149, 1,
0.00016783328089080209, 4.5060775679568222e-154 }

other::gamma_p(double) Max = 426.3 RMS Mean=46.18
     worst case at row: 518
     { 80.13, 0.8013, 1.566e+117, 1, 1.104e-010, 7.05e-128 }

Testing tgamma(a, z) small values with type double
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
boost::math::gamma_q<double> Max = 2.898 RMS Mean=1.05
     worst case at row: 73
     { 1.0221641311147778e-009, 1.0221641311147778e-009,
20.124127878209329, 2.0570161699209098e-008, 978316444.88368392,
0.99999997942983831 }

other::gamma_q(double) Max = 31.86 RMS Mean=7.103
     worst case at row: 223
     { 0.001962, 100, 3.717e-046, 7.303e-049, 509, 1 }
boost::math::gamma_p<double> Max = 0.5133 RMS Mean=0.03234
     worst case at row: 226
     { 0.0055537549778819084, 0.0049983793869614601, 4.6545082571173033,
0.025932342985794662, 174.83209885998627, 0.97406765701420539 }

other::gamma_p(double) Max = 0.7172 RMS Mean=0.2328
     worst case at row: 245
     { 0.05124, 0.0005124, 5.75, 0.3029, 13.24, 0.6971 }

Testing tgamma(a, z) large values with type double
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
boost::math::gamma_q<double> Max = 469.8 RMS Mean=31.51
     worst case at row: 222
     { 2057.796630859375, 4115.59326171875, 0.1184802275824791,
5.1589245564542783e-277, 0.22966070987459913, 1 }

other::gamma_q(double) Max = 310.7 RMS Mean=35.14
     worst case at row: 277
     { 7.884e+005, 7.884e+005, 0.1624, 0.499, 0.1631, 0.501 }
boost::math::gamma_p<double> Max = 244.4 RMS Mean=19.38
     worst case at row: 211
     { 1169.2916259765625, 584.64581298828125, 0.22117898880238315, 1,
0.42515811737132952, 1.9222355598668354e-100 }

other::gamma_p(double) Max = 310.3 RMS Mean=36.91
     worst case at row: 277
     { 7.884e+005, 7.884e+005, 0.1624, 0.499, 0.1631, 0.501 }

Testing tgamma(a, z) integer and half integer values with type double
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
boost::math::gamma_q<double> Max = 8.485 RMS Mean=1.351
     worst case at row: 138
     { 37, 74, 2.7177865101844111e+035, 7.306008776118165e-007,
3.7199305501125022e+041, 0.99999926939912243 }

other::gamma_q(double) Max = 103.1 RMS Mean=12.59
     worst case at row: 20
     { 4.5, 450, 7.196e-187, 6.187e-188, 11.63, 1 }
boost::math::gamma_p<double> Max = 13.03 RMS Mean=2.932
     worst case at row: 133
     { 37, 0.37000000476837158, 3.7199332678990125e+041, 1,
1.9898558488031992e-018, 5.3491708197417564e-060 }

other::gamma_p(double) Max = 123.4 RMS Mean=20.06
     worst case at row: 91
     { 25, 0.25, 6.204e+023, 1, 2.794e-017, 4.503e-041 }

So for the most part the Boost version can be a bit more accurate.

And for the inverse:

Tests run with Microsoft Visual C++ version 9.0, Dinkumware standard
library version 505, Win32
Testing incomplete gamma inverse(a, z) medium values with type double
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
boost::math::gamma_p_inv<double> Max = 1.652 RMS Mean=0.3353
     worst case at row: 2
     { 9.7540397644042969, 0.22103404998779297, 7.2622991072035479,
11.97587613793354 }

boost::math::gamma_q_inv<double> Max = 2.254 RMS Mean=0.3822
     worst case at row: 49
     { 22.103404998779297, 0.96886777877807617, 31.653035526093614,
14.199388115898186 }

other::gamma_p_inv<double> Max = 9.755 RMS Mean=2.085
     worst case at row: 23
     { 13.547700881958008, 0.30816704034805298, 11.480744407137443,
15.117986636057008 }

other::gamma_q_inv<double> Max = 24.23 RMS Mean=2.871
     worst case at row: 49
     { 22.103404998779297, 0.96886777877807617, 31.653035526093614,
14.199388115898186 }

Testing incomplete gamma inverse(a, z) large values with type double
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
boost::math::gamma_p_inv<double> Max = 0.9242 RMS Mean=0.1082
     worst case at row: 10
     { 581.3642578125, 0.12698681652545929, 553.96697937820386,
608.96241888374834 }

boost::math::gamma_q_inv<double> Max = 0.8143 RMS Mean=0.1072
     worst case at row: 70
     { 40010.84375, 0.12698681652545929, 39782.764004009827,
40239.124371052443 }

other::gamma_p_inv<double> Max = 1.22 RMS Mean=0.4401
     worst case at row: 88
     { 106978.296875, 0.9133758544921875, 107424.0055670203,
106533.15792214176 }

other::gamma_q_inv<double> Max = 1.07 RMS Mean=0.4135
     worst case at row: 30
     { 3758.09765625, 0.12698681652545929, 3688.2692219050818,
3828.1269667349475 }

Testing incomplete gamma inverse(a, z) small values with type double
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
boost::math::gamma_p_inv<double> Max = 1163 RMS Mean=103.7
     worst case at row: 86
     { 0.000398525211494416, 0.83500856161117554, 1.7877325072629434e-197,
0 }

boost::math::gamma_q_inv<double> Max = 962.6 RMS Mean=82.05
     worst case at row: 82
     { 0.000398525211494416, 0.22103404998779297, 0,
3.4835777677025903e-273 }

other::gamma_p_inv<double> Max = 1229 RMS Mean=108
     worst case at row: 86
     { 0.000398525211494416, 0.83500856161117554, 1.7877325072629434e-197,
0 }

other::gamma_q_inv<double> Max = 1145 RMS Mean=135.6
     worst case at row: 80
     { 0.000398525211494416, 0.12698681652545929, 0,
5.6992492776090963e-149 }

So broadly similar again.

Time wise, I've modified our performance tests to test dcdflib as well and
get:

Testing igamma 8.611e-007
Testing igamma-dcd 5.849e-007
Testing igamma_inv 3.626e-006
Testing igamma_inv-dcd 1.810e-006

which confirms what you're seeing.

After the first round of changes/optimisations (now in Trunk) I get:

Testing igamma 7.067e-007
Testing igamma-dcd 5.849e-007
Testing igamma_inv 3.003e-006
Testing igamma_inv-dcd 1.855e-006

So a bit better, but still some way to go, unfortunately many of the
remaining differences appear to come from either:
* Not such good optimisations when calling boilerplate code (external
"inline" routines) as compared to the declared-all-inline spagetti code of
the original fortran.
* Apparently a lower overhead in calling the functions and going through
the algorithm-selection logic in the dcd code.
However, I'm not particularly keen to replace the current, almost
readable, structured code, with a bunch of goto's and spagetti code :-(

Still looking for more low hanging fruit yours, John.

-- 
Ticket URL: <https://svn.boost.org/trac/boost/ticket/3408#comment:5>
Boost C++ Libraries <http://www.boost.org/>
Boost provides free peer-reviewed portable C++ source libraries.

Date view	Thread view	Subject view	Author view

This archive was generated by hypermail 2.1.7 : 2017-02-16 18:50:01 UTC