Re: [Boost-bugs] [Boost C++ Libraries] #12688: Boost.Accumulators test/median.cpp testing method is flawed

Subject: Re: [Boost-bugs] [Boost C++ Libraries] #12688: Boost.Accumulators test/median.cpp testing method is flawed
From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2016-12-14 17:43:24


#12688: Boost.Accumulators test/median.cpp testing method is flawed
-------------------------------------+-------------------------------------
  Reporter:  A. Sinan Unur           |      Owner:  eric_niebler
  <sinan@…>                          |     Status:  new
      Type:  Bugs                    |  Component:  accumulator
 Milestone:  To Be Determined        |   Severity:  Problem
   Version:  Boost 1.62.0            |   Keywords:  testing, median,
Resolution:                          |  algorithm, accumulator
-------------------------------------+-------------------------------------

Comment (by A. Sinan Unur <sinan@…>):

 This is the final update, actually: while verifying the intermediate
 calculations by hand, I noticed that the body of the paper lists the first
 five observations as

     0.02, 0.5, 0.74, 3.39, 0.83

 whereas they use

     0.02, 0.15, 0.74, 3.39, 0.83

 to produce Table I.

 With that adjustment, the program:

 {{{
 #include <cstdio>
 #include <utility>
 #include <vector>
 #include <boost/accumulators/accumulators.hpp>
 #include <boost/accumulators/statistics/stats.hpp>
 #include <boost/accumulators/statistics/median.hpp>

 namespace bacc = boost::accumulators;

 int main()
 {
     bacc::accumulator_set<double,
         bacc::stats<bacc::tag::median(bacc::with_p_square_quantile)> > acc;

     // See http://www.cse.wustl.edu/~jain/papers/psqr.htm

     // First five observations
     acc(0.02);
     acc(0.15);
     acc(0.74);
     acc(3.39);
     acc(0.83);

     const std::vector<std::pair<double, double> > jain_chlamtac {
         {22.37, 0.74},
         {10.15, 0.74},
         {15.43, 2.18},
         {38.62, 4.75},
         {15.92, 4.75},
         {34.60, 9.28},
         {10.28, 9.28},
         {1.47, 9.28},
         {0.40, 9.28},
         {0.05, 6.30},
         {11.39, 6.30},
         {0.27, 6.30},
         {0.42, 6.30},
         {0.09, 4.44},
         {11.37, 4.44},
     };

     for (const auto& p : jain_chlamtac)
     {
         acc(p.first);
         std::printf("calculated= %.3f\texpected= %.2f\n",
                     bacc::median(acc), p.second);
     }

     return 0;
 }
 }}}

 produces the output:

 {{{
 calculated= 0.740 expected= 0.74
 calculated= 0.740 expected= 0.74
 calculated= 2.178 expected= 2.18
 calculated= 4.753 expected= 4.75
 calculated= 4.753 expected= 4.75
 calculated= 9.275 expected= 9.28
 calculated= 9.275 expected= 9.28
 calculated= 9.275 expected= 9.28
 calculated= 9.275 expected= 9.28
 calculated= 6.297 expected= 6.30
 calculated= 6.297 expected= 6.30
 calculated= 6.297 expected= 6.30
 calculated= 6.297 expected= 6.30
 calculated= 4.441 expected= 4.44
 calculated= 4.441 expected= 4.44
 }}}

 which agrees with the expected values from Table I to within 0.01 at every
 step, and that is definitely good enough.

 I sent an email to Dr. Jain pointing out the discrepancy between the text
 of the paper and the numbers used to produce Table I.

 This removes my concern that there was something subtly wrong with the
 implementation in `p_square_quantile.hpp` (I did re-write the key sections
 several times trying to figure out what it might be) and establishes that
 the implementation can replicate the output in the original paper.
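
 As an independent cross-check, the same replication can be sketched without
 Boost. The block below is a minimal standalone version of the five-marker
 P² median estimator, written from the algorithm as described in the Jain
 and Chlamtac paper. It is not the code in `p_square_quantile.hpp`, and the
 names `P2Median` and `max_gap_vs_table` are hypothetical, introduced only
 for this illustration. The data are the 20 observations and expected
 medians used in the program above.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Standalone sketch of the P^2 median estimator (five markers, p = 0.5).
// NOT Boost's implementation; all names here are hypothetical.
class P2Median {
public:
    void add(double x) {
        if (count_ < 5) {                    // buffer the first five observations
            q_[count_++] = x;
            if (count_ == 5) std::sort(q_.begin(), q_.end());
            return;
        }
        ++count_;
        // Find cell k with q[k] <= x < q[k+1]; extend the extreme markers if needed.
        std::size_t k = 0;
        if (x < q_[0]) { q_[0] = x; }
        else if (x >= q_[4]) { q_[4] = x; k = 3; }
        else { while (x >= q_[k + 1]) ++k; }

        for (std::size_t i = k + 1; i < 5; ++i) n_[i] += 1.0;  // actual positions
        for (std::size_t i = 0; i < 5; ++i) np_[i] += dn_[i];  // desired positions

        // Move each interior marker by one position if it is off by one or more.
        for (std::size_t i = 1; i <= 3; ++i) {
            const double d = np_[i] - n_[i];
            if ((d >= 1.0 && n_[i + 1] - n_[i] > 1.0) ||
                (d <= -1.0 && n_[i - 1] - n_[i] < -1.0)) {
                const double s = d > 0 ? 1.0 : -1.0;
                double qn = parabolic(i, s);
                // Fall back to linear interpolation if the parabola overshoots.
                if (!(q_[i - 1] < qn && qn < q_[i + 1])) qn = linear(i, s);
                q_[i] = qn;
                n_[i] += s;
            }
        }
    }

    // Current estimate: the height of the middle marker.
    double median() const {
        assert(count_ >= 5);                 // only defined after five observations
        return q_[2];
    }

private:
    double parabolic(std::size_t i, double s) const {
        return q_[i] + s / (n_[i + 1] - n_[i - 1]) *
               ((n_[i] - n_[i - 1] + s) * (q_[i + 1] - q_[i]) / (n_[i + 1] - n_[i]) +
                (n_[i + 1] - n_[i] - s) * (q_[i] - q_[i - 1]) / (n_[i] - n_[i - 1]));
    }
    double linear(std::size_t i, double s) const {
        const std::size_t j = s > 0 ? i + 1 : i - 1;
        return q_[i] + s * (q_[j] - q_[i]) / (n_[j] - n_[i]);
    }

    std::size_t count_ = 0;
    std::array<double, 5> q_{};                                // marker heights
    std::array<double, 5> n_{{1.0, 2.0, 3.0, 4.0, 5.0}};       // actual positions
    std::array<double, 5> np_{{1.0, 2.0, 3.0, 4.0, 5.0}};      // desired positions, p = 0.5
    std::array<double, 5> dn_{{0.0, 0.25, 0.5, 0.75, 1.0}};    // desired-position increments
};

// Largest absolute gap between this sketch and the expected medians
// listed in the ticket (observations 6 through 20).
double max_gap_vs_table() {
    const double first5[] = {0.02, 0.15, 0.74, 3.39, 0.83};
    const double rest[][2] = {
        {22.37, 0.74}, {10.15, 0.74}, {15.43, 2.18}, {38.62, 4.75}, {15.92, 4.75},
        {34.60, 9.28}, {10.28, 9.28}, {1.47, 9.28},  {0.40, 9.28},  {0.05, 6.30},
        {11.39, 6.30}, {0.27, 6.30},  {0.42, 6.30},  {0.09, 4.44},  {11.37, 4.44},
    };
    P2Median est;
    for (double x : first5) est.add(x);
    double gap = 0.0;
    for (const auto& row : rest) {
        est.add(row[0]);
        gap = std::max(gap, std::fabs(est.median() - row[1]));
    }
    return gap;
}
```

 Calling `max_gap_vs_table()` from a small `main` (C++11 or later) and
 checking that the result stays below 0.01 gives a tolerance-based
 comparison against the paper's two-decimal values, which is one possible
 shape for a stricter test.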

 I still think the test needs to be improved, but now I feel free to focus
 on that.

 I will submit a pull request as soon as I have some time to put it
 together.

 HTH,

 -- Sinan

-- 
Ticket URL: <https://svn.boost.org/trac/boost/ticket/12688#comment:2>
Boost C++ Libraries <http://www.boost.org/>
Boost provides free peer-reviewed portable C++ source libraries.