Subject: Re: [boost] [math] Empirical distribution function
From: Denis Arnaud (denis.arnaud_boost_at_[hidden])
Date: 2011-06-15 08:47:47


(please disregard my previous message: the subject was not the correct one)

Date: Wed, 15 Jun 2011 06:19:49 -0400, From: er <er.ci.2020_at_[hidden]>

> I developed one as a Boost.Accumulator here:
>
> https://svn.boost.org/svn/boost/sandbox/statistics/non_parametric/boost/statistics/detail/non_parametric/
>
> Although I haven't touched these directories in a while, I'd be
> happy to do some maintenance and add proper docs and a test suite.

> It is the ECDF assuming independent sampling (I think identical
> distribution need not even be assumed) and derived quantities (such as
> the Kolmogorov-Smirnov statistic):
> F(x) = (count of samples <= x) / n
> The fact that it's an accumulator is just a convenience: the
> distribution is updated each time a sample is passed to the acc.
> I'm doing maintenance work right now, whether or not this matches the
> needs of Denis. It shouldn't take too much time (days).

That seems to fit my (pretty simple) need. I'm still trying to test it.
Indeed, I would find it more logical to have an empirical distribution
"function" <http://en.wikipedia.org/wiki/Empirical_distribution_function>
as part of the set of already existing Boost.Math statistical distributions
<http://www.boost.org/doc/libs/1_46_1/libs/math/doc/sf_and_dist/html/math_toolkit/dist.html>.
But an accumulator is just fine for now. By the way, does that mean that, for
a set of 1 million distinct real numbers, the corresponding accumulator
will store one counter per distinct value, i.e. about 1 million counters?
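
To make the question concrete, here is a minimal sketch of what I have in
mind. It is not the sandbox accumulator's code: the class name empirical_cdf
and its members are hypothetical, and I am assuming one counter per distinct
sample value, kept in a std::map.

```cpp
#include <cstddef>
#include <map>

// Hypothetical illustration only; NOT the code in the Boost sandbox.
// Assumption: one counter is stored per distinct sample value.
class empirical_cdf {
public:
    // Record one sample: bump the counter for this exact value.
    void add(double x) {
        ++counts_[x];
        ++n_;
    }

    // Fraction of recorded samples that are <= x (0 if no samples yet).
    double cdf(double x) const {
        if (n_ == 0) return 0.0;
        std::size_t at_or_below = 0;
        // upper_bound(x) is the first key strictly greater than x.
        const auto end = counts_.upper_bound(x);
        for (auto it = counts_.begin(); it != end; ++it) {
            at_or_below += it->second;
        }
        return static_cast<double>(at_or_below) / static_cast<double>(n_);
    }

    // Number of counters held, i.e. number of distinct values seen.
    std::size_t distinct() const { return counts_.size(); }

    // Total number of samples seen.
    std::size_t size() const { return n_; }

private:
    std::map<double, std::size_t> counts_;
    std::size_t n_ = 0;
};
```

With such a layout, 1 million distinct values would give 1 million map
entries, and each cdf(x) call walks all entries up to x. Whether the sandbox
accumulator behaves this way is exactly what I am asking.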

Denis

