Subject: Re: [boost] [Autosave] Re: [math][accumulators] Empirical distribution function
From: er (er.ci.2020_at_[hidden])
Date: 2011-08-08 16:37:42


On 8/8/11 7:06 PM, Simon West wrote:
> Hi,
>
> On Sun, June 19, 2011 22:36, er wrote:
>> Hope this can serve as a basis for a conversation:
>>
>>
>> https://svn.boost.org/svn/boost/sandbox/acc_ecdf/
>>
> I'm assisting Eric with the maintenance of Accumulators. I've had a look
> through the code at the above link, and would like to offer the following
> comments (if I have misunderstood anything, please let me know).

Thanks for following up.
> My basic concern with the code is that a map is used to store the counts of
> data-points that have been added (the map keys are the data-points, the
> map values are the counts). In real-world floating point data it is rare
> for two data-points to be exactly the same, so in practice the map would
> have a single key-value pair for each data-point q_i, of the form
> (key=q_i,value=1). This is inefficient, because all the mapped values will
> be 1. Also, the memory usage will grow linearly with the number of
> data-points accumulated, which doesn't seem to be in keeping with the
> spirit of the Accumulators library.
>
> For these reasons, I'm not convinced that the code should be added to the
> library in its current state.
Thanks. It was only meant as, I quote, a "basis for a conversation",
posted at the request of a user (Denis Arnaud), and it failed to go
anywhere at the time. Yes, I realize this is not in the spirit of
Accumulators, which is to compute statistics iteratively, such that
memory usage is fixed given the number of features. The map-based
approach would therefore only be suitable for a distribution whose
domain is finite. I had some thought of merging this idea with another
one that I tried a while back (look for "chi-square table" in
boost.users), though in hindsight I'd do that one a bit differently.
So for now, not much to add, and thanks.
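
For readers following the thread, here is a rough illustration of the trade-off
discussed above. It is a hypothetical sketch, not the code in the sandbox:
`ecdf_counter` and its member names are invented for this example, and it uses
only `std::map` rather than the Accumulators framework. It keeps one map entry
per distinct value, so storage is bounded for a finite domain but grows with
the sample count for typical floating-point data.

```cpp
#include <cstddef>
#include <map>

// Hypothetical illustration only; not the sandbox acc_ecdf code.
// Stores a count per distinct data-point (key = value, mapped = count).
template <typename T>
class ecdf_counter {
public:
    // Record one observation.
    void add(const T& x) {
        ++counts_[x];
        ++n_;
    }

    // Empirical CDF: fraction of observations <= x (0 if none recorded).
    double cdf(const T& x) const {
        if (n_ == 0) return 0.0;
        std::size_t at_or_below = 0;
        // Keys are visited in ascending order; stop at the first key > x.
        for (auto it = counts_.begin();
             it != counts_.end() && !(x < it->first); ++it) {
            at_or_below += it->second;
        }
        return static_cast<double>(at_or_below) / static_cast<double>(n_);
    }

    // Number of map entries, i.e. the memory footprint in key-value pairs.
    std::size_t distinct() const { return counts_.size(); }

    // Total number of observations recorded.
    std::size_t size() const { return n_; }

private:
    std::map<T, std::size_t> counts_;
    std::size_t n_ = 0;
};
```

With an integer domain such as die rolls, `distinct()` can never exceed six no
matter how many samples are added. With `double` samples, nearly every
observation tends to create its own entry with a count of 1, which is the
linear memory growth Simon describes.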

> Simon.
>

