From: Gennadiy Rozental (gennadiy.rozental_at_[hidden])
Date: 2007-02-16 20:29:11


"Eric Niebler" <eric_at_[hidden]> wrote in message
news:45D611ED.1050300_at_boost-consulting.com...
>
>
> Gennadiy Rozental wrote:
>> Hi,
>>
>> I did not have a chance to participate in the review because I was busy
>> at the time (as is usual lately). I did like the submission when it
>> originally appeared. And now, after reading the docs some more, I can only
>> congratulate Eric on yet another good addition to Boost.
>
>
> Thanks!
>
>
>> Being the author of a similar (though obviously not as powerful)
>> component myself, I find two features omitted. I must admit that I may
>> easily have missed something. But here we go.
>>
>> 1. accumulator composition
>>
>> It would be a pity if I needed to write a new accumulator any time I need
>> to combine features implemented by two existing ones. For example:
>>
>> min/max average value
>> min/max change rate
>
>
> Sorry, I don't understand. A data series has one average. What is the
> max average?

You seem to be looking at this from the perspective of a physical experiment.
I mostly deal with real-time data. I track samples of a particular variable
over a period of time. Both the variable itself and all statistics derived
from its samples are functions of time. The value of a statistic (the average,
in the example above) is itself a derived variable, to which I can apply
another statistic (in the example above, I want to track its maximum and
minimum). This is what I mean by accumulator composition. It is very similar
to function composition: Average(v) is a function of v, Max(x) is a function
of x, and Max(Average(v)) is their composition.
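
To make the composition idea concrete, here is a minimal sketch in plain C++. It does not use the library's API; RunningAverage, RunningMax and Composed are names I made up for illustration, and I assume samples are plain doubles.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical running-average statistic: Average(v).
class RunningAverage {
public:
    void add(double sample) {
        sum_ += sample;
        ++count_;
    }
    double value() const {
        assert(count_ > 0 && "no samples yet");
        return sum_ / static_cast<double>(count_);
    }
private:
    double sum_ = 0.0;
    std::size_t count_ = 0;
};

// Hypothetical running-maximum statistic: Max(x).
class RunningMax {
public:
    void add(double sample) { max_ = std::max(max_, sample); }
    double value() const { return max_; }
private:
    double max_ = -std::numeric_limits<double>::infinity();
};

// Composition Outer(Inner(v)): each new sample updates Inner, and the
// resulting value of Inner is then fed to Outer as a derived sample.
template <class Outer, class Inner>
class Composed {
public:
    void add(double sample) {
        inner_.add(sample);
        outer_.add(inner_.value());
    }
    double value() const { return outer_.value(); }
    const Inner& inner() const { return inner_; }
private:
    Inner inner_;
    Outer outer_;
};
```

With Composed<RunningMax, RunningAverage>, feeding the samples 10, 20, 0 produces the averages 10, 15, 10, so the composed value (the maximum average) is 15 while the current average is 10.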

>> 2. timing policy.
>> In many of my real-life projects that require some statistical value, it
>> almost always needs to be combined with the time of the event. For example:
>>
>> When did a particular value reach its maximum?
>> When did a particular value reach its minimum?
>> When was a particular value last changed?
>>
>> Or, more specifically:
>>
>> When did the 10-second throughput average reach its maximum?
>>
>> A similar idea could be applied to any statistic that models some extreme
>> value. IMO the framework should support this in the form of a timing policy
>> somewhere.
>
>
> You could implement this with accumulators either using covariate data,
> where the times are covariate with the samples, or by using a std::pair<
> sample, time > as the value type of the accumulator, and defining an
> appropriate sort criterion.

I am not an expert in your library, so could you present a working example?
One note, though: time is different from weight. A weight has to be supplied
along with each value, whereas an accumulator should be able to deduce the
current time by itself, without the user's involvement (based on some timing
policy). Unless I misunderstand things, it should be enough for you to support
a policy-based last_update_time accumulator, plus the ability to compose
accumulators that I refer to in item 1, to cover all timing needs.
last_update_time keeps track of the last time a sample was added to the set.
To get the time at which a variable reached its maximum, I would compose the
last_update_time and max accumulators.
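
Roughly, what I have in mind could be sketched as follows. This is plain C++ and not the library's API; SteadyClockPolicy, ManualClockPolicy, LastUpdateTime and TimedMax are hypothetical names, and I assume a timing policy is any type with a nested time_point and a static now().

```cpp
#include <cassert>
#include <chrono>

// Hypothetical timing policy backed by the standard monotonic clock.
struct SteadyClockPolicy {
    using time_point = std::chrono::steady_clock::time_point;
    static time_point now() { return std::chrono::steady_clock::now(); }
};

// Hypothetical deterministic policy: "time" is a counter set by the caller.
struct ManualClockPolicy {
    using time_point = long;
    static long& current() { static long t = 0; return t; }
    static time_point now() { return current(); }
};

// Remembers when it was last touched; the time comes from the policy,
// not from the user.
template <class TimingPolicy = SteadyClockPolicy>
class LastUpdateTime {
public:
    using time_point = typename TimingPolicy::time_point;
    void touch() { time_ = TimingPolicy::now(); }
    time_point value() const { return time_; }
private:
    time_point time_{};
};

// Max composed with LastUpdateTime: the timestamp is refreshed only when
// a sample becomes the new maximum.
template <class TimingPolicy = SteadyClockPolicy>
class TimedMax {
public:
    using time_point = typename TimingPolicy::time_point;
    void add(double sample) {
        if (!seen_ || sample > max_) {
            max_ = sample;
            seen_ = true;
            when_.touch();
        }
    }
    double value() const {
        assert(seen_ && "no samples yet");
        return max_;
    }
    time_point time() const {
        assert(seen_ && "no samples yet");
        return when_.value();
    }
private:
    double max_ = 0.0;
    bool seen_ = false;
    LastUpdateTime<TimingPolicy> when_;
};
```

The user only ever calls add(sample); the time is collected inside the accumulator. Swapping the policy (for instance the manual one above) changes where the time comes from without touching the calling code.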

>> Another general comment. I personally would find an interface oriented
>> around a single changing variable more convenient and more widely
>> applicable (as opposed to the sample set). A variable could change in many
>> ways (not only addition or subtraction, and even those could be done more
>> conveniently with operator overloading). Essentially, what I am looking
>> for is something like this:
>>
>> tracked_var<....> v;
>>
>> v += 10;
>>
>> int i = v +1;
>>
>> v -= 5;
>>
>> v *= 2;
>>
>> cout << average( v );
>> cout << max( v );
>> cout << min( v );
>>
>> cout << max( average( v ) ) << " @" << min( average( v ) ).time();
>
>
> Interesting. Each mutating operation on v is considered a new sample?
> This is a less powerful interface (no way to express covariate data;

Not necessarily.

v += make_pair(5,2)

should work. On the other hand, this interface lets you refer to the "current
value" of the variable, which simplifies operations performed on it.

> eg., where is the time of each sample specified?),

No. I expect time to be collected automatically.

> but might be cleaner
> for some applications. It would be pretty simple to implement such an
> interface on top of accumulator_set.

I would like to see this as part of your framework.
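
For illustration only, a stripped-down version of such a variable might look like the sketch below. It is plain C++ rather than something built on accumulator_set; TrackedVar is a hypothetical name, and I assume that the initial value is not counted as a sample and that reading the value never creates one.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical tracked variable: every mutating operation records the
// resulting value as a new sample.
template <class T>
class TrackedVar {
public:
    explicit TrackedVar(T initial = T()) : value_(initial) {}

    TrackedVar& operator+=(const T& rhs) { return record(value_ + rhs); }
    TrackedVar& operator-=(const T& rhs) { return record(value_ - rhs); }
    TrackedVar& operator*=(const T& rhs) { return record(value_ * rhs); }

    // Reading the current value does not create a sample.
    operator T() const { return value_; }

    std::size_t count() const { return count_; }
    T min() const { assert(count_ > 0); return min_; }
    T max() const { assert(count_ > 0); return max_; }
    double average() const {
        assert(count_ > 0);
        return sum_ / static_cast<double>(count_);
    }

private:
    // Store the new value and update the running statistics.
    TrackedVar& record(const T& v) {
        value_ = v;
        if (count_ == 0) {
            min_ = max_ = v;
        } else {
            min_ = std::min(min_, v);
            max_ = std::max(max_, v);
        }
        sum_ += static_cast<double>(v);
        ++count_;
        return *this;
    }

    T value_;
    T min_{};
    T max_{};
    double sum_ = 0.0;
    std::size_t count_ = 0;
};
```

With TrackedVar<int> v, the sequence v += 10; int i = v + 1; v -= 5; v *= 2; records the samples 10, 5, 10, while the read in v + 1 records nothing. Time collection is left out here; it would be added through a timing policy.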

Gennadiy

