
From: Johan Råde (rade_at_[hidden])
Date: 2008-04-24 12:49:46


John Maddock wrote:
> Johan Råde wrote:
>> A typical data mining scenario might be to calculate the cdf of the
>> t- or F-distribution for each value in an array of, say, 100,000
>> single or double precision floating point numbers.
>> (I tend to use double precision.)
>> Anything that could speed up that task would be interesting.
>
> Nod, the question is which combinations of arguments actually get
> passed to the incomplete beta: if the data isn't unduly sensitive, it
> would be really useful to have a log of those values, so we can see
> which parts of the implementation are getting hammered the most.

In data mining applications, most of the variables usually satisfy the null hypothesis;
in other words, their distribution is the t- or F-distribution at hand.
So you can generate realistic test data by starting with an array of uniform [0,1]
random numbers and applying the inverse of the cdf to each value.

Concerning the degrees of freedom, in the problems we analyze:
For the t-distribution: 10 - 1000, typically around 100.
For the F-distribution: the first number of degrees of freedom is 2 - 10;
the second is 10 - 1000, typically around 100.
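
For what it's worth, here is a minimal sketch of that recipe using Boost.Math's
students_t; the seed, the array size, and the choice of 100 degrees of freedom
are just illustrative assumptions (the F case would use fisher_f instead):

#include <boost/math/distributions/students_t.hpp>
#include <random>
#include <vector>

int main()
{
    // ~100 degrees of freedom, typical for the t-distribution problems above
    boost::math::students_t dist(100);

    std::mt19937 gen(42);                               // arbitrary seed
    std::uniform_real_distribution<double> u01(0.0, 1.0);

    // start from uniform [0,1] draws, then apply the inverse cdf
    std::vector<double> data(100000);
    for (double& x : data)
        x = quantile(dist, u01(gen));                   // found via ADL

    // running cdf(dist, x) over this array now exercises the incomplete
    // beta with realistic argument combinations
    return 0;
}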

HTH
Johan Råde

