Subject: Re: [boost] math statistical distribution: multivariate gaussian
From: Thijs van den Berg (thijs_at_[hidden])
Date: 2008-11-19 13:17:37


Stjepan Rajko wrote:
> 2008/11/19 Thijs van den Berg <thijs_at_[hidden]>:
>
>> Hi,
>>
>> I'm about to start implementing a library for the multivariate Gaussian
>> distribution. The goal is to build a lib that provides a container for the
>> definition of a multivariate Gaussian, as well as implementing various
>> identities of that distribution. The identities are things like mean and
>> variance, but also conditional distributions, density gradient vectors,
>> linear combinations of multivariate Gaussians, the Gaussian chain rule, etc.
>>
>> I've seen a request for something like this in the
>> boost:math/statistical_distributions todo list, and was wondering if it's
>> possible to team up, or (if nobody is working on it) to initiate development.
>>
>>
>
> I would find a boost implementation of a multivariate Gaussian
> distribution very useful. I'm interested in helping.
>
> Best,
>
> Stjepan
>
>
Hi Stjepan,
Great that you want to help!

I see a couple of things that we could start working on; perhaps you
have other or additional ideas!

* define a name for the sandbox folder & create it.

* start a doc in that folder where we collect the details: interfaces,
functions, equations, algorithms.
Can that be done in LaTeX and compiled to PDF? What would be a good doc format?

----
These first two are probably the best way to start. John Maddock
suggested starting with docs, and I agree; these first two points should
cover that. More things that we will need to do are:
* define a list of generic functions for generic multivariate densities
(non-member properties) along the lines of this:
http://www.boost.org/doc/libs/1_37_0/libs/math/doc/sf_and_dist/html/math_toolkit/dist/dist_ref/nmp.html
Some things that I need to implement, as a user, for some other
project can be seen in this list:
http://www.cs.toronto.edu/~roweis/notes/gaussid.pdf
...and there are many more uses of multivariate Gaussians. E.g. a lot of
machine learning projects work with multivariate Gaussians and need
parameter estimation from data. Some of these things might be too
specific to add to the boost distributions, and could fill up a whole
"Gaussian lib" in themselves! I don't know.
We might also look at other mathematical packages like Matlab, R and Octave
to see what they do with multivariate distributions.
* using that list, we will see what type of matrix operators we will
need, and that will allow us to decide between adding a dependency on
ublas and others, or keeping it free of external dependencies and
implementing the operators ourselves.
* We should also collect info on smart numerical solutions for specific
multivariate Gaussians. E.g. I know of good 2D and 3D approximations of
the cdf, but for higher dimensions we need numerical
approximations. We can collect information on what to do in those
cases in the doc.
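
To make the non-member idea from the list above a bit more concrete, here
is a minimal sketch, assuming a fixed two-dimensional case, plain double
and no ublas dependency. All names in it (bivariate_normal, covariance,
condition_on_second, conditional_moments) are hypothetical and not part of
Boost.Math; only the calling style mean(dist) / pdf(dist, x) follows the
existing univariate non-member functions. It also shows one identity from
the gaussid list, the conditional of one component given the other.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical sketch: none of these names exist in Boost.Math.
// Fixed to two dimensions and double so that no matrix library is needed.
class bivariate_normal {
public:
    using vector_type = std::array<double, 2>;
    using matrix_type = std::array<std::array<double, 2>, 2>;

    bivariate_normal(const vector_type& mu, const matrix_type& sigma)
        : mu_(mu), sigma_(sigma) {
        // The covariance matrix must be symmetric and positive definite.
        assert(sigma_[0][1] == sigma_[1][0]);
        assert(sigma_[0][0] > 0.0 && determinant() > 0.0);
    }

    const vector_type& location() const { return mu_; }
    const matrix_type& scale() const { return sigma_; }

    double determinant() const {
        return sigma_[0][0] * sigma_[1][1] - sigma_[0][1] * sigma_[1][0];
    }

private:
    vector_type mu_;
    matrix_type sigma_;
};

// Non-member accessors, in the style of the univariate mean(dist).
inline bivariate_normal::vector_type mean(const bivariate_normal& d) {
    return d.location();
}

inline bivariate_normal::matrix_type covariance(const bivariate_normal& d) {
    return d.scale();
}

// Density at x: exp(-q/2) / (2*pi*sqrt(det)),
// with q = (x - mu)^T Sigma^{-1} (x - mu).
inline double pdf(const bivariate_normal& d,
                  const bivariate_normal::vector_type& x) {
    const auto& s = d.scale();
    const double dx0 = x[0] - d.location()[0];
    const double dx1 = x[1] - d.location()[1];
    const double det = d.determinant();
    // Closed-form 2x2 inverse applied to (dx0, dx1).
    const double q = (s[1][1] * dx0 * dx0 - 2.0 * s[0][1] * dx0 * dx1
                      + s[0][0] * dx1 * dx1) / det;
    const double pi = std::acos(-1.0);
    return std::exp(-0.5 * q) / (2.0 * pi * std::sqrt(det));
}

// Mean and variance of the univariate Gaussian law of x0 given x1 = x1_obs.
struct conditional_moments {
    double mean;
    double variance;
};

inline conditional_moments condition_on_second(const bivariate_normal& d,
                                               double x1_obs) {
    const auto& s = d.scale();
    const double gain = s[0][1] / s[1][1];
    return { d.location()[0] + gain * (x1_obs - d.location()[1]),
             s[0][0] - gain * s[0][1] };
}
```

With zero mean and identity covariance, pdf(d, mean(d)) evaluates to
1/(2*pi). The general n-dimensional case would need a Cholesky or similar
factorisation in place of the closed-form 2x2 inverse, which is exactly
where the ublas question comes in.
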
Cheers,
Thijs

