Subject: Re: [boost] math statistical distribution: multivariate gaussian
From: Thijs van den Berg (thijs_at_[hidden])
Date: 2008-11-19 16:06:49
Stjepan Rajko wrote:
> 2008/11/19 Thijs van den Berg <thijs_at_[hidden]>:
>> I see a couple of things that we could start working on, perhaps you have
>> other/additional ideas!
>> * define a name for the sandbox folder & create it.
> How about sandbox/multivariate_distributions ?
> If this is geared towards Boost.Math, we should probably consider that
> library's directory structure. Currently, the statistical
> distributions files appear to be placed as follows:
> include files: ..../boost/include/math/distributions
> docs: ..../libs/math/distributions/doc/sf_and_dist
> tests: ..../libs/math/distributions/test
> examples: ..../libs/math/distributions/example
> I'm not sure whether we should re-use all of those directories or
> change some (or all) of them.
I agree. Let's keep it simple to start with (we are in the sandbox, after
all) and put the docs & code directly in sandbox/multivariate_distributions.
We can always split things up once we have a start.
>> * start a doc in that folder where we collect the details: interfaces,
>> functions, equations, algorithms.
>> Can that be done in LaTeX & compiled to PDF? What would be a good doc format?
> I would recommend quickbook, like John suggested. I can set up the
> basic files for a starting docs build once we decide on the directory
I like quickbook too, and I have almost finished installing the generation
tools. It would be great if you could set up the basic files!
>> These first two are probably the best way to start... John Maddock suggested
>> starting with docs, I agree with that, that should be covered with these
>> first two points! More things that we will need to do are:
>> * define a list of generic functions for generic multivariate densities (non
>> member properties) along the lines of this:
> I'd suggest following John's suggestion in starting with the subset of
> that list that applies to multivariate distributions. If we start
> adding things and this ends up in Boost.Math, then the same things we
> add for multivariate distributions should also probably be added for
> the univariate distributions (if they apply) for consistency.
very good, but there will indeed be some things that only apply to
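To make the non-member style concrete, here is a rough sketch of how a
multivariate density could be queried the same way the univariate ones are,
i.e. through free functions such as pdf(dist, x) and mean(dist). The type
bivariate_normal_sketch and its members are made up for illustration; nothing
here is an existing or proposed Boost.Math interface.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical type, for illustration only (not part of Boost.Math).
struct bivariate_normal_sketch {
    std::array<double, 2> mu;  // mean vector
    double s1, s2;             // standard deviations, both > 0
    double rho;                // correlation, -1 < rho < 1
};

// Non-member accessor, mirroring the univariate mean(dist) style.
inline std::array<double, 2> mean(const bivariate_normal_sketch& d)
{
    return d.mu;
}

// Non-member density, mirroring the univariate pdf(dist, x) style.
inline double pdf(const bivariate_normal_sketch& d,
                  const std::array<double, 2>& x)
{
    assert(d.s1 > 0 && d.s2 > 0 && std::fabs(d.rho) < 1);
    const double pi = 3.14159265358979323846;
    // Standardised coordinates.
    const double z1 = (x[0] - d.mu[0]) / d.s1;
    const double z2 = (x[1] - d.mu[1]) / d.s2;
    const double one_minus_rho2 = 1.0 - d.rho * d.rho;
    // Quadratic form (x - mu)^T Sigma^{-1} (x - mu) for the 2x2 case.
    const double q = (z1 * z1 - 2.0 * d.rho * z1 * z2 + z2 * z2)
                     / one_minus_rho2;
    return std::exp(-0.5 * q)
           / (2.0 * pi * d.s1 * d.s2 * std::sqrt(one_minus_rho2));
}
```

The point of the sketch is only that generic code could call pdf(d, x)
without knowing whether d is univariate or multivariate.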
>> some things that "I" need to implement -as a user- for some other project
>> can be seen in this list
>> ..and there are many more things being used related to multivariate
>> Gaussians. E.g. a lot of machine learning projects work with multivariate
>> Gaussians (they need parameter estimation from data). Some of these things
>> might be too specific to add to boost distributions, and could fill up a
>> whole "Gaussian lib" in itself! I don't know.
> I think parameter estimation from data would be a very useful thing to
> add, but if we do we should keep all distributions in mind.
Yes, and that is a big extension! For consistency, this would imply that
all current univariate distributions would also need parameter estimation.
Another feature used a lot (in code) is drawing random samples from
distributions. Another possibility I see is to keep some of those things
outside this scope, and put them somewhere else.
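As a rough idea of what parameter estimation for a multivariate Gaussian
involves, the sketch below computes the sample mean and the unbiased sample
covariance from a set of observations. The names estimate_gaussian and
gaussian_estimate_sketch, and the use of nested std::vector as the matrix
type, are assumptions made for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<double> vec;
typedef std::vector<vec> mat;  // row-major, one inner vector per row

// Hypothetical result type, for illustration only.
struct gaussian_estimate_sketch {
    vec mean;
    mat covariance;
};

// Sample mean and unbiased (n - 1) sample covariance of the observations.
inline gaussian_estimate_sketch estimate_gaussian(const std::vector<vec>& samples)
{
    assert(samples.size() >= 2);  // n - 1 must be positive
    const std::size_t n = samples.size();
    const std::size_t d = samples[0].size();

    gaussian_estimate_sketch e;
    e.mean.assign(d, 0.0);
    e.covariance.assign(d, vec(d, 0.0));

    // First pass: accumulate the mean.
    for (std::size_t k = 0; k < n; ++k) {
        assert(samples[k].size() == d);
        for (std::size_t i = 0; i < d; ++i)
            e.mean[i] += samples[k][i];
    }
    for (std::size_t i = 0; i < d; ++i)
        e.mean[i] /= static_cast<double>(n);

    // Second pass: accumulate outer products of the deviations.
    for (std::size_t k = 0; k < n; ++k)
        for (std::size_t i = 0; i < d; ++i)
            for (std::size_t j = 0; j < d; ++j)
                e.covariance[i][j] += (samples[k][i] - e.mean[i])
                                    * (samples[k][j] - e.mean[j]);
    for (std::size_t i = 0; i < d; ++i)
        for (std::size_t j = 0; j < d; ++j)
            e.covariance[i][j] /= static_cast<double>(n - 1);

    return e;
}
```

Whether something like this belongs with the distributions or in a separate
place is exactly the open question; the sketch only shows the size of the task.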
>> We might also look at other mathematic packages like Matlab, R, Octave to
>> see what they do with multivariate distributions.
>> * using that list, we will see what type of matrix operators we will need,
>> and that will allow us to think about either making a dependency on
>> ublas & others, or keeping it free of external dependencies & implement it
> I think the concept-based way is the way to go. We can let the user
> provide the matrix type, as long as it provides the operations we
> need. Maybe we can use ublas matrices as the default type if it is
> sufficient, since that is already in boost (and header-only), and
> maybe test with some other libraries just to make sure we're not
> requiring syntax that is too ublas-specific.
Yes I agree!
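A minimal sketch of what "let the user provide the matrix type" could mean in
code: the algorithm is a template that only assumes size1(), size2() and
element access via a(i, j), which are the member names ublas::matrix exposes.
The tiny_matrix class and the quadratic_form function are hypothetical
stand-ins so that the example compiles without any external library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal matrix type, for illustration only. It offers the
// same three operations the algorithm below relies on.
class tiny_matrix {
public:
    tiny_matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}
    std::size_t size1() const { return rows_; }  // number of rows
    std::size_t size2() const { return cols_; }  // number of columns
    double& operator()(std::size_t i, std::size_t j) { return data_[i * cols_ + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data_[i * cols_ + j]; }
private:
    std::size_t rows_, cols_;
    std::vector<double> data_;
};

// Computes x^T A x for any Matrix providing size1(), size2() and a(i, j),
// and any Vector providing size() and x[i].
template <class Matrix, class Vector>
double quadratic_form(const Matrix& a, const Vector& x)
{
    assert(a.size1() == a.size2());
    assert(a.size1() == x.size());
    double result = 0.0;
    for (std::size_t i = 0; i < a.size1(); ++i)
        for (std::size_t j = 0; j < a.size2(); ++j)
            result += x[i] * a(i, j) * x[j];
    return result;
}
```

Testing the same template against a second matrix library would show quickly
whether the required syntax is too ublas-specific.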