# Boost :

From: Paul A Bristow (pbristow_at_[hidden])
Date: 2006-07-11 11:02:55

| -----Original Message-----
| From: boost-bounces_at_[hidden]
| [mailto:boost-bounces_at_[hidden]] On Behalf Of Deane Yang
| Sent: 11 July 2006 15:11
| To: boost_at_[hidden]
| Subject: Re: [boost] [math/staticstics/design] How best to
| name statistical functions?
|
| So let's use the Students T distribution as an example. The
| Students T
| distribution is a *family* of 1-dimensional distributions
| that depend on a single parameter, called "degrees of freedom".

Does the word *family* imply integral degrees of freedom?
Numerically, and perhaps conceptually, it isn't integral - it's a continuous real.
So could one also regard it as a two-parameter function f(t, v)?
However I don't think this matters here.

| Given a value, say, D,
| for the degrees of freedom, you get a density function p_D and
| integrating it gives you the cumulative density function P_D.
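
For reference, the standard textbook definitions behind p_D and P_D can be written out as below. The formula holds for any real D > 0, not only for integers, which is why the two-parameter reading f(t, D) also works. The symbols t and s for the random variable are chosen here for illustration.

```latex
p_D(t) = \frac{\Gamma\!\left(\frac{D+1}{2}\right)}
              {\sqrt{D\pi}\;\Gamma\!\left(\frac{D}{2}\right)}
         \left(1 + \frac{t^2}{D}\right)^{-\frac{D+1}{2}},
\qquad
P_D(t) = \int_{-\infty}^{t} p_D(s)\,ds .
```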

| As I mentioned before, these should be member functions,
| which could be called "density" (also called 'mass')

| and "cumulative".

OHOH many books don't mention either of these words!

The whole nomenclature seems a massive muddle,
with mathematicians, statisticians, and users of all sorts using different
terms, and everyone thinking theirs are the 'Standard' :-(

And the highest priority in my book is the END USERS,
not the professionals.

| The cumulative density function is a strictly increasing
| function and
| therefore can be inverted. The inverse function could be called
| "inverse_cumulative", which is a completely unambiguous name.

But excessively long :-(
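
To illustrate why a strictly increasing cumulative function can always be inverted, here is a minimal sketch using plain bisection. The helper names `normal_cumulative` and `inverse_cumulative` are hypothetical, the standard normal distribution is used only because its cumulative function is available through `std::erfc`, and bisection is just the simplest method that works, not a proposal for the library.

```cpp
#include <cmath>
#include <functional>

// Standard normal cumulative function, written with std::erfc.
inline double normal_cumulative(double x) {
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}

// Hypothetical helper: invert a continuous, strictly increasing cumulative
// function by bisection. Assumes 0 < probability < 1.
inline double inverse_cumulative(const std::function<double(double)>& cumulative,
                                 double probability) {
    double lo = -1.0, hi = 1.0;
    // Widen the bracket until cumulative(lo) <= probability <= cumulative(hi).
    while (cumulative(lo) > probability) lo *= 2.0;
    while (cumulative(hi) < probability) hi *= 2.0;
    // Halve the bracket repeatedly; 200 steps exhausts double precision.
    for (int i = 0; i < 200; ++i) {
        const double mid = 0.5 * (lo + hi);
        if (cumulative(mid) < probability) {
            lo = mid;
        } else {
            hi = mid;
        }
    }
    return 0.5 * (lo + hi);
}
```

Calling `inverse_cumulative(normal_cumulative, 0.975)` should give roughly 1.96, the familiar two-sided 95% point.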

| I would say that these three member functions should be
| common to all
| implemented distributions. Other common member functions
| might include
| "mean", "variance", and possibly others.

Median, mode, variance, skewness, kurtosis are commonly given, for example:

http://en.wikipedia.org/wiki/Student%27s_t

| Finally, you observe that it is often useful to specify the
| cumulative
| probability for a given value of the random variable and
| solve for the
| parameter (the "degrees of freedom" for a Students T
| distribution) that
| determines the distribution. Since each family of
| distributions depends
| on a different set of parameters (for example, normal distributions
| depend on two parameters, the mean and variance), the
| interface for this is trickier to define.

| I can think of two possibilities (I prefer the first):
|
| 1) Define ad hoc inverse functions for each specific
| distribution. So
| for the Students T distribution, you would define a member
| function of the form:
|
| double degrees_of_freedom(double cumulative_probability,
|                           double random_variable) const;

I don't like 2 either, so I have snipped it ;-)

This seems OK to me.

I'd be grateful if you could sketch out how you think the whole Student's t
class would look (just for double and omit the equations of course).
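
As one possible reading of the member-function interface discussed in this thread, a minimal sketch for double might look like the following. Everything here is an assumption for illustration: the class name `students_t`, the accessor, and the choice of Simpson's rule and bisection as stand-ins for the incomplete beta function and its inverse that a real implementation would presumably use. The solver for degrees of freedom is left out.

```cpp
#include <cmath>
#include <limits>

// Hypothetical sketch only; none of these names are an agreed interface.
class students_t {
public:
    explicit students_t(double degrees_of_freedom) : v_(degrees_of_freedom) {}

    double degrees_of_freedom() const { return v_; }

    // Probability density at t, computed in log space to avoid overflow.
    double density(double t) const {
        const double pi = 3.141592653589793;
        const double log_norm = std::lgamma(0.5 * (v_ + 1.0))
                              - std::lgamma(0.5 * v_)
                              - 0.5 * std::log(v_ * pi);
        return std::exp(log_norm - 0.5 * (v_ + 1.0) * std::log1p(t * t / v_));
    }

    // Cumulative probability: 1/2 plus the signed integral of the density
    // from 0 to t, approximated by composite Simpson's rule. Adequate for
    // moderate |t| only; accuracy degrades far out in the tails.
    double cumulative(double t) const {
        const int n = 2000;  // even number of subintervals
        const double h = t / n;
        double sum = density(0.0) + density(t);
        for (int i = 1; i < n; ++i) {
            sum += (i % 2 ? 4.0 : 2.0) * density(i * h);
        }
        return 0.5 + sum * h / 3.0;
    }

    // Inverse of cumulative() by bisection. Assumes 0 < probability < 1.
    double inverse_cumulative(double probability) const {
        double lo = -1.0, hi = 1.0;
        while (cumulative(lo) > probability) lo *= 2.0;
        while (cumulative(hi) < probability) hi *= 2.0;
        for (int i = 0; i < 100; ++i) {
            const double mid = 0.5 * (lo + hi);
            if (cumulative(mid) < probability) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        return 0.5 * (lo + hi);
    }

    // Mean is 0 for v > 1 and undefined (NaN here) otherwise.
    double mean() const {
        return v_ > 1.0 ? 0.0 : std::numeric_limits<double>::quiet_NaN();
    }

    // Variance is v / (v - 2) for v > 2, infinite for 1 < v <= 2,
    // and undefined (NaN here) otherwise.
    double variance() const {
        if (v_ > 2.0) return v_ / (v_ - 2.0);
        if (v_ > 1.0) return std::numeric_limits<double>::infinity();
        return std::numeric_limits<double>::quiet_NaN();
    }

private:
    double v_;  // degrees of freedom, any real value > 0
};
```

With `students_t d(10.0)`, `d.inverse_cumulative(0.975)` should land close to the tabulated value 2.228.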

However, I am still worried that the whole scheme will lead to much bigger code
compared to a set of named (template) functions
(because code that isn't in fact used will be generated).