
From: Paul A Bristow (pbristow_at_[hidden])
Date: 2006-07-12 05:11:24


| -----Original Message-----
| From: boost-bounces_at_[hidden]
| [mailto:boost-bounces_at_[hidden]] On Behalf Of John Maddock
| Sent: 11 July 2006 17:26
| To: boost_at_[hidden]
| Subject: Re: [boost] [math/staticstics/design] How best to name
| statistical functions?
|
| >> As I mentioned before, these should be member functions,
| >> which could be called "density" (also called 'mass')
|
| Or distribution :-)

This seems quite clear to me - both density and mass sound too physical,
though they are in common use.

What is important is that the documentation gives ALL the other possible
names.

| >> The inverse function could be called "inverse_cumulative"
| > But excessively long :-(
| True, how about "persentile", or is that too ambiguous?

Percentile might be better - it is in the dictionary ;-))

But quantile is a more modern term and doesn't raise any questions about
multiplying or dividing by 100, a source of unnecessary confusion - as we
have found with Boost.Test.

So I'm strongly in favour of quantile.

But I also wonder if 'fraction' is a possible name?
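
To make the naming point concrete, here is a minimal sketch of a member called quantile that takes a probability in [0, 1) rather than a percentage, so no factor of 100 appears anywhere. The class exponential_dist and its member names are hypothetical, not part of any existing Boost interface; the exponential distribution is used only because its inverse has a closed form. It assumes a C++11 <cmath> for expm1 and log1p.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical illustration only: not an existing Boost class.
class exponential_dist
{
public:
    explicit exponential_dist(double lambda) : lambda_(lambda)
    {
        assert(lambda > 0);
    }

    // Cumulative probability P(X <= x).
    double cdf(double x) const
    {
        return x <= 0 ? 0.0 : -std::expm1(-lambda_ * x);
    }

    // Inverse of cdf: takes a probability in [0, 1), never a percentage.
    double quantile(double p) const
    {
        assert(p >= 0 && p < 1);
        return -std::log1p(-p) / lambda_;
    }

private:
    double lambda_;
};
```

With this shape, the median is simply quantile(0.5), and cdf(quantile(p)) returns p up to rounding.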
 
| >> 1) Define ad hoc inverse functions for each specific
| >> distribution. So
| >> for the Students T distribution, you would define a member
| >> function of the form:
| >>
| >> double degrees_of_freedom(double cumulative_probability, double
| >> random_variable) const;
|
| That could be a static member function, since we're solving
| for the degrees of freedom parameter.

OK

| It would also be more natural to me for the
| cumulative_probability parameter to come last in the list.

Why? Quantile is also cumulative?
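
As a rough sketch of the "static member that solves for a parameter" idea, the example below uses an exponential distribution as a stand-in, because solving for the Student's t degrees of freedom needs numerical root finding that would not fit in a few lines. The class name exponential_solver_demo and the member rate are hypothetical. The argument order (cumulative_probability last) follows the quoted suggestion purely for illustration; it is not a settled choice.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for students_t: not an existing Boost class.
class exponential_solver_demo
{
public:
    explicit exponential_solver_demo(double rate) : rate_(rate)
    {
        assert(rate > 0);
    }

    // Cumulative probability P(X <= x) for this object's rate.
    double cdf(double x) const
    {
        return x <= 0 ? 0.0 : -std::expm1(-rate_ * x);
    }

    // Static: solves for the rate parameter itself, so no object
    // (and hence no pre-existing rate) is needed to call it.
    static double rate(double random_variable, double cumulative_probability)
    {
        assert(random_variable > 0);
        assert(cumulative_probability > 0 && cumulative_probability < 1);
        return -std::log1p(-cumulative_probability) / random_variable;
    }

private:
    double rate_;
};
```

A caller would write exponential_solver_demo::rate(3.0, 0.9) to find the rate at which P(X <= 3) is 0.9, and could then construct a distribution object from the result.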

| > But I'm still worried that the whole scheme will lead to much bigger
| > code compared to a set of names of (template) functions
| > (because code that isn't in fact used will be generated).
|
| For template classes member functions are only instantiated
| when used, so if
| you only use one member, then that's the only one instantiated.

Well, that's what I thought - but I wanted expert reassurance before driving
into a dead-end ;-)
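
The instantiate-only-when-used rule can be demonstrated with a small sketch. The template wrapper below is hypothetical. Its member negated() would not compile for std::string, yet wrapper<std::string> is usable as long as nothing calls that member, which shows that the compiler never generates the unused body.

```cpp
#include <cassert>
#include <string>

// Hypothetical illustration only.
template <class T>
struct wrapper
{
    T value;

    // Valid for any copyable T.
    T get() const { return value; }

    // Only well-formed if T supports unary minus. For std::string this body
    // would fail to compile, but only if something actually calls it.
    T negated() const { return -value; }
};
```

Using wrapper<std::string> and calling only get() compiles cleanly; adding a call to negated() on that object turns into a compile error, because only then is the member's definition instantiated.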

So my worry turns into a killer feature - keeping the cost of calling a
single Student's t down to reasonable levels is crucially important.

Compared to linking to an "All_the_stats_functions_you_could_ever_want.dll",
it should be easily 'affordable', as they say.

Which also means that the cost of a Q or complement function is nothing
unless you use it
(and if you do, you probably won't use the P version as well).

| >> In other words, 1 - P. Right? One response is why do you
| >> need to define
| >> it, given how easy it is to get from the cumulative density
| >> function?
| > Perhaps not really needed? Is there an accuracy reason for both?

| It depends how accurate you want to be: calculating 1-P incurs cancellation
| error if P is very near 1, whereas for most (all?) distributions we can
| calculate Q directly without the subtraction from unity.
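
The cancellation can be seen in a few lines. The sketch below assumes an exponential distribution with rate 1, chosen only because both P and Q have closed forms; the free functions cdf and cdf_complement are hypothetical names, not an existing interface.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical free functions for an exponential distribution with rate 1.

// P(X <= x). Far in the upper tail this rounds to exactly 1.
double cdf(double x)
{
    return x <= 0 ? 0.0 : 1.0 - std::exp(-x);
}

// Q(x) = P(X > x), computed directly with no subtraction from unity.
double cdf_complement(double x)
{
    return x <= 0 ? 1.0 : std::exp(-x);
}
```

At x = 40 in IEEE double arithmetic, cdf(x) rounds to exactly 1, so 1 - cdf(x) is 0 and all information about the tail is lost, while cdf_complement(x) still returns roughly 4.2e-18.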

| I think the "Boostified" name would be in all lower case: students_t or
| whatever.

Agree with this.

Paul

---
Paul A Bristow
Prizet Farmhouse, Kendal, Cumbria UK LA8 8AB
+44 1539561830 & SMS, Mobile +44 7714 330204 & SMS
pbristow_at_[hidden]
 
