Subject: Re: [Boost-users] boost::math::standard_deviation how to use ?
From: Paul A. Bristow (pbristow_at_[hidden])
Date: 2012-08-19 14:50:07


> -----Original Message-----
> From: boost-users-bounces_at_[hidden] [mailto:boost-users-bounces_at_[hidden]] On Behalf Of
> Edouard Tallent
> Sent: Saturday, August 18, 2012 6:53 PM
> To: boost-users_at_[hidden]
> Subject: Re: [Boost-users] boost::math::standard_deviation how to use ?
>
> My 2-cent
> One can easily compute the standard deviation of data contained in a std::vector as shown here:
> http://stackoverflow.com/questions/7616511/calculate-mean-and-standard-deviation-from-a-vector-of-samples-in-c-using-boost

These are textbook algorithms, but they assume that there are no outliers (and that the data are
normally distributed).

Including outliers (usually very small or very big values) will produce a highly misleading standard
deviation (and distort any other statistical inferences that you make).

Garbage in - garbage out ;-)

So unless you are absolutely certain that there are no 'dud' items, you should always detect and
remove outliers before finally calculating a standard deviation.

Michael Frigge, David C. Hoaglin and Boris Iglewicz,
        "Some Implementations of the Boxplot",
        The American Statistician, Vol. 43, No. 1 (Feb., 1989), pp. 50-54

and

        Tukey, J. W., Exploratory Data Analysis, Addison Wesley (1977, p 33)

include several versions of outlier marking for boxplots.

https://svn.boost.org/svn/boost/sandbox/SOC/2007/visualization/boost/svg_plot/svg_boxplot.hpp

from Jake Voytko's Google Summer of Code project, still in the Boost Sandbox, will give you some
ideas on how you can code simple outlier removal.
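
As a rough illustration of what simple outlier removal can look like, here is a minimal sketch
using the common boxplot rule: discard values more than 1.5 times the interquartile range beyond
the quartiles. The function names (quantile_sorted, remove_outliers, sample_sd) are hypothetical
and are not taken from Boost or from svg_boxplot.hpp, and the quartile definition used here is
only one of the several conventions the boxplot literature compares.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Quantile by linear interpolation between order statistics of a sorted sample.
// Assumption: this is just one quartile convention; others move the fences slightly.
double quantile_sorted(const std::vector<double>& sorted, double p)
{
  assert(!sorted.empty() && p >= 0.0 && p <= 1.0);
  const double pos = p * (sorted.size() - 1);
  const std::size_t lo = static_cast<std::size_t>(std::floor(pos));
  const std::size_t hi = std::min(lo + 1, sorted.size() - 1);
  return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
}

// Keep only the values inside [Q1 - k*IQR, Q3 + k*IQR].
// Note: the result is returned in sorted order, not in the original order.
std::vector<double> remove_outliers(std::vector<double> data, double k = 1.5)
{
  if (data.size() < 4) return data; // too few values to estimate quartiles
  std::sort(data.begin(), data.end());
  const double q1 = quantile_sorted(data, 0.25);
  const double q3 = quantile_sorted(data, 0.75);
  const double iqr = q3 - q1;
  const double lower = q1 - k * iqr;
  const double upper = q3 + k * iqr;
  std::vector<double> kept;
  for (double x : data)
    if (x >= lower && x <= upper) kept.push_back(x);
  return kept;
}

// Sample standard deviation (n - 1 divisor), computed in two passes.
double sample_sd(const std::vector<double>& v)
{
  if (v.size() < 2) return 0.0;
  const double mean = std::accumulate(v.begin(), v.end(), 0.0) / v.size();
  double ss = 0.0;
  for (double x : v) ss += (x - mean) * (x - mean);
  return std::sqrt(ss / (v.size() - 1));
}
```

Calling sample_sd(remove_outliers(data)) instead of sample_sd(data) shows how much a single 'dud'
value can inflate the result. Whether a flagged value really is a dud is still a judgement about
the data, not something the fences can decide for you.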

Finally, bear in mind the very large uncertainty of standard deviations computed from a few values.

http://www.boost.org/doc/libs/1_50_0/libs/math/doc/sf_and_dist/html/math_toolkit/dist/stat_tut/weg/cs_eg/chi_sq_intervals.html

The table

Confidence intervals as a function of the number of observations

on that page should make you sceptical of blind calculations.
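
To get a feel for that uncertainty without any tables, here is a small simulation sketch. It is
not the chi-squared calculation used in the Boost.Math example; it simply draws many samples of
size n from a normal distribution whose true standard deviation is 1, and reports the range that
holds the central 95% of the resulting sample standard deviations. The names SdRange and
sd_estimate_range are hypothetical, and the trial count and seed are arbitrary choices.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Central 95% range of the simulated sample standard deviations.
struct SdRange { double lower; double upper; };

// Simulate `trials` samples of size n from N(0, 1), so the true sd is 1,
// and return the 2.5% and 97.5% points of the sample sd estimates.
SdRange sd_estimate_range(std::size_t n, std::size_t trials, unsigned seed)
{
  assert(n >= 2 && trials >= 100);
  std::mt19937 gen(seed);
  std::normal_distribution<double> normal(0.0, 1.0);

  std::vector<double> estimates;
  estimates.reserve(trials);
  std::vector<double> sample(n);

  for (std::size_t t = 0; t < trials; ++t)
  {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
    {
      sample[i] = normal(gen);
      sum += sample[i];
    }
    const double mean = sum / n;
    double ss = 0.0;
    for (std::size_t i = 0; i < n; ++i)
      ss += (sample[i] - mean) * (sample[i] - mean);
    estimates.push_back(std::sqrt(ss / (n - 1))); // n - 1 divisor
  }

  std::sort(estimates.begin(), estimates.end());
  const std::size_t lo = static_cast<std::size_t>(0.025 * (trials - 1));
  const std::size_t hi = static_cast<std::size_t>(0.975 * (trials - 1));
  SdRange r = { estimates[lo], estimates[hi] };
  return r;
}
```

Comparing sd_estimate_range(5, 20000, 42u) with sd_estimate_range(100, 20000, 42u) should show a
far wider range for the five-value samples, even though the data are perfectly normal and contain
no outliers at all.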

HTH

Paul

---
Paul A. Bristow,
Prizet Farmhouse, Kendal LA8 8AB  UK
+44 1539 561830  07714330204
pbristow_at_[hidden]
