Boost logo

Boost-Commit :

Subject: [Boost-commit] svn:boost r56588 - in sandbox/statistics/kernel/libs/statistics/detail: . kernel kernel/doc kernel/example kernel/src kernel/test
From: erwann.rogard_at_[hidden]
Date: 2009-10-04 18:56:16


Author: e_r
Date: 2009-10-04 18:56:14 EDT (Sun, 04 Oct 2009)
New Revision: 56588
URL: http://svn.boost.org/trac/boost/changeset/56588

Log:
a
Added:
   sandbox/statistics/kernel/libs/statistics/detail/
   sandbox/statistics/kernel/libs/statistics/detail/kernel/
   sandbox/statistics/kernel/libs/statistics/detail/kernel/doc/
   sandbox/statistics/kernel/libs/statistics/detail/kernel/doc/readme.txt (contents, props changed)
   sandbox/statistics/kernel/libs/statistics/detail/kernel/example/
   sandbox/statistics/kernel/libs/statistics/detail/kernel/example/benchmark_scalar.cpp (contents, props changed)
   sandbox/statistics/kernel/libs/statistics/detail/kernel/example/benchmark_scalar.h (contents, props changed)
   sandbox/statistics/kernel/libs/statistics/detail/kernel/example/mv_mono_bw_rp.cpp (contents, props changed)
   sandbox/statistics/kernel/libs/statistics/detail/kernel/example/mv_mono_bw_rp.h (contents, props changed)
   sandbox/statistics/kernel/libs/statistics/detail/kernel/example/scalar_nw.cpp (contents, props changed)
   sandbox/statistics/kernel/libs/statistics/detail/kernel/example/scalar_nw.h (contents, props changed)
   sandbox/statistics/kernel/libs/statistics/detail/kernel/example/scalar_rp.cpp (contents, props changed)
   sandbox/statistics/kernel/libs/statistics/detail/kernel/example/scalar_rp.h (contents, props changed)
   sandbox/statistics/kernel/libs/statistics/detail/kernel/src/
   sandbox/statistics/kernel/libs/statistics/detail/kernel/src/main.cpp (contents, props changed)
   sandbox/statistics/kernel/libs/statistics/detail/kernel/test/
   sandbox/statistics/kernel/libs/statistics/detail/kernel/test/scalar_nw.cpp (contents, props changed)
   sandbox/statistics/kernel/libs/statistics/detail/kernel/test/scalar_nw.h (contents, props changed)

Added: sandbox/statistics/kernel/libs/statistics/detail/kernel/doc/readme.txt
==============================================================================
--- (empty file)
+++ sandbox/statistics/kernel/libs/statistics/detail/kernel/doc/readme.txt 2009-10-04 18:56:14 EDT (Sun, 04 Oct 2009)
@@ -0,0 +1,152 @@
+//////////////////////////////////////////////////////////////////////////////
+// statistics::detail::kernel::doc::readme //
+// //
+// (C) Copyright 2009 Erwann Rogard //
+// Use, modification and distribution are subject to the //
+// Boost Software License, Version 1.0. (See accompanying file //
+// LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) //
+//////////////////////////////////////////////////////////////////////////////
+
+[ Contact ]
+
+erwann.rogard_at_[hidden]
+
+[ Overview ]
+
+These are C++ classes for kernel based density estimation and conditional mean
+estimation by the method of Rosenblatt-Parzen and Nadaraya-Watson, respectively.
+
+[ Compiler ]
+
+gcc version i686-apple-darwin9-gcc-4.0.1 (GCC) 4.0.1
+
+[ Dependencies ]
+
+boost_1_39_0
+/sandbox/statistics/vector_space/
+/sandbox/statistics/arithmetic/
+/sandbox/statistics/mpl/
+...TODO
+
+
+[ Sources ]
+
+http://en.wikipedia.org/wiki/Kernel_(statistics)
+http://en.wikipedia.org/wiki/Kernel_density_estimation
+@book{citeulike:300228,
+ author = {Silverman, B. W.},
+ howpublished = {Hardcover},
+ isbn = {0412246201},
+ keywords = {density-estimation},
+ month = {April},
+ posted-at = {2008-01-13 17:43:25},
+ priority = {2},
+ publisher = {{Chapman \& Hall/CRC}},
+ title = {Density Estimation for Statistics and Data Analysis},
+ year = {1986}
+}
+
+[ History ]
+
+October 3rd 2009 :
+ - Replaced namespace statistics by statistics::detail
+ - Finer directory structure
+ - nw_visitor_tuple replaced by nw_visitor_unary which no longer constrains
+ the training data to be a tuple (our preference is a fusion::map).
+ - added cross_validation
+
+July 2009 : Creation
+March 2009: : improved_fast_gauss_transform (now deprecated)
+
+[Design]
+
+Let h denote bandwidth, d dimension, ||.|| norm.
+
+To perform density or conditional mean estimation, we are interested in
+operations that involve an abitrary bandwidth binary kernel, K[h](x0,x1),
+and/or possibly its mono_bw_kernelvariate extension K[d,h](x0,x1).
+
+Our design reflects the nested structure of the mathematical pertaining to
+kernels:
+ - standardized vs scaled by a bandwidth
+ - unnormalized vs normalized
+ - unary kernels, K(x) vs binary kernels, K(x0,x1)
+ - univariate vs multivariate
+
+
+[ Directories ]
+
+/bandwidth_selection
+
+ Analytical and cross validation bandwidth selectors
+
+/estimation
+
+ The classes rp_visitor and nw_visitor implement the Rosenblatt-Parzen and
+ Nadaraya-Watson estimators for a given test value, x.
+
+ The class estimator automates the task of visiting a dataset
+
+/kernels
+ /scalar
+ The client need only specify the standardized kernel, such as
+ scalar::gaussian_kernel, while a crtp mechanism, implemented in
+ scalar::crtp, adds the remaining functionality.
+
+ /multivariate
+
+ Maps a unary kernel to a multivariate one.
+
+[ Math representation ]
+
+Univariate
+ Unary:
+ Unnormalized: Ku(x) : [0,inf) -> [0,inf)
+ Normalized: K(x) : [0,inf) -> [0,1)
+ K(x) = Ku(x) / c such that integral K(x) dx = 1
+ Parameterized
+ by bandwidth K[h](x) = K(x/h) / h
+
+ e.g.
+ Ku(x) : exp(-x^2/2)
+ c : sqrt(2 pi)
+ K(x) : exp(-x^2/2) / sqrt(2 pi)
+ K[h](x) : exp(-(x/h)^2 /2) / ( sqrt(2 pi) * h)
+
+Binary:
+ K(x0,x1) = K(|x0-x1|)
+
+Multivariate:
+ Ku[d](x) : Ku(||x||)
+
+ e.g
+ Ku[d](x) : exp(-||x||^2) : Ku(||x||)
+ c[d] : (sqrt(2 pi))^d : c^d
+ K[d](x) : Ku[d](x)/c[d]
+
+ K[d,h](x) : exp(-||x/h||^2)/(c * h)^d
+ = Ku(||x||/h) / (c * h)^d
+
+[Output]
+
+Here's a copy of the output from libs/src/main.cpp:
+
+-> example_scalar_rp : 0.256009 0.390986 0.229151 0.261866 0.232843 0.228334
+0.185519 0.382394 0.388576 0.383203 <-
+-> example_scalar_nw : <-
+-> example_mono_bw_kernel_rp : <-
+-> example_benchmark_scalar : m = 10000
+rp : (n, avg_t,e)nw : (n, avg_t,e_p,e_y)
+rp : (2, 4.0672e-06, 0.173297) nw : (2, 8.3822e-06, 0.173297, 1.2265)
+rp : (4, 2.859e-06, 0.138165) nw : (4, 5.86085e-06, 0.138165, 1.06847)
+rp : (8, 3.01747e-06, 0.104602) nw : (8, 6.289e-06, 0.104602, 0.831601)
+rp : (16, 3.96233e-06, 0.0978568) nw : (16, 8.16075e-06, 0.0978568, 0.702637)
+rp : (32, 5.94066e-06, 0.0818718) nw : (32, 1.20646e-05, 0.0818718, 0.595622)
+rp : (64, 9.40283e-06, 0.0719671) nw : (64, 1.95258e-05, 0.0719671, 0.531848)
+rp : (128, 1.57752e-05, 0.0634377) nw : (128, 3.26977e-05, 0.0634377, 0.477009)
+rp : (256, 2.72618e-05, 0.0565864) nw : (256, 5.64065e-05, 0.0565864, 0.432414)
+rp : (512, 4.81542e-05, 0.0511678) nw : (512, 0.00010056, 0.0511678, 0.398781)
+rp : (1024, 8.67954e-05, 0.0469132) nw : (1024, 0.000179972, 0.0469132, 0.371483)
+<-
+
+

Added: sandbox/statistics/kernel/libs/statistics/detail/kernel/example/benchmark_scalar.cpp
==============================================================================
--- (empty file)
+++ sandbox/statistics/kernel/libs/statistics/detail/kernel/example/benchmark_scalar.cpp 2009-10-04 18:56:14 EDT (Sun, 04 Oct 2009)
@@ -0,0 +1,273 @@
+///////////////////////////////////////////////////////////////////////////////
+// statistics::detail::kernel::example::benchmark_scalar.cpp //
+// //
+// Copyright 2009 Erwann Rogard. Distributed under the Boost //
+// Software License, Version 1.0. (See accompanying file //
+// LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) //
+///////////////////////////////////////////////////////////////////////////////
+/*
+
+#include <cmath>
+#include <vector>
+#include <algorithm>
+#include <iterator>
+#include <boost/function.hpp>
+#include <boost/range.hpp>
+#include <boost/foreach.hpp>
+#include <boost/random/mersenne_twister.hpp>
+#include <boost/random/normal_distribution.hpp>
+#include <boost/random/variate_generator.hpp>
+#include <boost/math/special_functions/fpclassify.hpp> //needed?
+#include <boost/math/tools/precision.hpp>
+
+// Order of the files matters!
+#include <boost/statistics/detail/distribution_toolkit/distributions/normal.hpp>
+#include <boost/statistics/detail/distribution_toolkit/fwd_math/cdf.hpp>
+
+
+#include <boost/binary_op/data/tuple_range.hpp>
+
+#include <boost/statistics/detail/kernel/include.hpp>
+
+//#include <boost/statistics/detail/kernel/bandwidth/normal_distribution.hpp>
+//#include <boost/statistics/detail/kernel/kernels/scalar/gaussian.hpp>
+//#include <boost/statistics/detail/kernel/estimation/nw_visitor_tuple.hpp>
+//#include <boost/statistics/detail/kernel/estimation/estimator.hpp>
+
+#include <boost/statistics/goodness_of_fit/include.hpp>
+
+#include <libs/statistics/detail/kernel/example/benchmark_scalar.h>
+
+void example_benchmark_scalar(std::ostream& out){
+ out << "-> example_benchmark_scalar : ";
+ out.flush();
+
+ using namespace boost;
+ namespace stat = statistics;
+ namespace gof = stat::goodness_of_fit;
+
+ // Types
+ typedef double val_;
+ typedef function<val_(const val_&)> y_fun_;
+ typedef std::vector<val_> vals_;
+ typedef std::vector<unsigned> uints_;
+ typedef mt19937 urng_;
+ typedef normal_distribution<val_> rnorm_;
+ typedef variate_generator<urng_&,rnorm_> gen_;
+ typedef statistics::detail::kernel::scalar::gaussian_kernel<val_> gauss_k_;
+ typedef statistics::detail::kernel::rp_visitor<gauss_k_,val_> rp_visitor_;
+ typedef math::normal_distribution<val_> mnorm_;
+
+ // Constants
+ const unsigned max_j = 5; // # partitions, k = 2^j
+ const unsigned dim = 1;
+ typedef statistics::detail::kernel::normal_distribution<dim,val_> opt_;
+
+ const unsigned total_n = 1e4;
+ const val_ mu = 0.0;
+ const val_ sigma = 2.0;
+
+ // Initialization
+ y_fun_ y_fun = fabs;
+
+ vals_ vec_x( total_n );
+ vals_ vec_y( total_n );
+ vals_ vec_pdf( total_n );
+ urng_ urng;
+ rnorm_ rnorm(mu,sigma);
+ gen_ gen(urng,rnorm);
+
+ { // Data generation
+ mnorm_ mnorm(mu,sigma);
+ generate_function_n<math::fun_wrap::pdf_>(
+ boost::begin(vec_x),
+ boost::begin(vec_pdf),
+ total_n,
+ mnorm,
+ urng
+ );
+ transform(
+ boost::begin(vec_x),
+ boost::end(vec_x),
+ boost::begin(vec_y),
+ y_fun
+ );
+ }
+ { // K-fold NW-estimation
+ typedef binary_op::tuple_range<const vals_&,const vals_&>
+ meta_range_xy_;
+ typedef meta_range_xy_::type range_xy_;
+ typedef range_value<range_xy_>::type xy_ref_cv_;
+ typedef binary_op::tuple_remove_ref_cv<
+ xy_ref_cv_
+ >::type xy_;
+
+ range_xy_ range_xy = meta_range_xy_::make(vec_x,vec_y);
+ typedef gof::k_fold_data< xy_ > k_fold_;
+ typedef k_fold_::range_train_data_type range_train_;
+
+ typedef statistics::detail::kernel::estimator<
+ range_train_,
+ statistics::detail::kernel::nw_visitor_tuple,
+ gauss_k_
+ > estimator_;
+
+ vals_ mse;
+ vals_ mae;
+ uints_ ns;
+ vals_ bandwidths;
+ vals_ vec_est_y( total_n );
+ unsigned k = 1;
+ for(unsigned j = 1; j<max_j; j++){
+ k *= 2;
+ k_fold_ k_fold(
+ k,
+ boost::begin(range_xy),
+ boost::end(range_xy)
+ );
+ ns.push_back( k_fold.n() );
+ // Warning: This bandwidth is not optimal for nw (but better than
+ // a wild guess).
+ bandwidths.push_back(
+ opt_::rp_bandwidth(sigma, k_fold.n())
+ );
+ estimator_ e(
+ bandwidths.back()
+ );
+ gof::regression_estimate(
+ k_fold,
+ e,
+ boost::begin( vec_est_y )
+ );
+ mse.push_back(
+ gof::sqrt_mse(
+ boost::begin( vec_est_y ),
+ boost::end( vec_est_y ),
+ boost::begin( vec_y )
+ )
+ );
+ mae.push_back(
+ gof::mean_abs_error(
+ boost::begin( vec_est_y ),
+ boost::end( vec_est_y ),
+ boost::begin( vec_y )
+ )
+ );
+ }
+
+ out << std::endl << "NW estimation";
+ out << std::endl <<"bandwidth : ";
+ std::copy(
+ boost::rbegin( bandwidths ),
+ boost::rend( bandwidths ),
+ std::ostream_iterator<val_>(out," ")
+ );
+ out << std::endl <<"n : ";
+ std::copy(
+ boost::rbegin( ns ),
+ boost::rend( ns ),
+ std::ostream_iterator<val_>(out," ")
+ );
+ out << std::endl <<"sqrt_mse : ";
+ std::copy(
+ boost::rbegin( mse ),
+ boost::rend( mse ),
+ std::ostream_iterator<val_>(out," ")
+ );
+ out << std::endl <<"mae : ";
+ std::copy(
+ boost::rbegin( mae ),
+ boost::rend( mae ),
+ std::ostream_iterator<val_>(out," ")
+ );
+
+ }
+ { // K-fold RP-estimation
+
+ typedef gof::k_fold_data< val_ > k_fold_;
+ typedef k_fold_::range_train_data_type range_train_;
+
+ typedef statistics::detail::kernel::estimator<
+ range_train_,
+ statistics::detail::kernel::rp_visitor,
+ gauss_k_
+ > estimator_;
+
+ vals_ mse;
+ vals_ mae;
+ uints_ ns;
+ vals_ bandwidths;
+ vals_ vec_est_pdf( total_n );
+ unsigned k = 1;
+ for(unsigned j = 1; j<max_j; j++){
+ k *= 2;
+ k_fold_ k_fold(
+ k,
+ boost::begin( vec_x ),
+ boost::end( vec_x )
+ );
+ ns.push_back( k_fold.n() );
+ // Note: This bandwidth is not optimal for density estimation
+ // using a gaussian_kernel kernel, under normality of data
+ bandwidths.push_back(
+ opt_::rp_bandwidth(sigma, k_fold.n())
+ );
+ estimator_ e(
+ bandwidths.back()
+ );
+ gof::marginal_estimate(
+ k_fold,
+ e,
+ boost::begin( vec_est_pdf )
+ );
+ mse.push_back(
+ gof::sqrt_mse(
+ boost::begin( vec_est_pdf ),
+ boost::end( vec_est_pdf ),
+ boost::begin( vec_pdf )
+ )
+ );
+ mae.push_back(
+ gof::mean_abs_error(
+ boost::begin( vec_est_pdf ),
+ boost::end( vec_est_pdf ),
+ boost::begin( vec_pdf )
+ )
+ );
+ }
+ out << std::endl << "RP estimation ";
+ out << std::endl <<"bandwidth : ";
+ std::copy(
+ boost::rbegin( bandwidths ),
+ boost::rend( bandwidths ),
+ std::ostream_iterator<val_>(out," ")
+ );
+ out << std::endl <<"n : ";
+ std::copy(
+ boost::rbegin( ns ),
+ boost::rend( ns ),
+ std::ostream_iterator<val_>(out," ")
+ );
+ out << std::endl <<"sqrt_mse : ";
+ std::copy(
+ boost::rbegin( mse ),
+ boost::rend( mse ),
+ std::ostream_iterator<val_>(out," ")
+ );
+ out << std::endl <<"mae : ";
+ std::copy(
+ boost::rbegin( mae ),
+ boost::rend( mae ),
+ std::ostream_iterator<val_>(out," ")
+ );
+
+ }
+ // DO the same for rp
+
+ out << "<-" << std::endl;
+
+}
+*/
+
+
+

Added: sandbox/statistics/kernel/libs/statistics/detail/kernel/example/benchmark_scalar.h
==============================================================================
--- (empty file)
+++ sandbox/statistics/kernel/libs/statistics/detail/kernel/example/benchmark_scalar.h 2009-10-04 18:56:14 EDT (Sun, 04 Oct 2009)
@@ -0,0 +1,14 @@
+///////////////////////////////////////////////////////////////////////////////
+// statistics::detail::kernel::example::benchmark_scalar.h //
+// //
+// Copyright 2009 Erwann Rogard. Distributed under the Boost //
+// Software License, Version 1.0. (See accompanying file //
+// LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) //
+///////////////////////////////////////////////////////////////////////////////
+#ifndef LIBS_STATISTICS_DETAIL_KERNEL_EXAMPLE_BENCHMARK_SCALAR_H_ER_2009
+#define LIBS_STATISTICS_DETAIL_KERNEL_EXAMPLE_BENCHMARK_SCALAR_H_ER_2009
+#include <ostream>
+
+void example_benchmark_scalar(std::ostream&);
+
+#endif
\ No newline at end of file

Added: sandbox/statistics/kernel/libs/statistics/detail/kernel/example/mv_mono_bw_rp.cpp
==============================================================================
--- (empty file)
+++ sandbox/statistics/kernel/libs/statistics/detail/kernel/example/mv_mono_bw_rp.cpp 2009-10-04 18:56:14 EDT (Sun, 04 Oct 2009)
@@ -0,0 +1,100 @@
+///////////////////////////////////////////////////////////////////////////////
+// statistics::detail::kernel::example::mono_bw_kernel_rp.cpp //
+// //
+// Copyright 2009 Erwann Rogard. Distributed under the Boost //
+// Software License, Version 1.0. (See accompanying file //
+// LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) //
+///////////////////////////////////////////////////////////////////////////////
+#include <vector>
+#include <algorithm>
+#include <iterator>
+#include <boost/range.hpp>
+#include <boost/foreach.hpp>
+#include <boost/random/mersenne_twister.hpp>
+#include <boost/random/normal_distribution.hpp>
+#include <boost/random/variate_generator.hpp>
+#include <boost/math/special_functions/fpclassify.hpp> //needed?
+#include <boost/math/tools/precision.hpp>
+
+#include <boost/statistics/detail/kernel/kernels/scalar/gaussian.hpp>
+#include <boost/statistics/detail/kernel/kernels/multivariate/mono_bw.hpp>
+#include <boost/statistics/detail/kernel/estimation/rp_visitor.hpp>
+#include <boost/statistics/detail/kernel/estimation/estimator.hpp>
+#include <libs/statistics/detail/kernel/example/scalar_rp.h>
+
+void example_mv_mono_bw_rp(std::ostream& out){
+ out << "-> example_mono_bw_kernel_rp : ";
+
+
+ // This example shows how to compute a Rosenblatt-Parzen estimate of the
+ // density, p(x). The type used for each data-unit, x, is a vector of
+ // doubles, and the kernel uses the same bandwidth throughout all
+ // coordinates
+
+ using namespace boost;
+ namespace kernel = boost::statistics::detail::kernel;
+
+ // Types
+ typedef double val_;
+ typedef std::vector<val_> vec_;
+ typedef vec_ x_;
+ typedef std::vector<x_> dataset_;
+ typedef mt19937 urng_;
+ typedef normal_distribution<val_> norm_;
+ typedef variate_generator<urng_&,norm_> gen_;
+ typedef kernel::scalar::gaussian_kernel<val_> gauss_k_;
+
+ const unsigned dim = 2;
+ typedef kernel::multivariate::mono_bw_kernel<gauss_k_,dim> mono_bw_kernel_k_;
+ // Use of a const reference is not necessary but probably improves speed
+ typedef kernel::rp_visitor<mono_bw_kernel_k_,const x_&> rp_visitor_;
+
+ // Constants
+ const val_ bandwidth = 0.5;
+ const val_ eps = math::tools::epsilon<val_>();
+ const unsigned n = 10;
+
+ // Generate n samples, each drawn from prod{N(0,1):i=1,...,dim}
+ dataset_ dataset; dataset.reserve(n);
+ vec_ vec_rp; vec_rp.reserve(n);
+ urng_ urng;
+ norm_ norm;
+ gen_ gen(urng,norm);
+ for(unsigned i = 0; i<n; i++){
+ vec_ tmp(dim);
+ std::generate_n(
+ boost::begin(tmp),
+ dim,
+ gen
+ );
+ dataset.push_back( tmp );
+ }
+
+ // Density estimate for each x in dataset
+ BOOST_FOREACH(const x_& x,dataset){
+ val_ rp = std::for_each(
+ boost::begin(dataset),
+ boost::end(dataset),
+ rp_visitor_(bandwidth,x)
+ ).estimate();
+ vec_rp.push_back(rp);
+ }
+ typedef sub_range<dataset_> sub_;
+ typedef kernel::estimator<
+ sub_,
+ kernel::rp_visitor,
+ mono_bw_kernel_k_
+ > estimator_;
+ estimator_ estimator(bandwidth);
+ estimator.train(sub_(dataset));
+ vec_ vec_rp2; vec_rp2.reserve(n);
+
+ // Same as previous but calls estimator instead of for_each
+ for(unsigned i = 0; i<n; i++){
+ x_ x = dataset[i];
+ val_ rp = vec_rp[i];
+ val_ rp2 = estimator.predict(x);
+ BOOST_ASSERT(fabs(rp-rp2)<eps);
+ }
+ out << "<-" << std::endl;
+}

Added: sandbox/statistics/kernel/libs/statistics/detail/kernel/example/mv_mono_bw_rp.h
==============================================================================
--- (empty file)
+++ sandbox/statistics/kernel/libs/statistics/detail/kernel/example/mv_mono_bw_rp.h 2009-10-04 18:56:14 EDT (Sun, 04 Oct 2009)
@@ -0,0 +1,14 @@
+///////////////////////////////////////////////////////////////////////////////
+// statistics::detail::kernel::example::mono_bw_kernel_rp.h //
+// //
+// Copyright 2009 Erwann Rogard. Distributed under the Boost //
+// Software License, Version 1.0. (See accompanying file //
+// LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) //
+///////////////////////////////////////////////////////////////////////////////
+#ifndef LIBS_STATISTICS_DETAIL_KERNEL_EXAMPLE_MULTI_RP_H_ER_2009
+#define LIBS_STATISTICS_DETAIL_KERNEL_EXAMPLE_MULTI_RP_H_ER_2009
+#include <ostream>
+
+void example_mv_mono_bw_rp(std::ostream&);
+
+#endif
\ No newline at end of file

Added: sandbox/statistics/kernel/libs/statistics/detail/kernel/example/scalar_nw.cpp
==============================================================================
--- (empty file)
+++ sandbox/statistics/kernel/libs/statistics/detail/kernel/example/scalar_nw.cpp 2009-10-04 18:56:14 EDT (Sun, 04 Oct 2009)
@@ -0,0 +1,145 @@
+///////////////////////////////////////////////////////////////////////////////
+// statistics::detail::kernel::example::scalar_nw.cpp //
+// //
+// Copyright 2009 Erwann Rogard. Distributed under the Boost //
+// Software License, Version 1.0. (See accompanying file //
+// LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) //
+///////////////////////////////////////////////////////////////////////////////
+#include <vector>
+#include <algorithm>
+#include <iterator>
+#include <boost/range.hpp>
+#include <boost/foreach.hpp>
+#include <boost/random/mersenne_twister.hpp>
+#include <boost/random/normal_distribution.hpp>
+#include <boost/random/variate_generator.hpp>
+#include <boost/math/special_functions/fpclassify.hpp> //needed?
+#include <boost/math/tools/precision.hpp>
+#include <boost/typeof/typeof.hpp>
+
+#include <boost/fusion/sequence/intrinsic/at_key.hpp>
+#include <boost/fusion/include/at_key.hpp>
+#include <boost/fusion/container/map.hpp>
+#include <boost/fusion/include/map.hpp>
+#include <boost/fusion/include/map_fwd.hpp>
+
+#include <boost/statistics/detail/fusion/functor/at_key.hpp>
+
+#include <boost/statistics/detail/kernel/kernels/scalar/gaussian.hpp>
+#include <boost/statistics/detail/kernel/estimation/meta_nw_visitor_unary.hpp>
+#include <boost/statistics/detail/kernel/estimation/estimator.hpp>
+#include <libs/statistics/detail/kernel/example/scalar_nw.h>
+
+void example_scalar_nw(std::ostream& out){
+
+ out << "-> example_scalar_nw : ";
+ using namespace boost;
+
+ namespace kernel = boost::statistics::detail::kernel;
+
+ // This example shows how to compute a Nadaraya-Watson estimate of E[y|x].
+ // The type used for each data-unit, here, is a fusion map whose x and y
+ // components are accessed using keys
+
+ // Types
+ typedef double val_;
+ typedef std::vector<val_> vals_;
+ typedef mpl::int_<0> key_x_;
+ typedef mpl::int_<1> key_y_;
+ typedef fusion::pair<key_x_,val_> x_;
+ typedef fusion::pair<key_y_,val_> y_;
+ typedef statistics::detail::fusion::functor::at_key<key_x_> at_key_x_;
+ typedef statistics::detail::fusion::functor::at_key<key_y_> at_key_y_;
+ typedef fusion::map<x_,y_> data_unit_;
+ typedef std::vector<data_unit_> dataset_;
+ // The rationale for data_range_ is it's cheap to copy
+ typedef sub_range<dataset_> data_range_;
+
+ typedef mt19937 urng_;
+ typedef normal_distribution<val_> norm_;
+ typedef variate_generator<urng_&,norm_> gen_;
+ typedef kernel::scalar::gaussian_kernel<val_> gauss_k_;
+ typedef kernel::meta_nw_visitor_unary<
+ at_key_x_,
+ at_key_y_
+ > meta_nw_visitor_u_;
+ typedef meta_nw_visitor_u_::apply<
+ gauss_k_,
+ val_
+ >::type nw_visitor_u_;
+ typedef nw_visitor_u_::nw_visitor_type nw_visitor_;
+ typedef nw_visitor_u_::rp_visitor_type rp_visitor_;
+
+ // Constants
+ const val_ bandwidth = 0.5;
+ const val_ eps = math::tools::epsilon<val_>();
+ const unsigned n = 10;
+
+ // Initialization
+ vals_ vec_rp; vec_rp.reserve(n);
+ vals_ vec_nw; vec_nw.reserve(n);
+ dataset_ dataset;
+ dataset.reserve(n);
+ {
+ urng_ urng;
+ norm_ norm;
+ gen_ gen(urng,norm);
+ val_ one = static_cast<val_>(1);
+ for(unsigned i = 0; i<n; i++){
+ dataset.push_back(
+ data_unit_(
+ fusion::make_pair<key_x_>(gen()),
+ fusion::make_pair<key_y_>(one)
+ )
+ );
+ }
+ }
+
+ // Computes nw = E[y|x] for each x in the dataset. The density (rp) is
+ // obtained as a by-product. Here, y = 1, so we should have
+ // rp = nw (un-normalized).
+ BOOST_FOREACH(data_unit_& u,dataset){
+ nw_visitor_ nw_visitor = std::for_each(
+ boost::begin(dataset),
+ boost::end(dataset),
+ nw_visitor_u_(
+ bandwidth,
+ fusion::at_key<key_x_>(u)
+ )
+ );
+ val_ u_nw = nw_visitor.unnormalized_estimate();
+ vec_nw.push_back(u_nw);
+ rp_visitor_ rp_visitor = nw_visitor.rp_visitor();
+ val_ rp = rp_visitor.estimate();
+ BOOST_ASSERT(fabs(rp-u_nw)<eps);
+ }
+
+ // Same as above using estimator
+
+ typedef kernel::estimator<
+ data_range_,
+ meta_nw_visitor_u_::apply,
+ gauss_k_
+ > estimator_;
+ estimator_ estimator(bandwidth);
+ estimator.train(
+ data_range_(dataset)
+ ); // * step 1 *
+
+
+ BOOST_FOREACH(data_unit_& u,dataset){
+ // -> these steps are independent of step2, they're just a test
+ val_ x = fusion::at_key<key_x_>(u);
+ BOOST_AUTO( nw_v , estimator.visit(x) );
+ val_ u_nw = nw_v.unnormalized_estimate();
+ BOOST_AUTO( rp_v , nw_v.rp_visitor() );
+ val_ rp = rp_v.estimate();
+ BOOST_ASSERT(fabs(rp-u_nw)<eps);
+ // <-
+
+ estimator.predict(x); // * step 2 *
+
+ }
+
+ out << "<-" << std::endl;
+}

Added: sandbox/statistics/kernel/libs/statistics/detail/kernel/example/scalar_nw.h
==============================================================================
--- (empty file)
+++ sandbox/statistics/kernel/libs/statistics/detail/kernel/example/scalar_nw.h 2009-10-04 18:56:14 EDT (Sun, 04 Oct 2009)
@@ -0,0 +1,15 @@
+///////////////////////////////////////////////////////////////////////////////
+// statistics::detail::kernel::example::scalar_nw.h //
+// //
+// Copyright 2009 Erwann Rogard. Distributed under the Boost //
+// Software License, Version 1.0. (See accompanying file //
+// LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) //
+///////////////////////////////////////////////////////////////////////////////
+#ifndef LIBS_STATISTICS_DETAIL_KERNEL_EXAMPLE_SCALAR_NW_H_ER_2009
+#define LIBS_STATISTICS_DETAIL_KERNEL_EXAMPLE_SCALAR_NW_H_ER_2009
+#include <ostream>
+
+void example_scalar_nw(std::ostream&);
+
+#endif
+

Added: sandbox/statistics/kernel/libs/statistics/detail/kernel/example/scalar_rp.cpp
==============================================================================
--- (empty file)
+++ sandbox/statistics/kernel/libs/statistics/detail/kernel/example/scalar_rp.cpp 2009-10-04 18:56:14 EDT (Sun, 04 Oct 2009)
@@ -0,0 +1,90 @@
+///////////////////////////////////////////////////////////////////////////////
+// statistics::detail::kernel::example::scalar_rp.cpp //
+// //
+// Copyright 2009 Erwann Rogard. Distributed under the Boost //
+// Software License, Version 1.0. (See accompanying file //
+// LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) //
+///////////////////////////////////////////////////////////////////////////////
+#include <vector>
+#include <algorithm>
+#include <iterator>
+#include <boost/range.hpp>
+#include <boost/foreach.hpp>
+#include <boost/random/mersenne_twister.hpp>
+#include <boost/random/normal_distribution.hpp>
+#include <boost/random/variate_generator.hpp>
+#include <boost/math/special_functions/fpclassify.hpp> //needed?
+#include <boost/math/tools/precision.hpp>
+#include <boost/mpl/int.hpp>
+
+#include <boost/statistics/detail/kernel/kernels/scalar/gaussian.hpp>
+#include <boost/statistics/detail/kernel/estimation/rp_visitor.hpp>
+#include <boost/statistics/detail/kernel/estimation/estimator.hpp>
+#include <libs/statistics/detail/kernel/example/scalar_rp.h>
+
+void example_scalar_rp(std::ostream& out){
+ out << "-> example_scalar_rp : ";
+ using namespace boost;
+
+ // This example shows how to compute a Rosenblatt-Parzen estimate of the
+ // density, p(x). The type used for each data-unit, here, is double
+
+ //Types
+ typedef double val_;
+ typedef std::vector<val_> vec_;
+ typedef mt19937 urng_;
+ typedef normal_distribution<val_> norm_;
+ typedef variate_generator<urng_&,norm_> gen_;
+ typedef statistics::detail::kernel::scalar::gaussian_kernel<val_> gauss_k_;
+ typedef statistics::detail::kernel::rp_visitor<gauss_k_,val_> rp_visitor_;
+
+ // Contants
+ const val_ bandwidth = 0.5;
+ const val_ eps = math::tools::epsilon<val_>();
+ const unsigned n = 10;
+
+ // Initialization
+ vec_ dataset(n);
+ vec_ vec_rp; vec_rp.reserve(n);
+ urng_ urng;
+ norm_ norm;
+ gen_ gen(urng,norm);
+ std::generate_n(
+ begin(dataset),
+ n,
+ gen
+ );
+
+ // Computes a density estimate for each x in dataset
+ BOOST_FOREACH(val_& x,dataset){
+ val_ rp = for_each(
+ boost::begin(dataset),
+ boost::end(dataset),
+ rp_visitor_(bandwidth,x)
+ ).estimate();
+ vec_rp.push_back(rp);
+ }
+
+ // Same as previous but calls estimator instead of for_each
+ typedef sub_range<vec_> sub_x_;
+ typedef
+ statistics::detail::kernel::estimator<
+ sub_x_,
+ statistics::detail::kernel::rp_visitor,
+ gauss_k_
+ > estimator_;
+ estimator_ estimator(bandwidth);
+ sub_x_ sub_x(dataset);
+ estimator.train(sub_x);
+ vec_ vec_rp2; vec_rp2.reserve(n);
+
+ for(unsigned i = 0; i<n; i++){
+ val_ x = dataset[i];
+ val_ rp = vec_rp[i];
+ val_ rp2 = estimator.predict(x);
+ BOOST_ASSERT(fabs(rp-rp2)<eps);
+ }
+
+ out << "<-" << std::endl;
+
+}
\ No newline at end of file

Added: sandbox/statistics/kernel/libs/statistics/detail/kernel/example/scalar_rp.h
==============================================================================
--- (empty file)
+++ sandbox/statistics/kernel/libs/statistics/detail/kernel/example/scalar_rp.h 2009-10-04 18:56:14 EDT (Sun, 04 Oct 2009)
@@ -0,0 +1,15 @@
+///////////////////////////////////////////////////////////////////////////////
+// statistics::detail::kernel::example::scalar_rp.h //
+// //
+// Copyright 2009 Erwann Rogard. Distributed under the Boost //
+// Software License, Version 1.0. (See accompanying file //
+// LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) //
+///////////////////////////////////////////////////////////////////////////////
+#ifndef LIBS_STATISTICS_DETAIL_KERNEL_EXAMPLE_SCALAR_RP_H_ER_2009
+#define LIBS_STATISTICS_DETAIL_KERNEL_EXAMPLE_SCALAR_RP_H_ER_2009
+#include <ostream>
+
+void example_scalar_rp(std::ostream&);
+
+#endif
+

Added: sandbox/statistics/kernel/libs/statistics/detail/kernel/src/main.cpp
==============================================================================
--- (empty file)
+++ sandbox/statistics/kernel/libs/statistics/detail/kernel/src/main.cpp 2009-10-04 18:56:14 EDT (Sun, 04 Oct 2009)
@@ -0,0 +1,23 @@
+///////////////////////////////////////////////////////////////////////////////
+// statistics::detail::kernel::src::main.cpp //
+// //
+// Copyright 2009 Erwann Rogard. Distributed under the Boost //
+// Software License, Version 1.0. (See accompanying file //
+// LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) //
+///////////////////////////////////////////////////////////////////////////////
+#include <iostream>
+#include <libs/statistics/detail/kernel/example/scalar_rp.h>
+#include <libs/statistics/detail/kernel/example/scalar_nw.h>
+#include <libs/statistics/detail/kernel/example/mv_mono_bw_rp.h>
+#include <libs/statistics/detail/kernel/example/benchmark_scalar.h>
+
+
+int main(){
+
+ example_scalar_rp(std::cout);
+ example_scalar_nw(std::cout);
+ example_mv_mono_bw_rp(std::cout);
+ //example_benchmark_scalar(std::cout);
+
+ return 0;
+}
\ No newline at end of file

Added: sandbox/statistics/kernel/libs/statistics/detail/kernel/test/scalar_nw.cpp
==============================================================================
--- (empty file)
+++ sandbox/statistics/kernel/libs/statistics/detail/kernel/test/scalar_nw.cpp 2009-10-04 18:56:14 EDT (Sun, 04 Oct 2009)
@@ -0,0 +1,127 @@
+///////////////////////////////////////////////////////////////////////////////
+// statistics::detail::kernel::test::scalar_nw.cpp //
+// //
+// Copyright 2009 Erwann Rogard. Distributed under the Boost //
+// Software License, Version 1.0. (See accompanying file //
+// LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) //
+///////////////////////////////////////////////////////////////////////////////
+#include <vector>
+#include <algorithm>
+#include <iterator>
+#include <boost/range.hpp>
+#include <boost/foreach.hpp>
+#include <boost/random/mersenne_twister.hpp>
+#include <boost/random/normal_distribution.hpp>
+#include <boost/random/variate_generator.hpp>
+#include <boost/math/special_functions/fpclassify.hpp> //needed?
+#include <boost/math/tools/precision.hpp>
+#include <boost/typeof/typeof.hpp>
+
+#include <boost/lambda/lambda.hpp>
+#include <boost/lambda/bind.hpp>
+#include <boost/lambda/construct.hpp>
+
+#include <boost/iterator/transform_iterator.hpp>
+
+#include <boost/fusion/sequence/intrinsic/at_key.hpp>
+#include <boost/fusion/include/at_key.hpp>
+#include <boost/fusion/container/map.hpp>
+#include <boost/fusion/include/map.hpp>
+#include <boost/fusion/include/map_fwd.hpp>
+
+#include <boost/statistics/detail/fusion/functor/at_key.hpp>
+
+#include <boost/statistics/detail/kernel/kernels/scalar/gaussian.hpp>
+#include <boost/statistics/detail/kernel/bandwidth_selection/meta_k_fold_nw.hpp>
+#include <libs/statistics/detail/kernel/test/scalar_nw.h>
+
+void test_scalar_nw(std::ostream& out){
+
+ out << "-> test_scalar_nw : ";
+ using namespace boost;
+
+ namespace kernel = boost::statistics::detail::kernel;
+
+ // Types
+ typedef double val_;
+ typedef std::vector<val_> vals_;
+ typedef mpl::int_<0> key_x_;
+ typedef mpl::int_<1> key_y_;
+ typedef fusion::pair<key_x_,val_> x_;
+ typedef fusion::pair<key_y_,val_> y_;
+ typedef statistics::detail::fusion::functor::at_key<key_x_> at_key_x_;
+ typedef statistics::detail::fusion::functor::at_key<key_y_> at_key_y_;
+ typedef fusion::map<x_,y_> data_unit_;
+ typedef std::vector<data_unit_> dataset_;
+ // The rationale for data_range_ is it's cheap to copy
+ typedef sub_range<dataset_> data_range_;
+
+ typedef mt19937 urng_;
+ typedef normal_distribution<val_> norm_;
+ typedef variate_generator<urng_&,norm_> gen_;
+ typedef kernel::scalar::gaussian_kernel<val_> gauss_k_;
+ typedef kernel::bandwidth_selection::meta_k_fold_nw<
+ at_key_x_,
+ at_key_y_
+ > meta_k_fold_nw_;
+
+ typedef meta_k_fold_nw_::apply<
+ data_unit_,
+ gauss_k_
+ >::type k_fold_nw_;
+
+
+ // Constants
+ const val_ bandwidth = 0.5;
+ const val_ eps = math::tools::epsilon<val_>();
+ const unsigned n = 10;
+ const unsigned num_fold = 5;
+
+ // Initialization
+ vals_ pred_nw_s(n);
+ vals_ true_nw_s(n);
+ dataset_ dataset;
+ dataset.reserve(n);
+
+ k_fold_nw_ k_fold_nw;
+ {
+ urng_ urng;
+ norm_ norm;
+ gen_ gen(urng,norm);
+ val_ one = static_cast<val_>(1);
+ for(unsigned i = 0; i<n; i++){
+ dataset.push_back(
+ data_unit_(
+ fusion::make_pair<key_x_>(gen()),
+ fusion::make_pair<key_y_>(one)
+ )
+ );
+ }
+ BOOST_AUTO(
+ f,
+ boost::lambda::bind(
+ boost::lambda::constructor<data_unit_>(),
+ boost::lambda::bind(
+ boost::lambda::constructor<x_>(),
+ boost::lambda::bind(
+ gen
+ )
+ ),
+ fusion::make_pair<key_y_>(one)
+ )
+ );
+
+
+ }
+
+
+ //k_fold_nw_ k_fold_nw(
+ // num_fold,
+ // make_transform_iterator
+ //);
+
+
+
+
+ out << "<-" << std::endl;
+}
\ No newline at end of file

Added: sandbox/statistics/kernel/libs/statistics/detail/kernel/test/scalar_nw.h
==============================================================================
--- (empty file)
+++ sandbox/statistics/kernel/libs/statistics/detail/kernel/test/scalar_nw.h 2009-10-04 18:56:14 EDT (Sun, 04 Oct 2009)
@@ -0,0 +1,14 @@
+///////////////////////////////////////////////////////////////////////////////
+// statistics::detail::kernel::test::scalar_nw.h //
+// //
+// Copyright 2009 Erwann Rogard. Distributed under the Boost //
+// Software License, Version 1.0. (See accompanying file //
+// LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) //
+///////////////////////////////////////////////////////////////////////////////
+#ifndef LIBS_STATISTICS_DETAIL_KERNEL_TEST_SCALAR_NW_H_ER_2009
+#define LIBS_STATISTICS_DETAIL_KERNEL_TEST_SCALAR_NW_H_ER_2009
+#include <ostream>
+
+void test_scalar_nw(std::ostream&);
+
+#endif


Boost-Commit list run by bdawes at acm.org, david.abrahams at rcn.com, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk