Boost Users :
From: Ovanes Markarian (om_boost_at_[hidden])
Date: 2008-01-28 18:51:47
Sorry, I overlooked it somehow... Probably too late already... It is
true that an array argument to a function is a pointer. Sorry.
I have tested the code with gcc under Mac OS X; the function taking 3 doubles
as arguments was NOT necessarily faster than the one taking a boost::array
param. It might not be worth implementing the macro-based approach...
Here is my test app, compiled with the -O3 optimization flag. (Below are the
timings on a dual-core 2.4 GHz machine with 4 GB RAM.)
//============================================================================
// Name : CppTest.cpp
// Author :
// Version :
// Copyright : Your copyright notice
// Description : Hello World in C++, Ansi-style
//============================================================================
#include <iostream>
#include <numeric>
#include <boost/array.hpp>
#include <boost/progress.hpp>
using namespace std;
typedef boost::array<double, 3> array_type;
array_type::value_type sum_arr_copy(array_type a)
{
    return (a[0]+=a[1])+=a[2];
}
array_type::value_type sum_carr(array_type const& a)
{
    return a[0]+a[1]+a[2];
}
array_type::value_type sum_carr_temp_result(array_type const& a)
{
    double result=a[0];
    return (result+=a[1])+=a[2];
}
array_type::value_type sum_carr_accumulate(array_type const& a)
{
    return std::accumulate(a.begin(), a.end(), 0);
}
array_type::value_type sum_arr_copy_accumulate(array_type const& a)
{
    return std::accumulate(a.begin(), a.end(), 0);
}
double sum_doubles_copy(double d1, double d2, double d3)
{
    return d1+d2+d3;
}
double sum_doubles_copy_optimized(double d1, double d2, double d3)
{
    return (d1+=d2)+=d3;
}
double sum_doubles_copy_temp_result(double d1, double d2, double d3)
{
    double result=d1;
    return (result+=d2)+=d3;
}
double sum_doubles_ref_temp_result(double const& d1, double const& d2,
                                   double const d3)
{
    double result=d1;
    return (result+=d2)+=d3;
}
/// Sorry for the macro code. I was too lazy to copy and paste it...
#define ARR_P() a
#define DBL_P() d1,d2,d3
#define DO_TEST(function, param_seq) \
    { \
        cout << "--------------------------\n" \
             << #function"\n"; \
        double result=0; \
        { \
            boost::progress_timer t; \
            for(unsigned long i=0; i<times; ++i) \
                result+=function(param_seq()); \
        } \
        cout << "result: " << result << "\n"; \
    }
int main() {
    const unsigned long times=~0;
    array_type a = {1.0, 2.3, 3.33};
    double d1 = 1.0, d2 = 2.3, d3 = 3.33;
    DO_TEST(sum_arr_copy, ARR_P);
    DO_TEST(sum_carr, ARR_P);
    DO_TEST(sum_arr_copy_accumulate, ARR_P);
    DO_TEST(sum_carr_temp_result, ARR_P);
    DO_TEST(sum_carr_accumulate, ARR_P);
    DO_TEST(sum_doubles_copy, DBL_P);
    DO_TEST(sum_doubles_copy_optimized, DBL_P);
    DO_TEST(sum_doubles_copy_temp_result, DBL_P);
    DO_TEST(sum_doubles_ref_temp_result, DBL_P);
    cout << "tests done" << endl;
    return 0;
}
Please note that the two accumulate variants produced a wrong result.
Resulting timings after running each test 4294967295 times:
--------------------------
sum_arr_copy
65.19 s
result: 2.84756e+10
--------------------------
sum_carr
14.99 s
result: 2.84756e+10
--------------------------
sum_arr_copy_accumulate
59.97 s
result: 2.57698e+10
--------------------------
sum_carr_temp_result
14.97 s
result: 2.84756e+10
--------------------------
sum_carr_accumulate
59.84 s
result: 2.57698e+10
--------------------------
sum_doubles_copy
15.00 s
result: 2.84756e+10
--------------------------
sum_doubles_copy_optimized
14.98 s
result: 2.84756e+10
--------------------------
sum_doubles_copy_temp_result
14.92 s
result: 2.84756e+10
--------------------------
sum_doubles_ref_temp_result
14.93 s
result: 2.84756e+10
tests done
I hope that helps and saves you from developing a solution that might not be
worth the effort.
Good Luck!
Ovanes
On Jan 28, 2008 10:39 PM, Steven Watanabe <watanabesj_at_[hidden]> wrote:
> AMDG
>
> Hicham Mouline wrote:
> > After some thought, here is more precisely what I'd like to have...
> > I apologize that it is quite different from the initial problem:
> >
> > template<int n> class Tree {
> > static double sum(); //
> > };
> >
> >
> > If the user instantiates Tree<2>, he should get:
> >
> > template<> class Tree<2> {
> > static double sum(double d1, double d2);
> > };
> >
> > template<> class Tree<3> {
> > static double sum(double d1, double d2, double d3);
> > };
> > etc etc...
> >
> >
> > so that in user code, e.g.:
> >
> > double d= Tree<4>::sum(d1, d2, d3, d4);
> >
> > should compile.
> >
> >
> > Is it possible for me to just define the template Tree for the n-case
> > without the 2- and 3- specializations?
> >
>
> Ah. You still need the preprocessor, but you can rearrange the
> definitions slightly.
> (untested)
>
> template<int N>
> struct TreeSumImpl;
>
> #define TREE_SUM_DEF(z, n, data)\
> template<>\
> struct TreeSumImpl<n> {\
> static double sum(BOOST_PP_ENUM_PARAMS_Z(z, n, double arg)) { ... }\
> };
>
> BOOST_PP_REPEAT(20, TREE_SUM_DEF, ~)
>
> template<int N>
> struct Tree : TreeSumImpl<N> {
> // other code
> };
>
> In Christ,
> Steven Watanabe
>
> _______________________________________________
> Boost-users mailing list
> Boost-users_at_[hidden]
> http://lists.boost.org/mailman/listinfo.cgi/boost-users
>
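For comparison only: with C++11 variadic templates (a different technique from the preprocessor approach quoted above, and not something compilers offered when this thread was written), the same Tree<N>::sum interface can be sketched without generating specializations. This is a rough sketch, not Steven's code; the static_assert message and the summation body are my own assumptions about what sum should do:

```cpp
#include <cstddef>

// Hypothetical sketch: Tree<N>::sum compiles only when called with
// exactly N arguments, each convertible to double.
template<int N>
struct Tree {
    template<typename... Args>
    static double sum(Args... args)
    {
        static_assert(sizeof...(Args) == N,
                      "Tree<N>::sum expects exactly N arguments");
        // The leading 0.0 keeps the array non-empty even for N == 0.
        const double values[] = {0.0, static_cast<double>(args)...};
        double total = 0.0;
        for (std::size_t i = 0; i < sizeof...(Args) + 1; ++i)
            total += values[i];
        return total;
    }
};
```

With this, Tree<4>::sum(d1, d2, d3, d4) compiles, while Tree<4>::sum(d1, d2) is rejected at compile time by the static_assert.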