Boost Users :

From: Ovanes Markarian (om_boost_at_[hidden])
Date: 2008-01-28 18:51:47


Sorry, I somehow overlooked it... Probably too late already... It's
true that an array argument to a function is a pointer. Sorry.

I have tested the code with gcc under Mac OS X; the function taking 3 doubles
as arguments was NOT necessarily faster than the one taking a boost::array
param. It might not be worth implementing the macro-based approach...

Here is my test app, compiled with the -O3 optimization flag. (The timings
below are from a dual-core 2.4 GHz machine with 4 GB of RAM.)

//============================================================================
// Name : CppTest.cpp
// Author :
// Version :
// Copyright : Your copyright notice
// Description : Hello World in C++, Ansi-style
//============================================================================

#include <iostream>

#include <numeric>

#include <boost/array.hpp>
#include <boost/progress.hpp>

using namespace std;

typedef boost::array<double, 3> array_type;

array_type::value_type sum_arr_copy(array_type a)
{
    return (a[0]+=a[1])+=a[2];
}

array_type::value_type sum_carr(array_type const& a)
{
    return a[0]+a[1]+a[2];
}

array_type::value_type sum_carr_temp_result(array_type const& a)
{
    double result=a[0];
    return (result+=a[1])+=a[2];
}

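array_type::value_type sum_carr_accumulate(array_type const& a)
{
    // NOTE: the initial value 0 is an int, so the sum is accumulated in int.
    return std::accumulate(a.begin(), a.end(), 0);
}

// NOTE: despite its name, this takes the array by const reference,
// so it is identical to sum_carr_accumulate.
array_type::value_type sum_arr_copy_accumulate(array_type const& a)
{
    return std::accumulate(a.begin(), a.end(), 0);
}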

double sum_doubles_copy(double d1, double d2, double d3)
{
    return d1+d2+d3;
}

double sum_doubles_copy_optimized(double d1, double d2, double d3)
{
    return (d1+=d2)+=d3;
}

double sum_doubles_copy_temp_result(double d1, double d2, double d3)
{
    double result=d1;
    return (result+=d2)+=d3;
}

double sum_doubles_ref_temp_result(double const& d1, double const& d2,
                                   double const& d3)
{
    double result=d1;
    return (result+=d2)+=d3;
}

/// sorry for macro code. I was too lazy to copy paste it...
#define ARR_P() a
#define DBL_P() d1,d2,d3

#define DO_TEST(function, param_seq) \
{ \
    cout << "--------------------------\n" \
         << #function"\n"; \
    double result=0; \
    { \
        boost::progress_timer t; \
        for(unsigned long i=0; i<times; ++i) \
            result+=function(param_seq()); \
    } \
    cout << "result: " << result << "\n"; \
}

int main() {

    // ~0 converts to ULONG_MAX: 4294967295 where unsigned long is 32 bits.
    const unsigned long times=~0;
    array_type a = {1.0, 2.3, 3.33};
    double d1 = 1.0, d2 = 2.3, d3 = 3.33;

    DO_TEST(sum_arr_copy, ARR_P);
    DO_TEST(sum_carr, ARR_P);
    DO_TEST(sum_arr_copy_accumulate, ARR_P);
    DO_TEST(sum_carr_temp_result, ARR_P);
    DO_TEST(sum_carr_accumulate, ARR_P);
    DO_TEST(sum_doubles_copy, DBL_P);
    DO_TEST(sum_doubles_copy_optimized, DBL_P);
    DO_TEST(sum_doubles_copy_temp_result, DBL_P);
    DO_TEST(sum_doubles_ref_temp_result, DBL_P);

    cout << "tests done" << endl;

    return 0;
}

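Please note that accumulate produced a wrong result: the initial value 0 is
an int, so each partial sum is truncated to int and every call returns 6
instead of 6.63.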
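
A minimal sketch of the accumulate issue, in case it is useful. It uses a
plain C array instead of boost::array to stay self-contained, and the two
helper names are hypothetical (they are not part of the test app above).
std::accumulate takes its result type from the initial value, so the only
difference between the two functions is 0 versus 0.0:

```cpp
#include <numeric>

// Hypothetical helper, not part of the test app: the int literal 0 makes
// std::accumulate keep its running sum in an int, so every partial sum
// is truncated (0+1.0 -> 1, 1+2.3 -> 3, 3+3.33 -> 6).
double sum_with_int_init(const double (&v)[3])
{
    return std::accumulate(v, v + 3, 0);
}

// Hypothetical helper, not part of the test app: the double literal 0.0
// keeps the running sum in a double, so nothing is truncated.
double sum_with_double_init(const double (&v)[3])
{
    return std::accumulate(v, v + 3, 0.0);
}
```

With the values from the test app (1.0, 2.3, 3.33) the first function
returns 6 and the second returns 6.63, which matches the difference between
2.57698e+10 and 2.84756e+10 over 4294967295 iterations.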

Resulting timings after running each test 4294967295 times:

--------------------------
sum_arr_copy
65.19 s

result: 2.84756e+10
--------------------------
sum_carr
14.99 s

result: 2.84756e+10
--------------------------
sum_arr_copy_accumulate
59.97 s

result: 2.57698e+10
--------------------------
sum_carr_temp_result
14.97 s

result: 2.84756e+10
--------------------------
sum_carr_accumulate
59.84 s

result: 2.57698e+10
--------------------------
sum_doubles_copy
15.00 s

result: 2.84756e+10
--------------------------
sum_doubles_copy_optimized
14.98 s

result: 2.84756e+10
--------------------------
sum_doubles_copy_temp_result
14.92 s

result: 2.84756e+10
--------------------------
sum_doubles_ref_temp_result
14.93 s

result: 2.84756e+10
tests done

I hope this helps and saves you from developing a solution that might not
be worth the effort.

Good Luck!
Ovanes

On Jan 28, 2008 10:39 PM, Steven Watanabe <watanabesj_at_[hidden]> wrote:

> AMDG
>
> Hicham Mouline wrote:
> > After some thought, here is more precisely what I'd like to have...
> > I apologize that it is quite different from the initial problem:
> >
> > template<int n> class Tree {
> > static double sum(); //
> > };
> >
> >
> > If the user instantiates Tree<2>, he should get:
> >
> > template<> class Tree<2> {
> > static double sum(double d1, double d2);
> > };
> >
> > template<> class Tree<3> {
> > static double sum(double d1, double d2, double d3);
> > };
> > etc etc...
> >
> >
> > so that in user code, e.g.:
> >
> > double d= Tree<4>::sum(d1, d2, d3, d4);
> >
> > should compile.
> >
> >
> > Is it possible for me to just define the template Tree for the n-case
> > without the 2- and 3- specializations?
> >
>
> Ah. You still need the preprocessor, but you can rearrange the
> definitions slightly.
> (untested)
>
> template<int N>
> struct TreeSumImpl;
>
> #define TREE_SUM_DEF(z, n, data)\
> template<>\
> struct TreeSumImpl<n> {\
> static double sum(BOOST_PP_ENUM_PARAMS_Z(z, n, double arg)) { ...
> }\
> };
>
> BOOST_PP_REPEAT(20, TREE_SUM_DEF, ~)
>
> template<int N>
> struct Tree : TreeSumImpl<N> {
> // other code
> };
>
> In Christ,
> Steven Watanabe
>
