Boost logo

Boost Users :

From: John C. Femiani (john.femiani_at_[hidden])
Date: 2008-08-07 14:53:18


joel falcou wrote:
> > But with a index you need two variables, the base pointer + index. And
> > unless your compiler + instruction set allows you to addess a memory
> > location using a base pointer + index then you are going to loose
> much in
> > performance. IMHO a pointer is always faster, it let you write the
> > algorithm in a different way avoiding probably the example you gave
> above.
>
> Did you do the actual performances check ?
> On decent compiler for non-exotic architectures the difference is
> really thin for L1 cache compliant number of elements.
>
> I have nothing against pointer anyway but good data structure like
> Numerical Recipes multi-dimensionnal allocation makes things easier
> with a very small overhead.
> ------------------------------------------------------------------------
>
> _______________________________________________
> Boost-users mailing list
> Boost-users_at_[hidden]
> http://lists.boost.org/mailman/listinfo.cgi/boost-users

We are *comparing* implementations on the same machine, with the same
array, with the same simple access patterns and averaging the running
times over 1000 runs. I even toggled which test runs first or second.
IF the L1 cache misses are an issue, that would slow *both* down, so
multi_array must be EVEN WORSE than the test times showed right?

I reduce the size of the array, to reduce the odds of a cache miss (I hope)
That makes multi_array iterators 20 times slower instead of just 10
times slower!

Also both methods are proceeding through the data in a VERY easy to
predict order. If there are cache misses in this test then there will
probably be cache misses in real code, so it is fair to include their
effect in the comparison.

I think the conclusion is that the pointer based approach has a
significant (up to20 times on my system) performance improvement over
base+index approach (at least as implemented by multi_array).

The command line is: g++ test.cpp -O2 -o test.exe

Results in total ticks for 10000 calls each:

size of array: 14400 bytes
Pointer....: 31
Multi_array: 625
Pointer....: 32
Multi_array: 640
Pointer....: 32
Multi_array: 625
Pointer....: 31
Multi_array: 625
Pointer....: 31
Multi_array: 641
Pointer....: 31
Multi_array: 625
Pointer....: 31
Multi_array: 641
Pointer....: 31
<snip>



#define BOOST_DISABLE_ASSERTS

#include "windows.h"
#include <iostream>
#include "boost\multi_array.hpp"

using namespace std;
using namespace boost;

const size_t DimX = 60;
const size_t DimY = 60;
const size_t DimZ = 1;

volatile int iVal;
template<class It>
void test_helper(It begin, It end){
    while (begin != end)
        iVal = *begin++;
}
void testPointer(multi_array<int,3> &V3) {

    int * pInt = V3.data();
    for (size_t Z=0; Z<DimZ; ++Z)
        for (size_t Y=0; Y<DimY; ++Y)
            test_helper(pInt + (Z*DimY+Y)*DimX, pInt + (Z*DimY+Y)*DimX + DimX);
}

void testMultiArray(multi_array<int,3> &V3) {

    for (size_t Z=0; Z<DimZ; ++Z)
        for (size_t Y=0; Y<DimY; ++Y)
            test_helper(V3[Z][Y].begin(), V3[Z][Y].end());
}
                
int main()
{
    multi_array<int,3> V3(extents[DimZ][DimY][DimX]);

    unsigned long time1, time2;
    cout << "size of array: " << DimZ*DimY*DimX*sizeof(int) << " bytes" << endl;
    for (int k = 0; k < 20; ++k){


        time1 = GetTickCount();
        for (int q = 0; q < 10000; ++q)
            testPointer(V3);
        time2 = GetTickCount();
        cout << "Pointer....: " << time2-time1 << endl;
        
        
        time1 = GetTickCount();
        for (int q = 0; q < 10000; ++q)
            testMultiArray(V3);
        time2 = GetTickCount();
        cout << "Multi_array: " << time2-time1 << endl;
        
       
    }


    return 0;
}


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net