|
Ublas : |
Subject: Re: [ublas] pointer to matrix structure
From: Marcel Rehberg (Marcel.Rehberg_at_[hidden])
Date: 2011-08-08 08:14:34
On Mon, 08 Aug 2011 11:41:31 +0200, Kraus Philipp
<philipp.kraus_at_[hidden]> wrote:
>
> Am 08.08.2011 um 11:20 schrieb Marcel Rehberg:
>
>> Right, this would copy the data. What I meant was to use
>> ublas::<T,ublas::column_major> for your original matrix, right from the
>> beginning.
>>
>> It does not make a difference from the ublas interface point of view.
>> So it wouldn't require any code changes except in the definition of the
>> matrices*. To make things easier you could use a typedef like
>
> I think there is a difference: I read the matrix data often in the
> row-order, a loop over the rows and a inner loop over the columns, so if
> I use a column-major the loops must / should be changed. If I use a
> column-major matrix and iterate first over the rows and with the inner
> loop over the columns, it can be create a poor performance, so I would
> like to have a row-major matrix.
Ah sorry, I wasn't even aware of the performance penalty* and I don't know
a way to avoid it. My approach would be to choose the ordering based on
what is the most time or memory consuming operation. In my case I use
lapack quite allot and so I went with ublas::column_major just to avoid
the copying. Sorry that I can't help more.
regards
Marcel
* I wrote a little test program:
#include <iostream>
#include "boost/numeric/ublas/matrix.hpp"
#include "boost/numeric/ublas/io.hpp"
#include "boost/progress.hpp"
int main() {
namespace ublas = boost::numeric::ublas;
typedef ublas::matrix<double, ublas::column_major> myMatrixColMaj;
typedef ublas::matrix<double> myMatrixRowMaj;
unsigned n_rows=5000;
unsigned n_cols=5000;
unsigned i,j;
myMatrixColMaj M1(n_rows,n_cols);
myMatrixRowMaj M2(n_rows,n_cols);
// fill both matrices before testing, for more reliable timings.
for (j=0; j<n_cols;j++) {
for (i=0; i<n_rows;i++) {
M1(i,j)=i+j+1;
M2(i,j)=i+j+1;
}
}
{
boost::progress_timer t;
for (i=0; i<n_rows;i++)
for (j=0; j<n_cols;j++)
M1(i,j)=i+j+1;
}
{
boost::progress_timer t;
for (j=0; j<n_cols;j++)
for (i=0; i<n_rows;i++)
M1(i,j)=i+j+1;
}
{
boost::progress_timer t;
for (i=0; i<n_rows;i++)
for (j=0; j<n_cols;j++)
M2(i,j)=i+j+1;
}
{
boost::progress_timer t;
for (j=0; j<n_cols;j++)
for (i=0; i<n_rows;i++)
M2(i,j)=i+j+1;
}
return 0;
}
Output on my box is:
0.30 s
0.08 s
0.08 s
0.30 s
So traversing the wrong order nearly takes four times as long as
traversing the right order. (in all cases M1==M2).
Regards
Marcel