
Ublas : 
From: Michael Stevens (mail_at_[hidden])
Date: 20050923 04:23:27
On Donnerstag 22 September 2005 22:56, Frank Astier wrote:
> Actually, I've now turned to ATLAS, and I want to see if I can get
> better performance there. C or C++, about the same deal for me, I can
> wrap C ATLAS in C++ classes easily, I think.
>
> To reproduce your results, do I need to worry about
> BOOST_UBLAS_CHECK? I think I tried axpy_prod and dismissed it (too
> quickly), as on a par with prod for my specific problem. But I'd like
> to try again with your settings.
You should fix BOOST_UBLAS_CHECK if you use 1.33.0.
I have attached my test code. It uses dense uBLAS vectors which makes more
sense. The results I posted yesterday were for the GENERIC 'axpy_prod'
function in uBLAS. My test code still had some code I used to find the
problem with BOOST_UBLAS_CHECK. This prevented the fast 'axpy_prod'
specialisation for compressed_matrix being used.
Corrected results from gcc4.01 are:
0 0 0.55
0.1 0.03 0.42
0.2 0.05 0.42
0.3 0.07 0.43
0.4 0.09 0.42
0.5 0.11 0.43
0.6 0.15 0.42
0.7 0.19 0.42
0.8 0.25 0.42
0.9 0.32 0.43
1 0.37 0.42
These results are rather nice!! The uBLAS axpy_prod is always faster then the
naive dense implementation using std::vector, even if there are no zeros. The
compressed format actually allows a very efficient axpy_prod to be
implemented.
It is worth noting the actual implementation of axpy_prod uBLAS uses in this
case. In can work directly using the internal structure of the compressed
matrix thus:
template<class V, class T1, class IA1, class TA1, class E2>
BOOST_UBLAS_INLINE
V &
axpy_prod (const compressed_matrix<T1, row_major, 0, IA1, TA1> &e1,
const vector_expression<E2> &e2,
V &v, row_major_tag) {
typedef typename V::size_type size_type;
typedef typename V::value_type value_type;
for (size_type i = 0; i < e1.filled1 () 1; ++ i) {
size_type begin = e1.index1_data () [i];
size_type end = e1.index1_data () [i + 1];
value_type t (v (i));
for (size_type j = begin; j < end; ++ j)
t += e1.value_data () [j] * e2 () (e1.index2_data () [j]);
v (i) = t;
}
return v;
}
This is what make it much more efficient then prod(). There are similar
implemenations for coordinate matices.
Michael
 ___________________________________ Michael Stevens Systems Engineering 34128 Kassel, Germany Phone/Fax: +49 561 5218038 Navigation Systems, Estimation and Bayesian Filtering http://bayesclasses.sf.net ___________________________________