
Ublas : 
From: Michael Stevens (mail_at_[hidden])
Date: 2005-06-18 15:55:58
I'll follow up my mail with some more bench1 results, this time compiled
with VC7.1 and executed on the same machine as the results for GCC 4.0.
The results shown are highly revealing. Firstly, everything is much slower.
The C array results are often several times slower. Not sure why this should
be; VC7.1 normally does not optimise that badly.
Secondly, there is usually a much larger cost for using uBLAS. Maybe this is
what Christopher is seeing?
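For anyone who wants to check the C array versus container gap independently of bench1, here is a minimal sketch of that kind of comparison for inner_prod. It is not the bench1 source: it uses std::vector and std::inner_product as a stand-in for the uBLAS types so that it builds without Boost, and the names inner_prod_c, inner_prod_vec and time_runs are hypothetical helpers made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>
#include <numeric>
#include <vector>

// Hand-written evaluation loop over raw C arrays.
double inner_prod_c(const double* a, const double* b, std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}

// The same computation through a container abstraction.
// std::vector is only a stand-in here; it is not a uBLAS type.
double inner_prod_vec(const std::vector<double>& a,
                      const std::vector<double>& b) {
    assert(a.size() == b.size());
    return std::inner_product(a.begin(), a.end(), b.begin(), 0.0);
}

// Hypothetical timing helper: calls f() `runs` times and returns the
// CPU time used in seconds. The volatile sink keeps the optimiser from
// discarding the calls as dead code.
template <class F>
double time_runs(std::size_t runs, F f) {
    volatile double sink = 0.0;
    const std::clock_t t0 = std::clock();
    for (std::size_t r = 0; r < runs; ++r)
        sink = sink + f();
    const std::clock_t t1 = std::clock();
    return static_cast<double>(t1 - t0) / CLOCKS_PER_SEC;
}
```

To use it, fill two arrays and two vectors of size 100 with the same values, pass each inner product to time_runs with the same run count, and compare the two elapsed times under the compiler options of interest.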
The code was compiled with the Boost.Build v2 default release options for
MSVC, which are:
/TP /Ob2 /GR /ML /EHs
Does anyone have any idea what is happening with VC7.1? I might try a few
other optimiser options such as /Oa and /O2.
Michael
DOUBLE, 100
bench_1
inner_prod
C array
elapsed: 2.04 s, 279.09 Mflops
c_vector
elapsed: 2.527 s, 225.304 Mflops
vector<unbounded_array>
elapsed: 3.603 s, 158.019 Mflops
vector + vector
C array
elapsed: 2.592 s, 220.758 Mflops
c_vector safe
elapsed: 8.237 s, 69.4676 Mflops
c_vector fast
elapsed: 5.186 s, 110.336 Mflops
vector<unbounded_array> safe
elapsed: 11.774 s, 48.599 Mflops
vector<unbounded_array> fast
elapsed: 7.608 s, 75.2109 Mflops
bench_2
outer_prod
C array
elapsed: 1.988 s, 287.829 Mflops
c_matrix, c_vector safe
elapsed: 8.637 s, 66.2504 Mflops
c_matrix, c_vector fast
elapsed: 5.446 s, 105.069 Mflops
matrix<unbounded_array>, vector<unbounded_array> safe
elapsed: 11.523 s, 49.6576 Mflops
matrix<unbounded_array>, vector<unbounded_array> fast
elapsed: 8.211 s, 69.6876 Mflops
prod (matrix, vector)
C array
elapsed: 2.8 s, 203.337 Mflops
c_matrix, c_vector safe
elapsed: 3.872 s, 147.041 Mflops
c_matrix, c_vector fast
elapsed: 3.96 s, 143.774 Mflops
matrix<unbounded_array>, vector<unbounded_array> safe
elapsed: 5.826 s, 97.7246 Mflops
matrix<unbounded_array>, vector<unbounded_array> fast
elapsed: 5.251 s, 108.426 Mflops
matrix + matrix
C array
elapsed: 2.958 s, 193.443 Mflops
c_matrix safe
elapsed: 13.766 s, 41.5665 Mflops
c_matrix fast
elapsed: 9.312 s, 61.4481 Mflops
matrix<unbounded_array> safe
elapsed: 19.909 s, 28.741 Mflops
matrix<unbounded_array> fast
elapsed: 15.467 s, 36.9952 Mflops
bench_3
prod (matrix, matrix)
C array
elapsed: 3.421 s, 166.426 Mflops
c_matrix safe
elapsed: 4.033 s, 141.171 Mflops
c_matrix fast
elapsed: 3.471 s, 164.029 Mflops
matrix<unbounded_array> safe
elapsed: 8.64 s, 65.8962 Mflops
matrix<unbounded_array> fast
elapsed: 7.606 s, 74.8545 Mflops
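For comparing these numbers with other runs, it helps to see how the Mflops column relates to the elapsed time. The sketch below is an inference from the posted figures, not a statement about the bench1 source: the inner_prod lines are consistent with a total of 5.97e8 floating point operations divided by elapsed seconds and by 2^20. Splitting that total into 3,000,000 runs of 199 operations (100 multiplies plus 99 adds) is an assumption, and the function name mflops is hypothetical.

```cpp
#include <cassert>

// Assumed rate formula, inferred from the posted figures:
// total operations / (2^20 * elapsed seconds).
double mflops(double runs, double flops_per_run, double elapsed_s) {
    assert(elapsed_s > 0.0);
    return runs * flops_per_run / (1024.0 * 1024.0 * elapsed_s);
}

// Assumed operation count for an inner product of size n:
// n multiplies plus (n - 1) adds.
double inner_prod_flops(double n) {
    return 2.0 * n - 1.0;
}

// Example: the "C array" inner_prod line (elapsed: 2.04 s).
// mflops(3.0e6, inner_prod_flops(100.0), 2.04) is about 279.09
```

Under the same assumptions the c_vector inner_prod time of 2.527 s gives about 225.30, which matches the posted value.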
> Not sure what to say at this point. I'm trying to see why we are seeing
> divergent results. You are not being helpful.
>
> Normally uBLAS achieves results comparable to explicitly written evaluation
> loops on C arrays. You are not seeing this, which as I say is rather odd.
>
> Bench1 is a useful standard test where we could compare results. I posted
> the results so we can compare.
>
> Michael
___________________________________
Michael Stevens Systems Engineering
34128 Kassel, Germany
Phone/Fax: +49 561 5218038
Navigation Systems, Estimation and Bayesian Filtering
http://bayesclasses.sf.net
___________________________________