
Subject: Re: [ublas] Aliasing for += and -=

From: christopher diggins (cdiggins_at_[hidden])
Date: 2005-06-13 08:43:44


----- Original Message -----
From: "Gunter Winkler" <guwi17_at_[hidden]>
To: "ublas mailing list" <ublas_at_[hidden]>
Sent: Sunday, June 12, 2005 2:51 PM
Subject: Re: [ublas] Aliasing for += and -=

> On Sunday, June 12, 2005 at 18:41, christopher diggins wrote:
>> I mentioned this before, and I'll say it again: A straightforward
>> implementation of += and -= is not affected by aliasing. This is
>> trivial to fix. I find it unacceptable that the library can not do
>> such a simple task correctly.
>
> Not for all types: For example
>
> r1 = project( x, range(0,4) )
> r2 = project( x, range(2,6) )
>
> r1 += r2 and r2 += r1 have both aliasing and should be evaluated in
> different directions.

So then I stand corrected: with the current design it is not trivial. Still,
does using aliasing-safe assignment as the default provide enough benefit to
justify the performance hit everywhere else? I find that highly improbable.
What I have checked: matrix copies, element access, addition, subtraction,
multiplication, and scalar multiplication. These are what I consider to be
the most important and common operations for any matrix/vector library.
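
To make Gunter's overlapping-range example concrete, here is a minimal
sketch of my own (assuming a vector<double> of size 6; the default uBLAS
assignment stays correct here by evaluating the right-hand side into a
temporary, which is exactly the overhead in question):

#include <iostream>
#include <boost/numeric/ublas/vector.hpp>
#include <boost/numeric/ublas/vector_proxy.hpp>
#include <boost/numeric/ublas/io.hpp>

int main()
{
    using namespace boost::numeric::ublas;

    vector<double> x(6);
    for (unsigned i = 0; i < x.size(); ++i)
        x(i) = i;

    // r1 views x[0..3], r2 views x[2..5]; they overlap on x[2..3].
    vector_range< vector<double> > r1 = project(x, range(0, 4));
    vector_range< vector<double> > r2 = project(x, range(2, 6));

    // A naive element-by-element loop for r1 += r2 would read slots of r2
    // that were already overwritten through r1 (and r2 += r1 would need the
    // opposite traversal order), which is why a temporary-free
    // implementation must choose the evaluation direction case by case.
    r1 += r2;
    std::cout << x << std::endl;
    return 0;
}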

Here are my benchmark results:

boost ublas performance, rows = 3, cols = 3, iters = 100000
benchmarking element_access(m1) 406 msec elapsed
benchmarking matrix_copy(m1, m2) 31 msec elapsed
benchmarking scalar_arithmetic(m1) 266 msec elapsed
benchmarking matrix_arithmetic(m1, m2) 2407 msec elapsed
benchmarking multiply(m1, m2, m3) 234 msec elapsed

boost ublas performance, rows = 100, cols = 100, iters = 100
benchmarking element_access(m1) 390 msec elapsed
benchmarking matrix_copy(m1, m2) 0 msec elapsed
benchmarking scalar_arithmetic(m1) 266 msec elapsed
benchmarking matrix_arithmetic(m1, m2) 796 msec elapsed
benchmarking multiply(m1, m2, m3) 3828 msec elapsed

dynamic matrix performance, rows = 3, cols = 3, iters = 100000
benchmarking matrix_element_access(m1) 109 msec elapsed
benchmarking matrix_copy(m1, m2) 47 msec elapsed
benchmarking scalar_arithmetic(m1) 109 msec elapsed
benchmarking matrix_arithmetic(m1, m2) 93 msec elapsed
benchmarking matrix_multiply(m1, m2, m3) 63 msec elapsed

dynamic matrix performance, rows = 100, cols = 100, iters = 100
benchmarking matrix_element_access(m1) 94 msec elapsed
benchmarking matrix_copy(m1, m2) 31 msec elapsed
benchmarking scalar_arithmetic(m1) 94 msec elapsed
benchmarking matrix_arithmetic(m1, m2) 78 msec elapsed
benchmarking matrix_multiply(m1, m2, m3) 2047 msec elapsed

k-matrix performance, rows = 3, cols = 3, iters = 100000
benchmarking kmatrix_element_access(m1) 63 msec elapsed
benchmarking matrix_copy(m1, m2) 31 msec elapsed
benchmarking scalar_arithmetic(m1) 63 msec elapsed
benchmarking matrix_arithmetic(m1, m2) 46 msec elapsed
benchmarking kmatrix_multiply(m1, m2, m3) 47 msec elapsed

k-matrix performance, rows = 100, cols = 100, iters = 100
benchmarking matrix_element_access(m1) 78 msec elapsed
benchmarking matrix_copy(m1, m2) 0 msec elapsed
benchmarking scalar_arithmetic(m1) 78 msec elapsed
benchmarking matrix_arithmetic(m1, m2) 47 msec elapsed
benchmarking matrix_multiply(m1, m2, m3) 922 msec elapsed

k-matrix is a matrix implementation where the numbers of rows and columns
are known at compile-time. Multiplication was done using axpy_prod.
Everything else was done naively, just as we would expect a library user to
write it:

i.e.:

matrix<int> m(rows, cols); // a dynamically sized uBLAS matrix
m *= 2;
m += m;
m(i, j) = 0;               // element access via operator(), not [i][j]
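
The multiply benchmark was the one non-naive case; it used axpy_prod,
roughly like this (a sketch of my own, the function name and signature are
illustrative and only the axpy_prod call reflects what was benchmarked):

#include <boost/numeric/ublas/matrix.hpp>
#include <boost/numeric/ublas/operation.hpp> // axpy_prod

using namespace boost::numeric::ublas;

// m3 is assumed to already be sized m1.size1() x m2.size2().
void multiply(const matrix<int>& m1, const matrix<int>& m2, matrix<int>& m3)
{
    // Computes m3 = m1 * m2 in place; passing 'true' clears m3 first and
    // avoids the temporary that m3 = prod(m1, m2) would construct.
    axpy_prod(m1, m2, m3, true);
}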

I can already hear the chorus of screams: "but you didn't use noalias!" In
real code, whether something is aliased or not is not trivially known, so it
is not fair to use "noalias" in a straightforward comparison. Furthermore, I
am comparing "out-of-the-box" performance for naive users (the most common
kind).
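
For comparison, the "expert" form would look roughly like this (again a
sketch with names of my own; noalias is only legal when the programmer can
guarantee the target shares no storage with the operands):

#include <boost/numeric/ublas/matrix.hpp>

using namespace boost::numeric::ublas;

void expert_version(const matrix<int>& a, matrix<int>& b, matrix<int>& c)
{
    noalias(b) += a;         // skips the aliasing-safe temporary
    noalias(c) = prod(a, b); // caller must guarantee c is neither a nor b
}

In generic code the caller often cannot make that guarantee, which is the
point of the comparison above.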

Christopher Diggins
http://www.cdiggins.com