On Thu, Apr 22, 2010 at 10:26 AM, xiss burg <firstname.lastname@example.org> wrote:
I get the same performance with double precision. I have NDEBUG defined, but I don't know about other optimization flags. I didn't profile with any tool, but the bottleneck is exactly at the section of the code where I perform the element-wise summation. I'll try to make a simple sample which performs the same kind of operations to see what I get.
My point wasn't really about float vs double, but rather matrix<float> vs compressed_matrix<float> (or the other types you've tried).
Element access on the ublas matrix<float>, with proper inlining, should have next to zero overhead (all it does is calculate an offset into the storage).
So if you're seeing bad performance even with matrix<float>, then it's definitely not a problem with matrix element access.
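
To illustrate what I mean by "calculate an offset", here is a minimal sketch of a dense row-major matrix. This is not the actual ublas code; DenseMatrix and add_into are made-up names for illustration only, and the row-major layout is an assumption on my part.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dense row-major matrix; NOT the ublas implementation.
template <typename T>
class DenseMatrix {
public:
    DenseMatrix(std::size_t rows, std::size_t cols, T init = T())
        : rows_(rows), cols_(cols), data_(rows * cols, init) {}

    // Element access is one multiply and one add to find the offset
    // into contiguous storage. Once inlined, this is close to free.
    T& operator()(std::size_t i, std::size_t j) { return data_[i * cols_ + j]; }
    const T& operator()(std::size_t i, std::size_t j) const { return data_[i * cols_ + j]; }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<T> data_;
};

// Element-wise summation: dst(i, j) += src(i, j) for every element.
template <typename T>
void add_into(DenseMatrix<T>& dst, const DenseMatrix<T>& src) {
    assert(dst.rows() == src.rows() && dst.cols() == src.cols());
    for (std::size_t i = 0; i < dst.rows(); ++i)
        for (std::size_t j = 0; j < dst.cols(); ++j)
            dst(i, j) += src(i, j);
}
```

If a loop like add_into is already slow with a dense container of this kind, I would look at the compiler flags (optimization level and inlining) before blaming the matrix type.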