Hi—
So far the code below has not been able to beat our Fortran 77 benchmark program on a quad-core i7. I suspect the Boost experts on this list can offer some advice on how best to use the library for this specific scenario (data described below) after a quick review of the code extracted below.
While debugging this, I get unexpected results from M.filled1() and M.filled2() when compared to M.size1() and M.size2(). I expected filled / size * 100 to give the percentage filled, but filled2 > size2. v.filled and v.size behave as I expected.
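For reference, here is a small sketch of why a "filled" count can exceed a dimension in compressed row storage. My understanding, which should be checked against the uBLAS headers, is that for a row-major compressed_matrix filled1() counts the row-offset entries in use (up to size1() + 1) and filled2() counts the stored values, the same number nnz() reports. If that is right, the fill percentage is filled2 / (size1 * size2) * 100. With n = 250 * 250 = 62,500 and about 7 diagonals, that is roughly 437,500 stored values, which is more than size2 but only about 0.011% of the n * n positions. The CsrLayout type and both functions below are hypothetical stand-ins written in plain C++, not uBLAS classes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for compressed row storage (CSR); NOT the uBLAS class.
struct CsrLayout {
    std::size_t rows;
    std::size_t cols;
    std::vector<std::size_t> row_ptr;  // rows + 1 offsets into col_idx
    std::vector<std::size_t> col_idx;  // one column index per stored value
};

// Build the index arrays of an n x n tridiagonal matrix (values omitted).
CsrLayout make_tridiagonal(std::size_t n) {
    CsrLayout m;
    m.rows = n;
    m.cols = n;
    m.row_ptr.push_back(0);
    for (std::size_t r = 0; r < n; ++r) {
        if (r > 0) m.col_idx.push_back(r - 1);      // sub-diagonal
        m.col_idx.push_back(r);                     // main diagonal
        if (r + 1 < n) m.col_idx.push_back(r + 1);  // super-diagonal
        m.row_ptr.push_back(m.col_idx.size());      // end of row r
    }
    return m;
}

// Percentage of the rows * cols positions that hold a stored value.
double fill_percent(const CsrLayout& m) {
    return 100.0 * static_cast<double>(m.col_idx.size())
         / (static_cast<double>(m.rows) * static_cast<double>(m.cols));
}
```

In this layout the offset array always has rows + 1 entries and the number of stored values is bounded by rows * cols, not by cols, so a count larger than the column dimension is not by itself a sign of corruption.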
When I switch compressed_vector to vector, the code crashes. I had to remove the debug print for v.filled since vector does not implement it. I assume here filled = size.
When I switch to column major form (code commented out below) it takes longer. I suspect that is no surprise.
I’m sure it is time to go from basic boost to advanced. Thanks! We’re using version 1.54.
Extra credit: I’m sure there is an optimized way to do the += and the copy too?
I tried noalias(v2) = prod(M,v1) but it crashes with the example below.
I tried writing my own product function using iterators but it crashes too.
Boost FOREACH looks interesting, but I can’t find an example for a matrix, and why rewrite prod if we don’t have to?
I think Plan A (getting Boost's prod to run faster) is better than Plan B (replacing prod with a hand-written product), but I would welcome any suggestions on what I could do better. Thanks!
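To make Plan B concrete, here is a minimal sketch of a hand-written sparse matrix times dense vector product over compressed row storage. It is plain C++ and does not touch uBLAS; the CsrMatrix struct and csr_multiply function are hypothetical names for illustration, and the sketch assumes the input vector is dense, which matches the "ends filled" behaviour of v1 described below.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical compressed row storage (CSR) container; NOT a uBLAS type.
struct CsrMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<std::size_t> row_ptr;  // rows + 1 offsets into col_idx/values
    std::vector<std::size_t> col_idx;  // column index of each stored value
    std::vector<float> values;         // stored values, row by row
};

// Compute y = a * x. x and y must be different objects (no aliasing),
// because y is cleared before x is read.
void csr_multiply(const CsrMatrix& a,
                  const std::vector<float>& x,
                  std::vector<float>& y) {
    assert(x.size() == a.cols);
    assert(a.row_ptr.size() == a.rows + 1);
    y.assign(a.rows, 0.0f);
    for (std::size_t r = 0; r < a.rows; ++r) {
        float sum = 0.0f;
        // Stored values of row r are contiguous, so this loop reads
        // values and col_idx sequentially.
        for (std::size_t k = a.row_ptr[r]; k < a.row_ptr[r + 1]; ++k) {
            sum += a.values[k] * x[a.col_idx[k]];
        }
        y[r] = sum;
    }
}
```

Each output element is written exactly once and the matrix arrays are read in order, which is the access pattern a row-major layout favours. Whether this beats prod or the Fortran code is something only a measurement can tell.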
Visual Studio C++ 2012 compiler settings for Win32 build on Windows 7 64-bit:
/Yu"StdAfx.h" /MP /GS /GL /analyze- /W3 /Gy- /Zc:wchar_t /Zi /Gm /O2 /Ob2 /Fd"Release\vc100.pdb" /fp:fast /D "WIN32" /D "NDEBUG" /D "_WINDOWS" /D "_USRDLL" /D "TEST_EXPORTS" /D "_WINDLL" /D "_UNICODE" /D "UNICODE" /errorReport:prompt /WX- /Zc:forScope /Gd /Oy- /Oi /MT /Fa"Release\" /EHsc /nologo /Fo"Release\" /Ot /Fp"Release\Test.pch"
#include "stdafx.h" // copied all boost includes into this pre-compiled header file
#include <boost/numeric/ublas/matrix.hpp>
#include <boost/numeric/ublas/matrix_sparse.hpp>
#include <boost/numeric/ublas/matrix_expression.hpp>
#include <boost/numeric/ublas/matrix_proxy.hpp>
#include <boost/numeric/ublas/vector.hpp>
#include <boost/numeric/ublas/vector_sparse.hpp>
#include <boost/numeric/ublas/vector_expression.hpp>
#include <boost/numeric/ublas/vector_proxy.hpp>
#include <boost/numeric/ublas/lu.hpp>
#include <boost/numeric/ublas/io.hpp>
// make release build faster (only takes effect if defined before the first uBLAS header is included, e.g. inside stdafx.h)
#ifdef NDEBUG
#define BOOST_UBLAS_NDEBUG
#endif
struct VECTOR
{
    std::vector<boost::numeric::ublas::compressed_vector<float>> vectors;
};
std::vector<boost::numeric::ublas::compressed_matrix<float>> theMatrix;
//std::vector<boost::numeric::ublas::compressed_matrix<float, boost::numeric::ublas::column_major>> theMatrix;
std::vector<VECTOR> v1;
std::vector<VECTOR> v2;
// ...
ilen = 22;
jlen = 2;
xlen = 250;
ylen = 250;
n = xlen * ylen; // size = 62,500
theMatrix.reserve(ilen);
v1.reserve(ilen);
v2.reserve(ilen);
// ...
v1[i].vectors.reserve(jlen);
v2[i].vectors.reserve(jlen);
// ...
theMatrix[i].resize(n,n); // sparse ~7 diagonals filled
v1[i].vectors[j].resize(n); // each vector has n elements; starts sparse, ends filled
v2[i].vectors[j].resize(n);
// ...
theMatrix[i](j,k) = x; // load matrix once
// ...
v2[i].vectors[j] = prod(theMatrix[i], v1[i].vectors[j]);
// ...
v2[i].vectors[j](k) += datapoint; // add more to vector each time
// ...
// copy values back to save for next time step...
for (int i = 0; i < ilen; i++)
{
    for (int j = 0; j < jlen; j++)
    {
        v1[i].vectors[j] = v2[i].vectors[j];
    }
}
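On the copy-back step, one option worth measuring is to swap the two containers instead of copying element by element. The sketch below uses plain std::vector as a hypothetical stand-in for v1 and v2 (the State typedef and advance_time_step are illustration names, not part of the code above). It assumes every element of the "next" state is fully overwritten, as the prod assignment does, before it is read again.

```cpp
#include <vector>

// Hypothetical stand-in for the per-index state (v1 / v2 in the post).
typedef std::vector<std::vector<float> > State;

// Make `next` the input of the following time step without copying elements.
// Afterwards `next` holds the old contents of `current`, which is safe only
// if `next` is completely overwritten before it is read again.
void advance_time_step(State& current, State& next) {
    current.swap(next);  // constant time: exchanges internal buffers
}
```

The swap exchanges the internal buffers of the two outer vectors, so its cost does not depend on ilen, jlen, or n.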