# Ublas :

Subject: Re: [ublas] Efficiency of product of two sparse matrices
From: George Slavov (gpslavov_at_[hidden])
Date: 2012-12-06 16:46:59

Hi JÃ¶rn,

Thanks for your response. I have made some surprising discoveries since I
read your email. I looked up your code and tried it out but got no speed
boost in my case which is (row major) * (col major) = (col major). By the
way, I get a printout ucucuc. Shouldn't there be a case urucuc? I didn't
see such a case handled in your code.

The next thing I tried is to produce a copy of trans(A) and place it in a
col major compressed matrix, so the product is actually (col major) * (col
major) = (col major) and this is lightning fast! We're talking a fraction
of a second whereas sparse_prod is about half a second, significantly
slower. The majority of time is now spent constructing the transpose matrix
but that's still roughly a third of the computation time of the old
product. Now I'm trying to figure out if there is a way to construct the
transpose in col major format faster, but I have no ideas yet.

I'm really baffled why row major * col major is a bad combination. If I
have A*B = C, each entry of C is computed as the inner product of a row of
A and a column of B, right? That would be the order in which elements are
accessed, so I don't understand. I am really curious about this so any
thoughts you can give me would be appreciated.

Best,
George

On Thu, Dec 6, 2012 at 4:47 AM, "Ungermann, JÃ¶rn" <j.ungermann_at_[hidden]
> wrote:

> Hi George,
>
> this is not a good storage layout for the matrix product.
> There are two major costs involved:
> 1) finding the right entries in A^T and A to multiply with one another.
> 2) writing the stuff into the target matrix. If this cannot be done
> consecutively, performance will seriously suffer from copy operations.
>
> My first tip would be to use a coordinate_matrix as result matrix and only
> copy the result to a compressed matrix after the product is done.
> Secondly, copying the transpose matrix into a column_major matrix might
> help, too, but this depends more on the employed algorithm, and I forgot,
> which kernel is used by sparse_prod.
>
> If these tips are not sufficient or you cannot spend the additional memory,
> you might want to search the mailing list for a patch proposal of mine that
> offers a series of different product kernels suited for a wide range of
> matrix types. All of these are at least as efficient as the ublas kernels
> and some improve the performance dramatically. The mail contains a list of
> measurements for products involving various types.
>
> Either way, you might think about just not doing the matrix product. If you
> can avoid it (and one mostly can), this would be the best solution.
>
> If you have further questions, you may contact me.
>
> Cheers,
> JÃ¶rn
>
>
>
> > -----Original Message-----
> > From: ublas-bounces_at_[hidden] [mailto:ublas-
> > bounces_at_[hidden]] On Behalf Of George Slavov
> > Sent: Dienstag, 4. Dezember 2012 22:03
> > To: ublas_at_[hidden]
> > Subject: [ublas] Efficiency of product of two sparse matrices
> >
> > In my project I need to form the product A^T A where A is a
> > compressed_matrix, but it's proving to be a big bottleneck in my code,
> > taking on the order of 10 seconds. The result is also very sparse. The
> > matrix has a size of about 13000x13000 and has about 37000 nonzero
> > entries. The product of sparse matrices is generally not sparse, but my
> > matrix is structured enough that the product has roughly the same
> > sparsity as A.
> >
> > The code I'm using is basically.
> >
> > typedef ublas::compressed_matrix<T, ublas::column_major, 0,
> > ublas::unbounded_array<unsigned int>, ublas::unbounded_array<T> >
> > sparse_matrix;
> >
> > sparse_matrix A(13000, 13000)
> > sparse_matrix result(13000, 13000)
> > // fill A
> > sparse_prod(trans(A), A, result, false)
> >
> >
> > Are there any conditions on the use of sparse_prod which I'm not aware
> > of with regard to matrix orientations? My A is column major so the
> > transpose is row_major. I would have thought that would be the optimal
> > storage to form the product since sums range over rows of A^T and
> > columns of A.
> >
> > Regards,
> > George
> > _______________________________________________
> > ublas mailing list
> > ublas_at_[hidden]

> > Sent to: j.ungermann_at_[hidden]
>
> _______________________________________________
> ublas mailing list
> ublas_at_[hidden]

> Sent to: gpslavov_at_[hidden]
>