Subject: Re: [ublas] Efficiency of product of two sparse matrices
From: MT Julianto (mtjulianto_at_[hidden])
Date: 2012-12-08 02:05:53
Have you checked out this:
I guess your case was more or less mentioned there.
On 6 December 2012 22:46, George Slavov <gpslavov_at_[hidden]> wrote:
> Hi Jörn,
> Thanks for your response. I have made some surprising discoveries since I
> read your email. I looked up your code and tried it out but got no speed
> boost in my case which is (row major) * (col major) = (col major). By the
> way, I get a printout ucucuc. Shouldn't there be a case urucuc? I didn't
> see such a case handled in your code.
> The next thing I tried is to produce a copy of trans(A) and place it in a
> col major compressed matrix, so the product is actually (col major) * (col
> major) = (col major) and this is lightning fast! We're talking a fraction
> of a second whereas sparse_prod is about half a second, significantly
> slower. The majority of time is now spent constructing the transpose matrix
> but that's still roughly a third of the computation time of the old
> product. Now I'm trying to figure out if there is a way to construct the
> transpose in col major format faster, but I have no ideas yet.
> I'm really baffled why row major * col major is a bad combination. If I
> have A*B = C, each entry of C is computed as the inner product of a row of
> A and a column of B, right? That would be the order in which elements are
> accessed, so I don't understand. I am really curious about this so any
> thoughts you can give me would be appreciated.
> On Thu, Dec 6, 2012 at 4:47 AM, "Ungermann, Jörn" <
> j.ungermann_at_[hidden]> wrote:
>> Hi George,
>> this is not a good storage layout for the matrix product.
>> There are two major costs involved:
>> 1) finding the right entries in A^T and A to multiply with one another.
>> 2) writing the stuff into the target matrix. If this cannot be done
>> consecutively, performance will seriously suffer from copy operations.
>> My first tip would be to use a coordinate_matrix as result matrix and only
>> copy the result to a compressed matrix after the product is done.
>> Secondly, copying the transpose matrix into a column_major matrix might
>> help, too, but this depends more on the employed algorithm, and I forgot,
>> which kernel is used by sparse_prod.
>> If these tips are not sufficient or you cannot spend the additional
>> you might want to search the mailing list for a patch proposal of mine
>> offers a series of different product kernels suited for a wide range of
>> matrix types. All of these are at least as efficient as the ublas kernels
>> and some improve the performance dramatically. The mail contains a list of
>> measurements for products involving various types.
>> Either way, you might think about just not doing the matrix product. If
>> can avoid it (and one mostly can), this would be the best solution.
>> If you have further questions, you may contact me.
>> > -----Original Message-----
>> > From: ublas-bounces_at_[hidden] [mailto:ublas-
>> > bounces_at_[hidden]] On Behalf Of George Slavov
>> > Sent: Dienstag, 4. Dezember 2012 22:03
>> > To: ublas_at_[hidden]
>> > Subject: [ublas] Efficiency of product of two sparse matrices
>> > In my project I need to form the product A^T A where A is a
>> > compressed_matrix, but it's proving to be a big bottleneck in my code,
>> > taking on the order of 10 seconds. The result is also very sparse. The
>> > matrix has a size of about 13000x13000 and has about 37000 nonzero
>> > entries. The product of sparse matrices is generally not sparse, but my
>> > matrix is structured enough that the product has roughly the same
>> > sparsity as A.
>> > The code I'm using is basically.
>> > typedef ublas::compressed_matrix<T, ublas::column_major, 0,
>> > ublas::unbounded_array<unsigned int>, ublas::unbounded_array<T> >
>> > sparse_matrix;
>> > sparse_matrix A(13000, 13000)
>> > sparse_matrix result(13000, 13000)
>> > // fill A
>> > sparse_prod(trans(A), A, result, false)
>> > Are there any conditions on the use of sparse_prod which I'm not aware
>> > of with regard to matrix orientations? My A is column major so the
>> > transpose is row_major. I would have thought that would be the optimal
>> > storage to form the product since sums range over rows of A^T and
>> > columns of A.
>> > Regards,
>> > George