Boost logo

Ublas :

Subject: Re: [ublas] I have a problem with expression templates
From: Michael Lehn (michael.lehn_at_[hidden])
Date: 2012-09-26 17:37:26


In my opinion the fact that no temporary matrix gets created implicitly/automatically
is a feature! In numerical applications you want to be in control over memory resources.
Just consider some extreme cases where one chooses matrix dimensions so big that they just
fit into memory. Even in cases not that extreme, consider that the nested matrix-matrix
product is inside a loop. In this case I don't wanna be forced to rely on a things like
memory pools or some other smart memory management. If the creation of temporaries is
acceptable and wanted then creating them explicitly is a transparent solution.



Am 26.09.2012 um 22:13 schrieb Oswin Krause:

> Hi,
> This issue is actually bugging me a lot.
> While i can understand that one wants to prevent unneccessary copying, i don't understand why this is applied to matrix-matrix multiplication. The issue arises quite often and is in fact a FAQ.
> I think, that the better default behavior should be that prod() should always return an intermediate result for matrix-matrix multiplication, thus making this point obsolete. Given a high enough data dimension, the unneccessary copy does not actually matter and since C++11 move semantics helps us in most cases. If the copy really matters, there is always axpy_prod.
> A change in this direction would also help for another issue:
> actually prod() is the wrong choice of function in the first place, because axpy_prod (the one with the worse syntax) is a lot faster given problems of higher dimensionality (like 20 or so). Which is totally counter intuitive. So whil the copy dos not matter most of the time, it also nables us to use the optimized kernels of axpy_prod (and allows to remove a lot of code...)
> So I suggest a change along the following line (temporary_matrix_type is some scary type traits I don't know by heart right now):
> template<class E1, class E2>
> temporary_matrix_type prod(blas::matrix_expression<E1> const& a, blas::matrix_expression<E2> const& b){
> temporary_matrix_type result;
> axpy_prod(a,b,result);
> return result;
> }
> There is a small disadvantage in edge cases like:
> Matrix B(100,100);
> Matrix C(100,100);
> Matrix A(3,3);
> A+=subrange(prod(B,C),0,3,0,3); //most of prods work is in vain and we have a unneccessary intermediate result.
> however, in almost all cases this change should do what most people actually need.
> Greetings,
> Oswin
> On 2012-09-26 21:53, Gunter Winkler wrote:
>> Hello,
>> Am 26.09.2012 19:59, schrieb Oswin Krause:
>>> Hi,
>>> it is a bad idea to nest calls to prod(), so ublas prevents it. Your
>>> workaround with the intermediate result is fine.
>>> Also, try ublas::prod(trans(A), ublas::prod<ublas::matrix<double> >(B,
>>> A))
>> Let me add some additional explanation:
>> The expression prod(A, B) does not mean "compute the product of A and
>> B". It means "remember to multiply A and B on demand". The actual
>> computation is deferred until next assignment.
>> What does that mean for a nested prod()?
>> prod(A, prod(C,D)) means that during computation of the final result
>> each element of B is computed (repeatedly on the fly) from the
>> expression prod(C,D). The "performance" would be a bad surprise. Thus
>> uBLAS explicitly forbids nested product expressions.
>> The solution is to use a temporary (as George said) or to define the
>> result type of prod as a real matrix instead of the default
>> matrix_expression (as Oswin suggested).
>> mfg
>> Gunter
>> _______________________________________________
>> ublas mailing list
>> ublas_at_[hidden]
>> Sent to: Oswin.Krause_at_[hidden]
> _______________________________________________
> ublas mailing list
> ublas_at_[hidden]
> Sent to: michael.lehn_at_[hidden]