|
Boost : |
From: David Abrahams (david.abrahams_at_[hidden])
Date: 2002-07-02 03:17:40
From: "Joerg Walter" <jhr.walter_at_[hidden]>
> I think, I've understood, that you propose to evaluate A * (B * v) via
>
> r.clear ();
> for (unsigned j = 0; j < r.size (); ++ j)
> r.plus_assign (column (A, j) * prod (B, v) (j));
Assuming prod(B, v) does not actually compute all of B*v every time you
invoke it, yes. Well, that's part of it...
Actually I proposed that you measure the difference between that and an
approach which creates the temporary vector which results from prod(B, v)
first:
vector c = prod (B, v);
for (unsigned j = 0; j < r.size (); ++ j)
r[j] = dot(row (A, j), c);
and pick whichever gives the best results, perhaps tuning which one to use
based on r.size(). I do believe there are situations where creating the
temporary will result in better performance due to the minimization of
memory writes.
> May be we even could defer the loop evaluation via some lambda construct.
I don't know what that means.
> I've two problems with this approach: I'm not sure if the results of the
> different evaluation path couldn't surprise a client of ublas.
??? I /really/ don't know what that means! Clients shouldn't be concerned
about the internals of the library, as long as it gives correct results.
Note that both approaches always give the same result on all
implementations I know of.
> And I
> currently don't know, how this optimization generalizes to matrix chain
> multiplication and other expressions.
Either approach is, AFAICT, generalizable to all expressions of the form
matrix-expression * vector-expression
-Dave
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk