Subject: [ublas] Matrix multiplication performance
From: palik imre (imre_palik_at_[hidden])
Date: 2016-01-23 12:53:18

Hi All,
What's next? I mean, what is the development process for uBLAS?
Now we have a C-like implementation that outperforms both the mainline and the branch version (axpy_prod). What will we do with that?
As far as I can see, we have the following options:
1) Create a C++ template magic implementation out of it. But for this, we would at least need compile-time access to the target instruction set. Any idea how to do that?
2) Create a compiled library implementation out of it, and choose the implementation at run time based on the CPU capabilities.
3) Include some good defaults/defines, and hope the user will use them.
4) Don't include it, and do something completely different.

What do you think?