
Subject: [ublas] Announcement of ViennaCL (linear algebra on GPUs)
From: Karl Rupp (rupp_at_[hidden])
Date: 2010-06-07 14:55:08
Dear ublas users,
I am proud to announce the Vienna Computing Library (ViennaCL) located
at http://viennacl.sourceforge.net/
* Why is this useful to ublas users?
First of all, ViennaCL is not intended to be a replacement for ublas; it
should be seen as an extension. ViennaCL provides similar basic types
(vector, matrix, compressed_matrix, coordinate_matrix), but they all
reside on GPUs. Thus, the convenience of ublas is now, at least to some
extent, also available on GPUs.
* What about ublas users without a suitable GPU?
ViennaCL does not need a GPU in order to be useful to ublas users.
It provides generic implementations of three popular iterative solvers:
- Conjugate Gradient (CG)
- Stabilized Biconjugate Gradient (BiCGStab)
- Generalized Minimum Residual (GMRES)
For all three solvers, an incomplete LU preconditioner with threshold
(ILUT) is available. The solvers integrate directly into the existing
interface, as the following code snippet shows:
--- Snippet start ---
// assumed alias: namespace ublas = boost::numeric::ublas;
ublas::compressed_matrix<double> my_matrix(10000, 10000);
ublas::vector<double> my_rhs(10000);

/* fill my_matrix and my_rhs here */

// call conjugate gradient solver with default settings
// (no preconditioner):
ublas::vector<double> my_result =
    viennacl::linalg::solve(my_matrix,
                            my_rhs,
                            viennacl::linalg::cg_tag());

// compute ILUT preconditioner (with default settings):
viennacl::linalg::ilut_precond< ublas::compressed_matrix<double> >
    my_ilut(my_matrix, viennacl::linalg::ilut_tag());

// call conjugate gradient solver with default settings
// (ILUT preconditioner):
my_result = viennacl::linalg::solve(my_matrix,
                                    my_rhs,
                                    viennacl::linalg::cg_tag(),
                                    my_ilut);
--- Snippet end ---
If all occurrences of 'ublas::' in the above snippet are replaced by
'viennacl::', then the solver is run on the GPU, otherwise on the CPU.
This is very handy when switching from GPU-enabled hosts to hosts
without a suitable GPU. However, the usual program flow is likely to be
the following:
- set up the data on the CPU
- copy the data to the GPU and do some fast arithmetic there
- copy the result back to the CPU and proceed as usual
Pushing and pulling data between CPU and GPU is provided by dedicated
copy() functions. For example, given a ublas::compressed_matrix<T>
ublas_matrix and a viennacl::compressed_matrix<T> vcl_matrix, the
transfers in both directions are done via
    copy(ublas_matrix, vcl_matrix); // CPU to GPU
    copy(vcl_matrix, ublas_matrix); // GPU to CPU
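
To make the solver interface from the snippet above more concrete, here
is a minimal sketch of how a tag-dispatched solve() with an optional
preconditioner argument might be structured. It is an illustration only
and not ViennaCL's actual implementation: everything in the namespace
'sketch' (dense_matrix, vec, prod, inner_prod, no_precond,
jacobi_precond, poisson_1d and the cg_tag defaults) is hypothetical, a
small dense matrix stands in for the sparse types, and a simple Jacobi
(diagonal) preconditioner is used in place of ILUT.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

namespace sketch {  // hypothetical names, not part of ViennaCL or ublas

typedef std::vector<double> vec;

// Tag type: selects the solver and carries its settings (assumed defaults).
struct cg_tag {
    double tolerance;
    std::size_t max_iterations;
    explicit cg_tag(double tol = 1e-10, std::size_t max_it = 500)
        : tolerance(tol), max_iterations(max_it) {}
};

// Tiny row-major dense matrix; stands in for any matrix type with prod().
struct dense_matrix {
    std::size_t n;
    std::vector<double> a;
    explicit dense_matrix(std::size_t size) : n(size), a(size * size, 0.0) {}
    double& operator()(std::size_t i, std::size_t j) { return a[i * n + j]; }
    double operator()(std::size_t i, std::size_t j) const { return a[i * n + j]; }
};

inline vec prod(const dense_matrix& A, const vec& x) {
    vec y(A.n, 0.0);
    for (std::size_t i = 0; i < A.n; ++i)
        for (std::size_t j = 0; j < A.n; ++j)
            y[i] += A(i, j) * x[j];
    return y;
}

inline double inner_prod(const vec& x, const vec& y) {
    double s = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) s += x[i] * y[i];
    return s;
}

// Identity preconditioner: leaves the residual unchanged.
struct no_precond {
    void apply(vec&) const {}
};

// Jacobi preconditioner: scales each residual entry by 1 / A(i,i).
struct jacobi_precond {
    vec inv_diag;
    explicit jacobi_precond(const dense_matrix& A) : inv_diag(A.n) {
        for (std::size_t i = 0; i < A.n; ++i) inv_diag[i] = 1.0 / A(i, i);
    }
    void apply(vec& r) const {
        for (std::size_t i = 0; i < r.size(); ++i) r[i] *= inv_diag[i];
    }
};

// Preconditioned CG for symmetric positive definite A, starting from x = 0.
template <typename MatrixT, typename PrecondT>
vec solve(const MatrixT& A, const vec& b, const cg_tag& tag, const PrecondT& M) {
    vec x(b.size(), 0.0);
    vec r = b;                       // residual b - A*x for x = 0
    vec z = r; M.apply(z);           // preconditioned residual
    vec p = z;                       // first search direction
    double rz = inner_prod(r, z);
    const double stop = tag.tolerance * std::sqrt(inner_prod(b, b));
    for (std::size_t it = 0; it < tag.max_iterations; ++it) {
        if (std::sqrt(inner_prod(r, r)) <= stop) break;
        const vec Ap = prod(A, p);
        const double alpha = rz / inner_prod(p, Ap);
        for (std::size_t i = 0; i < x.size(); ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i];
        }
        z = r; M.apply(z);
        const double rz_new = inner_prod(r, z);
        const double beta = rz_new / rz;
        for (std::size_t i = 0; i < p.size(); ++i) p[i] = z[i] + beta * p[i];
        rz = rz_new;
    }
    return x;
}

// Three-argument overload: same solver without a preconditioner.
template <typename MatrixT>
vec solve(const MatrixT& A, const vec& b, const cg_tag& tag) {
    return solve(A, b, tag, no_precond());
}

// Test matrix: 1D Poisson stencil (2 on the diagonal, -1 next to it).
inline dense_matrix poisson_1d(std::size_t n) {
    dense_matrix A(n);
    for (std::size_t i = 0; i < n; ++i) {
        A(i, i) = 2.0;
        if (i > 0) A(i, i - 1) = -1.0;
        if (i + 1 < n) A(i, i + 1) = -1.0;
    }
    return A;
}

}  // namespace sketch
```

In this sketch, sketch::solve(A, b, sketch::cg_tag()) and
sketch::solve(A, b, sketch::cg_tag(), sketch::jacobi_precond(A)) mirror
the two calls in the snippet. Because the solver body only relies on
prod() and inner_prod(), the same code could serve another matrix type
that supplies a matching prod() overload, which is the general idea
behind switching namespaces as described above.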
I would really appreciate it if the ublas community could give me some
feedback on ViennaCL, either here on the ublas mailing list, directly by
email, or by any other means listed on http://viennacl.sourceforge.net/
Best regards,
Karli
PS: I apologize if you consider this email as spam; it is not intended
to be.