Subject: [ublas] Announcement of ViennaCL (linear algebra on GPUs)
From: Karl Rupp (rupp_at_[hidden])
Date: 2010-06-07 14:55:08

Dear ublas users,

I am proud to announce the Vienna Computing Library (ViennaCL) located

* Why is this useful to ublas users?
First of all, ViennaCL is not intended to be a replacement for ublas - it
should be seen as an extension. ViennaCL provides similar basic types
(vector, matrix, compressed_matrix, coordinate_matrix), but they all
reside on GPUs. Thus, the convenience of ublas is now - at least to some
extent - also available on GPUs.
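
To make the sparse types more concrete: a row-major compressed matrix is conventionally stored as three arrays (row start offsets, column indices, values). The following is only a minimal sketch of that layout and of a matrix-vector product on it. CsrMatrix and csr_multiply are hypothetical names chosen for illustration; they are neither the ublas nor the ViennaCL types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for illustration only (not a ublas/ViennaCL type).
// Row i owns the entries k in [row_start[i], row_start[i + 1]).
struct CsrMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<std::size_t> row_start;  // rows + 1 offsets into the arrays below
    std::vector<std::size_t> col_index;  // column of each stored entry
    std::vector<double> values;          // value of each stored entry
};

// Computes y = A * x by walking only the stored (nonzero) entries.
std::vector<double> csr_multiply(const CsrMatrix& a, const std::vector<double>& x) {
    assert(x.size() == a.cols);
    assert(a.row_start.size() == a.rows + 1);
    std::vector<double> y(a.rows, 0.0);
    for (std::size_t i = 0; i < a.rows; ++i) {
        for (std::size_t k = a.row_start[i]; k < a.row_start[i + 1]; ++k) {
            y[i] += a.values[k] * x[a.col_index[k]];
        }
    }
    return y;
}
```

For the 2x3 matrix with rows (1 0 2) and (0 3 0), the arrays would be row_start = {0, 2, 3}, col_index = {0, 2, 1} and values = {1, 2, 3}.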

* What about ublas users without a suitable GPU?
ViennaCL does not need a GPU in order to be useful to ublas
users. ViennaCL provides generic implementations of three popular
iterative solvers:
- Conjugate Gradient (CG)
- Stabilized Bi-conjugate Gradient (BiCGStab)
- Generalized Minimum Residual (GMRES)
For all three solvers, an incomplete LU preconditioner with threshold
(ILUT) is available. The solvers integrate directly into the existing
interface, as the following code snippet shows:

----- Snippet start -----

ublas::compressed_matrix<double> my_matrix(10000,10000);
ublas::vector<double> my_rhs(10000);

/* fill my_matrix and my_rhs here */

// call conjugate gradient solver with default settings:
// (no preconditioner)
ublas::vector<double> my_result =
    viennacl::linalg::solve(my_matrix, my_rhs, viennacl::linalg::cg_tag());

//compute ILUT preconditioner (with default settings):
viennacl::linalg::ilut_precond< ublas::compressed_matrix<double> >
           my_ilut(my_matrix, viennacl::linalg::ilut_tag());

// call conjugate gradient solver with default settings:
// (ilut preconditioner)
my_result = viennacl::linalg::solve(my_matrix, my_rhs,
                                    viennacl::linalg::cg_tag(), my_ilut);

----- Snippet end -----
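
For readers unfamiliar with the method behind the solver call, here is a minimal sketch of an unpreconditioned conjugate gradient iteration for a symmetric positive definite system, written against plain std::vector so that it needs neither ublas nor ViennaCL. It is not ViennaCL's implementation: the names conjugate_gradient, dense_multiply and dot_product, the dense matrix type, and the default tolerance and iteration limit are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using DenseMatrix = std::vector<Vec>;  // illustrative dense storage, row by row

double dot_product(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) sum += a[i] * b[i];
    return sum;
}

Vec dense_multiply(const DenseMatrix& a, const Vec& x) {
    Vec y(a.size(), 0.0);
    for (std::size_t i = 0; i < a.size(); ++i) y[i] = dot_product(a[i], x);
    return y;
}

// Unpreconditioned CG for symmetric positive definite A, starting from x = 0.
// Stops once ||r|| <= tolerance * ||b|| or after max_iterations steps.
// The default values are illustrative and are not ViennaCL's defaults.
Vec conjugate_gradient(const DenseMatrix& a, const Vec& b,
                       double tolerance = 1e-10,
                       std::size_t max_iterations = 1000) {
    Vec x(b.size(), 0.0);
    Vec r = b;  // residual b - A*x for the initial guess x = 0
    Vec p = r;  // first search direction
    double rr = dot_product(r, r);
    const double stop = tolerance * tolerance * rr;
    for (std::size_t it = 0; it < max_iterations && rr > stop; ++it) {
        const Vec ap = dense_multiply(a, p);
        const double alpha = rr / dot_product(p, ap);  // step length
        for (std::size_t i = 0; i < x.size(); ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * ap[i];
        }
        const double rr_new = dot_product(r, r);
        const double beta = rr_new / rr;  // makes the new direction A-conjugate
        for (std::size_t i = 0; i < p.size(); ++i) p[i] = r[i] + beta * p[i];
        rr = rr_new;
    }
    return x;
}
```

Calling conjugate_gradient(a, b) with a = {{4, 1}, {1, 3}} and b = {1, 2} should return approximately (1/11, 7/11). A preconditioned variant would additionally apply the preconditioner to the residual in every iteration.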

If all occurrences of 'ublas::' in the above snippet are replaced by
'viennacl::', then the solver is run on the GPU, otherwise on the CPU.
This is very handy when switching from GPU-enabled hosts to hosts
without a suitable GPU. However, the usual program flow is likely to be
the following:
- set up the data on the CPU
- copy the data to the GPU and do some fast arithmetic there
- copy the result back to the CPU and proceed as usual
Pushing and pulling data between CPU and GPU is provided by dedicated
copy() functions, so e.g. pushing a ublas::compressed_matrix<T>
ublas_matrix to a viennacl::compressed_matrix<T> vcl_matrix and pulling
it back again is done via
copy(ublas_matrix, vcl_matrix); //CPU to GPU
copy(vcl_matrix, ublas_matrix); //GPU to CPU
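
The three-step flow above (set up on the CPU, push, compute, pull) can be sketched without any GPU at all. In the following mock, DeviceVector is a hypothetical stand-in whose "device" buffer is ordinary host memory, and scale() merely plays the role of a GPU kernel; none of these names are ViennaCL types or functions. Only the calling pattern of the two copy() overloads mirrors the lines above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a GPU-resident vector (illustration only).
// The buffer is private, so host code can reach it only through copy().
class DeviceVector {
public:
    explicit DeviceVector(std::size_t n) : buffer_(n, 0.0) {}
    std::size_t size() const { return buffer_.size(); }

    // Plays the role of a kernel that runs on the device.
    void scale(double factor) {
        for (double& v : buffer_) v *= factor;
    }

private:
    std::vector<double> buffer_;
    friend void copy(const std::vector<double>& host, DeviceVector& device);
    friend void copy(const DeviceVector& device, std::vector<double>& host);
};

// "CPU to GPU": push host data into the device buffer.
void copy(const std::vector<double>& host, DeviceVector& device) {
    assert(host.size() == device.size());
    device.buffer_ = host;
}

// "GPU to CPU": pull the device buffer back into host memory.
void copy(const DeviceVector& device, std::vector<double>& host) {
    host = device.buffer_;
}

// Set up on the host, push, compute on the "device", pull the result back.
std::vector<double> scale_on_device(const std::vector<double>& host_input,
                                    double factor) {
    DeviceVector device(host_input.size());
    copy(host_input, device);  // CPU to GPU
    device.scale(factor);      // arithmetic happens on the device side
    std::vector<double> result;
    copy(device, result);      // GPU to CPU
    return result;
}
```

Since each transfer between host and device has a cost, it usually pays to do as much work as possible between the push and the pull.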

I would really appreciate it if the ublas community could give me some
feedback on ViennaCL, either here on the ublas mailing list, directly by
email, or by any other means listed on

Best regards,

PS: I apologize if you consider this email spam - it is not intended
to be.