Subject: Re: [boost] [Boost-users] [compute] Review period starts today December 15, 2014, ends on December 24, 2014
From: Denis Demidov (dennis.demidov_at_[hidden])
Date: 2014-12-16 02:25:05


Hi,

I submitted this review to the Boost incubator, but I think it should be
updated. I also think it's easier to read the review as a single piece here
than on the incubator site.

### Design ###

Boost.Compute provides a thin C++ wrapper around the OpenCL host API at its
core and builds a set of STL-like algorithms on top of that core. The user
interface strongly resembles the STL and hence should be familiar to any C++
programmer.
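
For readers less familiar with the idiom, here is a minimal host-only sketch
of the STL style referred to above. It uses only the standard library; the
function name `sorted_squares` is made up for illustration. As far as the
Boost.Compute documentation goes, the device-side counterparts live in the
`boost::compute` namespace, take the same iterator ranges, and accept a
command queue as an extra argument.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Host-only STL code (hypothetical helper, not part of any library).
// Per the Boost.Compute documentation, a device-side version would use
// boost::compute::vector instead of std::vector and boost::compute::sort
// instead of std::sort, with a command queue passed as an extra argument.
std::vector<float> sorted_squares(std::vector<float> v)
{
    // Square every element in place.
    std::transform(v.begin(), v.end(), v.begin(),
                   [](float x) { return x * x; });

    // Sort the squared values in ascending order.
    std::sort(v.begin(), v.end());

    // Postcondition: the result is sorted.
    assert(std::is_sorted(v.begin(), v.end()));
    return v;
}
```

Calling `sorted_squares({3.0f, -1.0f, 2.0f})` squares the three values and
returns them in ascending order.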

The only minor problem I have with the design is the decision to provide
another C++ wrapper for the OpenCL host API instead of using the standard
C++ bindings header [1] provided by the Khronos Group (the body behind the
OpenCL standard). This decision makes interaction with existing OpenCL
libraries (those that use the Khronos C++ bindings) somewhat complicated at
times. I don't believe it's possible to change the design at this point
though, so I am prepared to live with it.

### Implementation ###

Having proposed several patches to the library, I can say that I am familiar
with its implementation. The library is well structured, the code is well
designed, well formatted and is easy to read and understand. The library
provides a large number of examples and an extensive set of unit tests.

When the library was first publicly announced here on the Boost mailing
list, it had several performance problems. In particular, the compute
kernels were not cached after their first invocation, and some algorithms
provided only a serial implementation (some still do [2]). I know that a lot
of effort has been put into the implementation since then, and the situation
has much improved. The performance page [3] of the documentation shows that
Boost.Compute is able to outperform Nvidia's Thrust for some algorithms, but
there is still some work to be done.
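
To illustrate why kernel caching matters, here is a minimal sketch of the
general technique: compile each distinct kernel source once and reuse the
result on later invocations. This is not Boost.Compute's actual
implementation; `CompiledProgram` and `ProgramCache` are hypothetical names,
and the compile step is a caller-supplied stand-in for a real OpenCL build.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a compiled OpenCL program. A real one would
// hold a cl_program handle produced by the OpenCL runtime.
struct CompiledProgram {
    std::string source;
};

// Hypothetical cache: compiles each distinct source string once and
// returns the stored result on every later request.
class ProgramCache {
public:
    using Compiler = std::function<CompiledProgram(const std::string&)>;

    explicit ProgramCache(Compiler compile) : compile_(std::move(compile)) {}

    const CompiledProgram& get(const std::string& source)
    {
        auto it = cache_.find(source);
        if (it == cache_.end()) {
            // Cache miss: pay the (expensive) compilation cost once.
            ++compile_count_;
            it = cache_.emplace(source, compile_(source)).first;
        }
        // References to unordered_map elements stay valid across rehashing.
        return it->second;
    }

    // Number of times the compile step actually ran.
    std::size_t compile_count() const { return compile_count_; }

private:
    Compiler compile_;
    std::unordered_map<std::string, CompiledProgram> cache_;
    std::size_t compile_count_ = 0;
};
```

Requesting the same source twice through `get` runs the compile step only
once. A production cache would presumably also key on the OpenCL context or
device and on the build options, since the same source can compile
differently for each.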

### Documentation ###

The documentation does a good job of providing both an overview of the
library and an extensive API reference. Boost.Compute uses BoostBook as the
documentation generator, and has a look and feel compatible with the
majority of Boost libraries.

### Potential usefulness of the library ###

I'd say that a library that makes it easy to harness the performance of
modern graphics processors and accelerators is extremely useful. For me, as
an end user, the convenience could even outweigh the loss of a fraction of
performance.

Since the library interface is so close to the STL, it is extremely easy to
try and use. I have successfully compiled the library with recent versions
of GCC and Clang. The unit tests run fine on NVIDIA, AMD, and Intel OpenCL
platforms.

Among the current alternatives that provide an STL-like set of containers
and algorithms, Boost.Compute is the most portable, being built on top of
standard OpenCL (see [4] for my take on the differences between
Boost.Compute and the alternatives, posted on stackoverflow.com).

One thing that could potentially make Boost.Compute obsolete is the
inclusion of N3960 [5] into the standard. Kyle, what do you think about
this?

### Familiarity with problem domain ###

I am the author of the VexCL [6] library, which has functionality similar to
Boost.Compute but provides a higher-level interface. I would say that I am
well familiar with GPGPU programming, both CUDA and OpenCL. I have provided
an implementation of a Boost.Compute backend (algebra and operations) for
the Boost.Odeint library [7], and made a couple of Boost.Compute algorithms
available through the VexCL interface.

### Conclusion ###

I think the inclusion of a GPGPU library into Boost is long overdue. In my
opinion, Boost.Compute deserves to be accepted. The interface of the library
is well designed, but the performance of the provided algorithms needs some
work. Due to the high reputation of the Boost collection of libraries, a
newly included library almost automatically becomes a de facto standard in
its field. This is why I want a GPGPU library accepted into Boost to show
state-of-the-art performance. Still, I believe the work on performance may
be continued _after_ the library is accepted into Boost.

### References ###

1. [cl.hpp](http://www.khronos.org/registry/cl/api/1.2/cl.hpp) -- OpenCL 1.2
   C++ Bindings Header File, implementing the
   [C++ Bindings Specification](http://www.khronos.org/registry/cl/specs/opencl-cplusplus-1.2.pdf).
2. [boost/compute/algorithm/sort_by_key.hpp](http://goo.gl/iHfMdN)
3. Boost.Compute
   [performance](http://kylelutz.github.io/compute/boost_compute/performance.html).
4. Answer to [Differences between VexCL, Thrust, and
   Boost.Compute](http://goo.gl/MRT12G).
5. [TS for C++ Extensions for Parallelism](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3960.pdf).
6. [VexCL](https://github.com/ddemidov/vexcl) -- a C++ vector expression
   template library for OpenCL/CUDA.
7. [Boost.Compute backend for Boost.Odeint](http://goo.gl/xZSd10).

Best regards,
Denis Demidov,
Senior researcher at the Supercomputer Center of the Russian Academy of
Sciences.

