Subject: Re: [boost] [Boost-users] [compute] Review period starts today December 15, 2014, ends on December 24, 2014
From: Kyle Lutz (kyle.r.lutz_at_[hidden])
Date: 2014-12-17 00:11:55
On Mon, Dec 15, 2014 at 11:25 PM, Denis Demidov wrote:
> I am the one who submitted the review to the boost incubator, but I think it
> should be updated.
> Also I think it's easier to read the review as a whole piece here than on
> the incubator site.
Thanks for the review (and for being the first to review on the Boost
Incubator)! I've addressed your comments in-line below. Let me know if I
missed anything or can explain anything better.
> ### Design ###
> Boost.Compute provides a thin C++ wrapper around the OpenCL host API at its
> core and builds a set of STL-like algorithms on top of that core. The user
> API strongly resembles the STL and hence should be familiar to any C++
> programmer.
> The only minor problem I have with the design is the decision to provide
> another C++ wrapper for the OpenCL host API instead of using the standard C++
> bindings header provided by the Khronos group (the body behind the
> standard). This decision makes interaction with existing OpenCL libraries
> (that use the Khronos C++ bindings) somewhat complicated at times. I don't
> think it's possible to change the design at this point though, so I am
> prepared to live with it.
Yeah, way back when I began work on Boost.Compute (years ago), I
encountered some issues with the C++ OpenCL API which led me to use
the C API directly. Also, implementing the wrappers in Boost.Compute gave me a
bit more control and allowed for usage of Boost-specific tools in the
implementation (e.g. using "BOOST_THROW_EXCEPTION()" for errors
rather than plain "throw").
Furthermore, and more stylistically, I've implemented the
Boost.Compute OpenCL wrapper types with a more STL/Boost-like API
(e.g. "command_queue::enqueue_copy_buffer()" instead of
"CommandQueue::enqueueCopyBuffer()"). I think this gives the library a
more consistent look and feel (as it is heavily inspired by the STL).
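
To illustrate the two points above (C API status codes turned into exceptions, and STL-style snake_case naming), here is a minimal sketch. It is not Boost.Compute's actual implementation: `c_copy_buffer`, `api_error`, and everything in the `sketch` namespace are hypothetical stand-ins, the `enqueue_copy_buffer` signature is invented for the example, and a plain `throw` of a standard exception replaces "BOOST_THROW_EXCEPTION()" so that the snippet needs only the standard library.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a C API call that reports failure via a status
// code (0 means success, -1 is a made-up "destination too small" error).
inline int c_copy_buffer(const int* src, std::size_t src_count,
                         int* dst, std::size_t dst_capacity) {
    if (src_count > dst_capacity) return -1;
    std::copy(src, src + src_count, dst);
    return 0;
}

namespace sketch {

// Exception that keeps the raw status code so callers can still inspect it.
class api_error : public std::runtime_error {
public:
    explicit api_error(int code)
        : std::runtime_error("C API call failed with status " +
                             std::to_string(code)),
          code_(code) {}
    int code() const noexcept { return code_; }

private:
    int code_;
};

// STL/Boost-style wrapper: snake_case names, failures become exceptions.
class command_queue {
public:
    void enqueue_copy_buffer(const std::vector<int>& src,
                             std::vector<int>& dst) const {
        const int status =
            c_copy_buffer(src.data(), src.size(), dst.data(), dst.size());
        if (status != 0) throw api_error(status);
    }
};

} // namespace sketch
```

With this shape, a caller writes `queue.enqueue_copy_buffer(src, dst);` and handles `sketch::api_error` in one place instead of checking a status code after every call.
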
And, I've also been working on adding first-class support for using
types from the Khronos C++ wrapper library (like cl::Buffer) directly
with Boost.Compute types (like boost::compute::buffer). I think this
should ease any pain when working with both libraries. Hopefully I'll
get this finished soon.
> ### Implementation ###
> Having proposed several patches to the library, I can say that I am familiar
> with its implementation. The library is well structured, the code is well
> designed, well formatted and is easy to read and understand. The library
> provides a large number of examples and an extensive set of unit tests.
> When the first public announcement of the library was made here on the Boost
> mailing list, there were several performance problems. In particular, the
> compute kernels were not cached at the first invocation, and some algorithms
> only provided a serial implementation (some still do). I know that a lot of
> effort has been put into the implementation since then, and the situation is
> much improved. The performance page of the documentation shows that
> Boost.Compute is able to outperform Nvidia's Thrust for some algorithms, but
> there is still some work to be done.
Very true, there is still work to be done on this front. Now that the
API is mostly settled, most of the development will be towards
improving performance.
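
As a rough illustration of the kernel caching mentioned in the review above, here is a minimal sketch of a cache keyed on kernel source text, so that each distinct source is compiled only once. All names (`compiled_program`, `source_keyed_cache`, the injected compile function) are hypothetical and chosen for the example; this is not how Boost.Compute implements its cache, and a real one would presumably also key on the device and build options.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

namespace sketch {

// Hypothetical stand-in for a compiled OpenCL program.
struct compiled_program {
    std::string source;
};

// Caches compiled programs by their source text. The (expensive) compile
// step is injected as a function so the sketch stays self-contained.
class source_keyed_cache {
public:
    using compile_fn = std::function<compiled_program(const std::string&)>;

    explicit source_keyed_cache(compile_fn compile)
        : compile_(std::move(compile)) {}

    std::shared_ptr<compiled_program> get(const std::string& source) {
        auto it = cache_.find(source);
        if (it != cache_.end()) {
            return it->second; // cache hit: no recompilation
        }
        // Cache miss: compile once and remember the result.
        auto program = std::make_shared<compiled_program>(compile_(source));
        cache_.emplace(source, program);
        return program;
    }

    std::size_t size() const { return cache_.size(); }

private:
    compile_fn compile_;
    std::unordered_map<std::string, std::shared_ptr<compiled_program>> cache_;
};

} // namespace sketch
```

Requesting the same source twice returns the same shared object and invokes the compile function only once, which is the behavior that removes the repeated compilation cost on later invocations.
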
> ### Documentation ###
> The documentation does a good job at providing both an overview of the
> library and an extensive API reference. Boost.Compute uses BoostBook as the
> documentation generator, and has a look and feel compatible with the rest
> of the Boost libraries.
> ### Potential usefulness of the library ###
> I'd say that a library that allows one to easily harvest the performance
> offered by modern graphics processors and accelerators is extremely useful.
> For me, as an end user, the convenience could even outweigh the loss of a
> fraction of performance.
> Since the library interface is so close to the STL, it is extremely easy to
> try and use. I have successfully compiled the library with recent versions
> of [...] and Clang. The unit tests run fine on NVIDIA, AMD, and Intel OpenCL
> platforms.
> Among the current alternatives that provide an STL-like set of containers
> and algorithms, Boost.Compute is the most portable, being built on top of
> standard OpenCL (see stackoverflow.com for my take on the differences
> between Boost.Compute and the alternatives).
> One thing that could potentially make Boost.Compute obsolete is the
> acceptance of N3960 into the standard. Kyle, what do you think about this?
I've been following the Parallelism TS closely and I'm very happy to
see this being worked on in the standard.
But I don't think this API in the standard library would make
Boost.Compute obsolete. In fact, I can see Boost.Compute being one
possible back-end (or "Executor") for the parallel algorithm API. I
also think Boost.Compute is a little more flexible when it comes to
programming accelerators. For instance, it allows users to directly
execute custom kernels/functions rather than being restricted to just
the algorithms provided by the parallel API. Furthermore,
Boost.Compute allows access to other GPU-specific resources such as
the image/texture-caches and support for direct OpenGL/D3D
interoperation. Being a separate, non-standardized library also allows
it to both evolve more rapidly and support features not currently made
available in the standard.
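
To make the "back-end" idea above a little more concrete, here is a minimal sketch of one algorithm entry point dispatching on a policy tag to different back-ends. Everything here is hypothetical: `host_policy`, `accelerator_policy`, and `sketch::transform` are invented for the example and are neither the Parallelism TS policy types nor Boost.Compute's API, and the "accelerator" path only simulates the copy/launch/copy-back steps on the host.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

namespace sketch {

// Hypothetical policy tags that select a back-end at compile time.
struct host_policy {};
struct accelerator_policy {
    int* launch_count; // counts simulated kernel launches (illustration only)
};

// Host back-end: forwards directly to the standard algorithm.
template <class InputIt, class OutputIt, class F>
OutputIt transform(host_policy, InputIt first, InputIt last,
                   OutputIt out, F f) {
    return std::transform(first, last, out, f);
}

// "Accelerator" back-end: same interface, different execution strategy.
template <class InputIt, class OutputIt, class F>
OutputIt transform(accelerator_policy policy, InputIt first, InputIt last,
                   OutputIt out, F f) {
    using value_type = typename std::iterator_traits<InputIt>::value_type;
    // Stands in for a host-to-device copy.
    std::vector<value_type> staged(first, last);
    // Stands in for a kernel launch over the staged data.
    if (policy.launch_count) ++*policy.launch_count;
    for (auto& v : staged) v = f(v);
    // Stands in for a device-to-host copy.
    return std::copy(staged.begin(), staged.end(), out);
}

} // namespace sketch
```

The calling code is identical for both back-ends apart from the policy argument, which is the property that would let an accelerator library sit behind a standard-style parallel algorithm interface.
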
> ### Familiarity with problem domain ###
> I am the author of the VexCL library, which has similar functionality to
> Boost.Compute but provides a higher-level interface. I would say that I am
> familiar with GPGPU programming in both CUDA and OpenCL. I have provided an
> implementation of a Boost.Compute backend (algebra and operations) for the
> Boost.Odeint library, and made a couple of Boost.Compute algorithms
> available through the VexCL interface.
> ### Conclusion ###
> I think the inclusion of a GPGPU library into Boost is long overdue. In my
> opinion, Boost.Compute deserves to be accepted. The interface of the library
> is well designed, but it needs some work on the performance of the provided
> algorithms. Due to the high reputation of the Boost collection of libraries,
> a newly included library almost automatically becomes a de facto standard in
> its field. This is why I want a GPGPU library accepted into Boost to show
> state-of-the-art performance. I still believe the work on performance may be
> continued after the library is accepted into Boost.