Subject: Re: [boost] [compute] review
From: Kyle Lutz (kyle.r.lutz_at_[hidden])
Date: 2014-12-21 23:59:05
On Sun, Dec 21, 2014 at 1:45 PM, Pavan Yalamanchili <pavan_at_[hidden]> wrote:
> *Design and Implementation*
> The library provides an STL-like library for OpenCL devices. Although there
> are other libraries offering similar behavior, Boost.Compute is the most
> complete and the most generalized.
> The directory structure is well organized and easy to dig through to
> understand the implementation or debug an issue.
> Others have pointed out the existence of the OpenCL C++ wrapper and the
> fact that Compute library does not use it. Here are a few reasons why I
> think the author was *right* in his design decisions.
> - The C++ wrapper begins to look similar to the original OpenCL C interface
> once you start using it in a generalized fashion.
> - cl.hpp is monolithic (single file 12k+ LOC) and is not modular enough to
> only include the parts that you need.
> - Compute and cl.hpp have two different goals. The biggest selling point of
> Compute is the set of algorithms it supports not its C++ wrapper around
> Having said that, I think the Compute library might be better off with an
> additional interface that accepts the OpenCL-C++ wrapped objects.
I completely agree. This has been the main issue raised so far during
the review. I've been working towards simplifying interoperability
between Boost.Compute and the Khronos C++ wrappers, and I should
hopefully have something ready for testing within the next week or two.
> The documentation is comprehensive and the API is documented well. There
> are a few improvements that can be done.
> For example, the reference is at the bottom of the TOC and is hard to see
> immediately. The API reference in the TOC can also be expanded out a bit
> more to show the general categories of algorithms that are supported.
Will do. And definitely point me towards any other areas of the
documentation that you think need work.
> *Potential Usefulness*
> It is fairly easy to transition from applications using vector algorithms
> in STL to Compute. The wide availability of OpenCL devices makes the
> library useful to a large user base.
> *Domain Knowledge*
> I am a developer of the ArrayFire library. I am the lead engineer of a
> company that specializes in this domain. I consider myself to be fairly
> knowledgeable in this domain.
> *Experience with the library*
> ArrayFire depends on Boost.Compute for a few algorithms in our OpenCL
> backend. We explicitly and implicitly (via ArrayFire) test Boost.Compute on
> a variety of hardware / compilers / operating systems.
> Compilers and Operating systems we use:
> - gcc 4.8, 4.9 on various Linux distributions
> - clang 3.4 on OSX
> - Visual Studio 2013 on Windows.
> We have found some bugs through our usage that have been mostly resolved by
> the author or by patches sent by us.
Thanks for testing Boost.Compute so rigorously! And thanks especially
for reporting bugs and submitting patches!
> The experience overall has been fairly positive. But there are certainly
> *some* rough edges that need resolving.
> An issue that will plague any OpenCL library is the performance portability
> across various devices. It would be nice if the author can talk about how
> he plans to eventually address this issue.
Recently I have been working on infrastructure for automatic
parameter-tuning which will allow algorithms to better adapt and
optimize themselves for the underlying hardware (currently via a
manually run tuning script which caches the optimal kernel execution
parameters on disk). This should hopefully be ready for testing soon.
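
As a rough illustration of the tune-then-cache idea described above (a
host-side sketch only, not Boost.Compute's actual tuning code): the helper
names tune_parameter, save_cache and load_cache, the "key value" file
format, and the key string used below are all hypothetical. In a real
tuner the cost callable would launch the kernel with the candidate
parameter and return the measured time.

```cpp
#include <cstddef>
#include <fstream>
#include <functional>
#include <limits>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: return the candidate with the lowest cost.
// `cost` stands in for "run the kernel with this parameter and time it".
// Returns 0 if `candidates` is empty.
inline std::size_t tune_parameter(
    const std::vector<std::size_t>& candidates,
    const std::function<double(std::size_t)>& cost)
{
    std::size_t best = 0;
    double best_cost = std::numeric_limits<double>::infinity();
    for (std::size_t c : candidates) {
        const double t = cost(c);
        if (t < best_cost) {
            best_cost = t;
            best = c;
        }
    }
    return best;
}

// Hypothetical on-disk cache: one "key value" pair per line.
// Keys must not contain whitespace.
using tuning_cache = std::map<std::string, std::size_t>;

inline bool save_cache(const std::string& path, const tuning_cache& cache)
{
    std::ofstream out(path);
    if (!out) return false;
    for (const auto& kv : cache)
        out << kv.first << ' ' << kv.second << '\n';
    return static_cast<bool>(out);
}

// Returns an empty cache if the file is missing or unreadable.
inline tuning_cache load_cache(const std::string& path)
{
    tuning_cache cache;
    std::ifstream in(path);
    std::string key;
    std::size_t value;
    while (in >> key >> value)
        cache[key] = value;
    return cache;
}
```

An algorithm would consult load_cache() first and fall back to a default
value when no entry exists, so tuning stays an optional, offline step.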
Other ideas include developing more optimized kernels for specific
hardware configurations. For example, the reduce() algorithm will
currently use a warp-synchronous reduction kernel on NVIDIA hardware
which improves performance by 5-10% over the generic version. In the
future I plan on improving other algorithms to detect the underlying
hardware and optimize themselves for it (all while keeping the
same user-facing, high-level interface). I'm also planning on
specializing some of the core algorithms to automatically take
advantage of some of the new built-in work-group reduction and scan
functions from OpenCL 2.0.
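
The "same interface, hardware-specific implementation" idea above can be
sketched on the host as follows. This is only an illustration under stated
assumptions: reduce_generic and reduce_tree are CPU stand-ins for device
kernels, and is_nvidia and reduce_for_vendor are hypothetical names, not
Boost.Compute functions. The vendor string is assumed to be whatever the
OpenCL device reports for CL_DEVICE_VENDOR.

```cpp
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Generic path: plain left-to-right sum.
inline int reduce_generic(const std::vector<int>& v)
{
    return std::accumulate(v.begin(), v.end(), 0);
}

// Stand-in for a specialized kernel: pairwise (tree) reduction, the
// access pattern a work-group reduction uses. Each pass folds the upper
// half of the active range onto the lower half.
inline int reduce_tree(std::vector<int> v)
{
    if (v.empty()) return 0;
    for (std::size_t n = v.size(); n > 1; n = (n + 1) / 2) {
        const std::size_t half = n / 2;
        for (std::size_t i = 0; i < half; ++i)
            v[i] += v[n - 1 - i];
    }
    return v[0];
}

// Hypothetical vendor check based on the reported vendor string.
inline bool is_nvidia(const std::string& vendor)
{
    return vendor.find("NVIDIA") != std::string::npos;
}

// One user-facing entry point; the implementation is chosen internally.
inline int reduce_for_vendor(const std::vector<int>& v,
                             const std::string& vendor)
{
    return is_nvidia(vendor) ? reduce_tree(v) : reduce_generic(v);
}
```

Callers see a single function and get the same result on every path; only
the strategy behind it changes with the hardware.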
> Overall, Boost.Compute will be a great addition to Boost. But before it is
> accepted the following issues need to be addressed.
> - The tests need to be a bit more comprehensive.
> - Due to the general nature of OpenCL, there needs to be a list of
> "officially supported" devices.
> - Make sure all the tests are passing on the supported devices.
Thanks! I'll definitely work on improving the test-suite. Also, there
is currently a list of supported platforms here  (though it doesn't
yet have a specific list of devices).
Thanks for the review!