Subject: Re: [boost] [compute] Review
From: Kyle Lutz (kyle.r.lutz_at_[hidden])
Date: 2014-12-31 00:57:43
On Tue, Dec 30, 2014 at 8:14 PM, Yiannis Papadopoulos
<ipapadop_at_[hidden]> wrote:
> Hi,
>
> This is my review of Boost.Compute:
>
>
> 1. What is your evaluation of the design?
>
> It seems logical to me. It is effectively a wrapper around OpenCL that
> provides implementations of higher-level algorithms, and allows
> interoperability with OpenCL and OpenGL.
>
> The Boost.Compute name is a bit misleading, as Boost.Compute supports only
> OpenCL-enabled devices.
>
>
>
> 2. What is your evaluation of the implementation?
>
> There is some code duplication (e.g. type traits) and various other bits and
> pieces that can be moved to existing Boost components. I think there should
> be some effort spent towards that.
Could you let me know which type-traits you think are duplicated or
should be moved elsewhere?
> It seems that performance is on par with Thrust. However, there are other
> libraries out there (e.g. Bolt) and multiple devices, so a more extensive
> experimental evaluation is needed to say conclusively that it is a good
> implementation.
There are a large number of performance benchmarks under the "perf"
directory [1] which can be used to measure and evaluate the performance
of the library. But you're right that the performance page in the
documentation currently only shows comparisons with the STL and
Thrust; I'll work on adding others.
> 3. What is your evaluation of the documentation?
>
> Overall, it is pretty good. Given the complexity of the accelerator
> programming model, a few more elaborate examples in the tutorial would be
> welcome.
Fully agree, I will continue to work on improving the documentation.
> 4. What is your evaluation of the potential usefulness of the library?
>
> This is difficult to answer. A lot of work has been put in this library and
> it seems the way to go. The interfaces are clean, the code looks solid and
> the developer willing.
>
> However, there is limited vendor support, there are not enough benchmarks,
> and there are other alternatives that have both. Given that
> Boost.Compute is targeted to users that know a thing or two about
> performance, I don't know how they can be convinced to consider using
> Boost.Compute against Bolt or Thrust.
>
>
> 5. Did you try to use the library? With what compiler? Did you have any
> problems?
>
> I did, using an AMD 7850 on Linux with gcc 4.8. The few examples I tried
> compiled and ran fine.
>
>
> 6. How much effort did you put into your evaluation? A glance? A quick
> reading? In-depth study?
>
> I went over the documentation, I glanced over the code and ran a few
> examples.
>
>
> 7. Are you knowledgeable about the problem domain?
>
> I'm in the HPC field. I have extensive experience with MPI, OpenMP,
> pthreads, and less with TBB, CUDA and OpenCL.
>
>
> 8. Do you think the library should be accepted as a Boost library?
>
> This will be a maybe. It is a well-written library with a few minor issues
> that can be resolved.
>
> However, why would someone use Boost.Compute against what is out there?
> Average users can resort to Bolt or Thrust. Power users will probably always
> try to hand-tune their OpenCL or CUDA algorithm. How can we test it and
> prove its performance?
Yes, Thrust and Bolt are alternatives. The problem is that each is
incompatible with the other. Thrust works on NVIDIA GPUs while Bolt
only works on AMD GPUs. Choosing one will preclude your code from
working on devices from the other.
On the other hand, code written with Boost.Compute will work on any
device with an OpenCL implementation. This includes NVIDIA GPUs, AMD
GPUs/CPUs, Intel GPUs/CPUs as well as other more exotic architectures
(Xeon Phi, FPGAs, Parallella Epiphany, etc.). Furthermore, unlike
CUDA/Thrust, Boost.Compute requires no special compiler or
compiler extensions in order to execute code on GPUs; it is a pure
library-level solution which is compatible with any standard C++
compiler.
Also, Boost.Compute does allow users to access the low-level APIs
and execute their own hand-rolled kernels (and even interleave their
custom operations with the high-level algorithms available in
Boost.Compute). I think using Boost.Compute in this way allows for
both rapid development and the ability to fully optimize kernels for
specific operations where necessary.
Thanks for the review. Let me know if I can explain anything more clearly.
-kyle
[1] https://github.com/kylelutz/compute/tree/master/perf
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk