Subject: [boost] [compute] review
From: Gruenke,Matt (mgruenke_at_[hidden])
Date: 2014-12-28 04:54:22
> 1. What is your evaluation of the design?
I agree with other comments made about synchronization. The design should be more explicit about what's asynchronous, and ideally utilize only one mechanism for synchronization. One approach that would have made this clearer is if high-level operations were actually objects that got posted to the command_queue.
Regarding synchronization, I'm a also bit concerned about the performance impact of synchronizing on all copies to host memory. Overuse of synchronization can easily result in performance deterioration. On this point, I think it might be worth limiting host memory usable with algorithms, to containers that perform implicit synchronization to actual use (or destruction) of results. Give users the choice between that or performing explicit copies to raw types.
Also, I agree with Thomas M that it'd be useful for operations to return events. That said, if you told me it doubled the overhead involved in dispatching operations, I'd suggest to make it optional through overloads or via a mechanism in the command_queue itself.
> 2. What is your evaluation of the implementation?
It seems a bit heavy to be header-only.
I'd agree with Andrey Semashev, regarding thread safety, except I think it shouldn't be an option at all. That said, not all operations on all objects need be thread safe. I think the Boost.ASIO documentation provides a good example of how to express different thread safety limitations. Now, I haven't examined the use of thread-local storage, but do you feel there's a strong case to be made for the thread safety it's providing (especially, given that you seem to feel it's non-essential)?
I'm a bit concerned about the use of type names ending in '_', such as float4_. Is this consistent with Boost conventions? I've only seen that used in other Boost libraries to denote class member variables.
I didn't notice any unit tests specifically intended to verify exception-safety. I'd want to know that had been thoroughly vetted, before I'd consider using Boost.Compute in production code.
Finally, based on the docs, it seems there's a bug in boost::compute::command_queue::enqueue_copy_image_to_buffer() and boost::compute::command_queue::enqueue_copy_buffer_to_image(). The region and origin parameters should be arrays, in the 2D and 3D cases. If this discrepancy is intentional, it should be noted. I haven't exhaustively reviewed the reference docs, so there might be other issues of this sort.
> 3. What is your evaluation of the documentation?
It needs more and better-documented tutorials (and I'm including the "Advanced Topics", here).
Some of the reference pages could use more details, as well, especially concerning synchronization and exception usage.
Regarding exceptions, there should be discussion of specifically what is invalidated by which kinds of errors. If any errors are non-recoverable, this should also be called out and perhaps derived from a different or additional baseclass (not a bad use case for multiple inheritance, actually).
I agree with Mathias Gaunard that it would be helpful to discuss computational complexity, device memory requirements, and the various methods used to implement some of the algorithms. You could continue to omit these details on the simple algorithms, though.
> 4. What is your evaluation of the potential usefulness of the library?
This is my biggest misgiving, by far. In the very near future, I expect developers will opt for either SYCL (https://www.khronos.org/opencl/sycl) or Bolt (https://github.com/HSA-Libraries/Bolt). SYCL provides a modern, standard C++11 wrapper around OpenCL, with better concurrency control and support for integrating kernels inline. Bolt provides many of the same higher-level abstractions found in Boost.Compute, but with forthcoming support for HSA.
To have the kind of lasting relevance and broad applicability to which all Boost libraries should aspire, I think Boost.Compute should be architected to support multiple backends. Though OpenCL support is currently ascendant, it's far from universal and is already flagging on some platforms (Nvidia, not the least). And HSA provides a foundation on which alternatives are actively being built. Most importantly, there exist multitudes of multi-core and multiprocessor systems which lack OpenCL support. It would be eminently useful to support these with such backends as thread pool, OpenMP, etc. And backends could be added to support new technologies, as they mature.
> 5. Did you try to use the library? With what compiler? Did you have any problems?
I downloaded and inspected the sources and some examples. I didn't have any questions or concerns that would be addressed by experimentation.
> 6. How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
Review of the documentation, some sources, some examples, and a refresher on SYCL & Bolt. I also read all of the reviews and replies sent thus far.
> 7. Are you knowledgeable about the problem domain?
I've been developing production code for parallel and heterogeneous computing platforms for the majority of the last 16 years. I've followed OpenCL since version 1.0 was finalized. I've also submitted bug reports and minor patches for Khronos' official C++ OpenCL interface.
> 8. Do you think the library should be accepted as a Boost library? Be sure to say this explicitly so that your other comments don't obscure your overall opinion.
Not in its current state. The primary issue is that it's too OpenCL-centric. It should support more backends, even if these are just compile-time options. This is crucial for both current and lasting relevance.
Beyond that, better concurrency control is crucial for optimum performance. If done well, it could even help further reduce usage errors.
This e-mail contains privileged and confidential information intended for the use of the addressees named above. If you are not the intended recipient of this e-mail, you are hereby notified that you must not disseminate, copy or take any action in respect of any information contained in it. If you have received this e-mail in error, please notify the sender immediately by e-mail and immediately destroy this e-mail and its attachments.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk