Subject: Re: [boost] [compute] kernels as strings impairs readability and maintainability
From: Kyle Lutz (kyle.r.lutz_at_[hidden])
Date: 2014-12-23 14:21:15
On Tue, Dec 23, 2014 at 9:02 AM, Mathias Gaunard wrote:
> While reading through the code of Boost.Compute to see what it does and how
> it does it, I often found that the approach used by the library of putting
> all OpenCL kernels inside of strings was an annoying limitation and made it
> quite difficult to reason with them, much less debug or maintain them.
> This has a negative effect on the effort needed to contribute to the library.
While yes, it does make developing Boost.Compute itself a bit more
complex, it also gives us much greater flexibility.
For instance, we can dynamically build programs at run-time by
combining algorithmic skeletons (such as reduce or scan) with custom
user-defined reduction functions and produce optimized kernels for the
actual platform that executes the code (which in fact can be
dramatically different hardware than where Boost.Compute itself was
compiled). It also allows us to automatically tune algorithm
parameters for the actual hardware present at run-time (and also
allows us to execute current algorithms as efficiently as possible
on future hardware platforms by re-tuning and scaling up parameters,
all without any recompilation). It also allows us to generate fully
specialized kernels at run-time based on
dynamic-input/user-configuration (imagine user-created filter
pipelines in Photoshop or custom database queries in PGSQL).
I think these benefits are well worth the added complexity, and the
approach fits naturally with OpenCL's JIT-like programming model.
> Has separate compilation been considered?
> Put the OpenCL code into .cl files, and let the build system do whatever is
> needed to transform them into a form that can be executed.
Compiling programs to binaries and then later loading them from disk
is supported by Boost.Compute (and is in fact used to implement the
offline kernel caching infrastructure). However, for the reasons I
mentioned before, this mode is not used exclusively in Boost.Compute
and the algorithms are mainly implemented in terms of the run-time
program creation and compilation model.
Another concern is that Boost.Compute is a header-only library and
doesn't control the build system or how the library will be loaded.
This limits our ability to pre-compile certain programs and "install"
them for later use by the library.
That said, I am very interested in exploring methods for integrating
OpenCL source files built by the build tool-chain and making loading
and executing them seamless with the rest of Boost.Compute. One approach I
have for this is an "extern_function<>" class which works like
"boost::compute::function<>", but instead of being specified with a
string at run-time, its object code is loaded from a pre-compiled
OpenCL binary on disk. I've also been exploring a clang-plugin-based
approach to simplify embedding OpenCL code in C++ and using it
together with the Boost.Compute algorithms.
There is certainly room for improvement, and I'd be very happy to
collaborate with anyone interested in this sort of work.