From: Stefan Seefeld (stefan_at_[hidden])
Date: 2019-10-20 20:50:35
On 2019-10-20 7:00 a.m., Olzhas Zhumabek wrote:
> I found out that to be able to call any GIL functions from device (e.g.
> GPU), they have to be marked __device__ (and __host__ too, to call from CPU
> side as well) recursively. So I guess I'll abandon the project. My plan for
> now is to write a thin wrapper around image, image_view and kernel. The
> contents will be copied by cudaMemcpy and the like, and then layout
> compatible structs will be passed to device side. Thanks for helping me out
> with the experiment. I'll write back when I manage to implement some
> basic IP algorithms.
While it's true that functions need to be marked up individually to be
compiled into device and host code, the markup itself isn't really the
problem. Rather, it is a reminder that not all code is suitable for
execution on GPUs.
Writing GPU-ready code (be it for OpenCL or CUDA) requires a fair bit of
design to plan the data flow (decision-heavy code run on the host,
data-parallel code run on the GPU, all the while minimizing data
transfers between host and device memory). It's certainly more involved
than just sticking some macro in front of all the functions and letting
it expand to either __device__ or __host__.
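To make the markup point concrete, here is a minimal sketch of the
macro approach, assuming an 8-bit gray image. None of these names come
from GIL: SKETCH_HOST_DEVICE, gray8_view_pod, invert_pixel, pixel_at and
invert_on_host are all hypothetical. It shows only the easy part (one
macro, a plain struct that is safe to pass by value, and a per-pixel
function usable on both sides). The data-flow design discussed above,
i.e. what gets copied where and when, is not addressed by it at all.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical macro (not part of GIL): expands to the CUDA qualifiers
// when compiled by nvcc, and to nothing for an ordinary C++ compiler.
#if defined(__CUDACC__)
#  define SKETCH_HOST_DEVICE __host__ __device__
#else
#  define SKETCH_HOST_DEVICE
#endif

// Hypothetical plain struct standing in for an 8-bit gray image view.
// It holds only a raw pointer and sizes, so it can be passed by value.
struct gray8_view_pod
{
    std::uint8_t* data;      // first pixel of the first row
    std::size_t   width;     // pixels per row
    std::size_t   height;    // number of rows
    std::size_t   row_bytes; // distance between rows, including padding
};

// Per-pixel operation: no data-dependent branching and no allocation,
// so it is a reasonable candidate for device code.
SKETCH_HOST_DEVICE inline std::uint8_t invert_pixel(std::uint8_t v)
{
    return static_cast<std::uint8_t>(255 - v);
}

// Address of pixel (x, y); callable from both sides once marked up.
SKETCH_HOST_DEVICE inline std::uint8_t* pixel_at(gray8_view_pod v,
                                                 std::size_t x,
                                                 std::size_t y)
{
    return v.data + y * v.row_bytes + x;
}

// Host-side driver. On a GPU these two loops would be replaced by a
// kernel launch with one thread per pixel; that part is omitted here.
inline void invert_on_host(gray8_view_pod v)
{
    assert(v.data != nullptr && v.row_bytes >= v.width);
    for (std::size_t y = 0; y < v.height; ++y)
        for (std::size_t x = 0; x < v.width; ++x)
            *pixel_at(v, x, y) = invert_pixel(*pixel_at(v, x, y));
}
```

Compiled as ordinary C++, the macro vanishes and invert_on_host runs on
the CPU. Under nvcc the same invert_pixel and pixel_at could be called
from a kernel, but the buffer behind data would first have to live in
device memory, which is exactly the planning the markup does not do.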
All that being said, I still think this would be a nice project for GIL
(perhaps even for GSoC?), and might yield interesting performance
improvements for a number of frequently used algorithms.
-- ...ich hab' noch einen Koffer in Berlin...