Subject: Re: [boost] GSoC SIMD project
From: Mathias Gaunard (mathias.gaunard_at_[hidden])
Date: 2011-03-01 08:13:55
On 01/03/2011 00:19, Antoine de Maricourt wrote:
>> Source is at <http://github.com/MetaScale/nt2/> in include/nt2/sdk/simd
>> but it's not quite user-friendly yet (no docs etc.).
>> Announcements will be made when docs and tutorials are available. Some
>> good tutorials will be made at Boostcon ;).
> That's great.
> I have been working for two or three months now on a low-level core
> library, but with possibly different goals. I was not that much
> interested in developing a comprehensive math library, but mostly
> in being able to target GPUs.
We've decided that math functions (trigonometric, exponential, etc.)
wouldn't be in the SIMD library but only in NT2.
While they use the SIMD library, they don't have any platform-specific
code anyway (at least I believe so; I haven't seen the code of all of them).
> I came up with Proto being used as a front end and driving a back end
> that generates GPU-specific source code (OpenCL, or a vendor-specific
> language such as AMD CAL IL, or even CPU instructions using Intel SSE
> intrinsics, for instance).
Doing that kind of thing is only possible when you have a substantial
amount of code, such as a whole function or a code kernel.
That's out of the scope of the SIMD library, which only aims at
providing a portable and efficient set of SIMD operations, and at
recognizing certain combinations of operations (a multiply followed by
an add, say) so that they can be mapped to optimized primitives.
You could, however, compile a bunch of code at runtime using the library
to achieve the desired effect.
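To make the "recognize certain combination patterns" part concrete, here
is a minimal sketch of the idea in plain C++. It is not the NT2 or
Boost.SIMD code: pack4, mul_expr and the scalar loops are hypothetical
stand-ins for a real register type and its intrinsics, and std::fma
stands in for a hardware fused multiply-add.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical 4-lane float pack; a real library would wrap a hardware register.
struct pack4 {
    std::array<float, 4> lane;
};

// Unevaluated product node: remembers its operands instead of computing.
struct mul_expr {
    pack4 a, b;

    // Plain multiplication, used when no larger pattern matches.
    operator pack4() const {
        pack4 r;
        for (std::size_t i = 0; i < 4; ++i) r.lane[i] = a.lane[i] * b.lane[i];
        return r;
    }
};

// a * b builds a node rather than a value.
inline mul_expr operator*(const pack4& a, const pack4& b) { return {a, b}; }

// Generic lane-wise addition.
inline pack4 operator+(const pack4& a, const pack4& b) {
    pack4 r;
    for (std::size_t i = 0; i < 4; ++i) r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

// Pattern "a * b + c": overload resolution prefers this exact match,
// so the combination is mapped to a single fused multiply-add per lane.
inline pack4 operator+(const mul_expr& m, const pack4& c) {
    pack4 r;
    for (std::size_t i = 0; i < 4; ++i)
        r.lane[i] = std::fma(m.a.lane[i], m.b.lane[i], c.lane[i]);
    return r;
}
```

With this, `pack4 r = a * b + c;` goes through the fused overload, while
`pack4 p = a * b;` falls back to the ordinary product. Matching only
needs the expression itself, not a whole kernel.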
> The generated code is then compiled again and
> run inside GPU runtime environment. However, this is probably very
> simple-minded compared to NT2, and given the time I spent on it.
> So, is NT2 able to target GPU? and to take into account GPU's programming
Yes, NT2 can do that (or will be able to, rather, since that code is not
in the public repository yet), but that works at a much higher level of
abstraction than the SIMD component that we're proposing for Boost (it
works in terms of series of operations on multidimensional tables of
arbitrary size, while the SIMD library only works with SIMD registers of,
for example, 128 or 256 bits).
The two GPU backends (OpenCL and CUDA) are not released to the public
yet because we're still considering commercial ventures with these.
The OpenCL backend generates and compiles code at runtime from a Proto
expression; the CUDA one does something smarter.
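For readers unfamiliar with the approach, generating OpenCL source from
an expression tree can be sketched roughly as follows. This is only an
illustration, not the unreleased NT2 backend: it uses hand-written node
types instead of Boost.Proto, and buffer_ref, binary_expr, add, mul and
make_kernel are all hypothetical names. The resulting string would then
be handed to the OpenCL runtime (clCreateProgramWithSource followed by
clBuildProgram), which is left out here.

```cpp
#include <set>
#include <string>

// Leaf node: a named input buffer, indexed by the work-item id "i".
struct buffer_ref {
    std::string name;
    std::string code() const { return name + "[i]"; }
    void inputs(std::set<std::string>& out) const { out.insert(name); }
};

// Inner node: a binary operation on two sub-expressions.
template <class L, class R>
struct binary_expr {
    L lhs;
    R rhs;
    char op;
    std::string code() const {
        return "(" + lhs.code() + " " + op + " " + rhs.code() + ")";
    }
    void inputs(std::set<std::string>& out) const {
        lhs.inputs(out);
        rhs.inputs(out);
    }
};

// Named builders (hypothetical); a Proto front end would use operators instead.
template <class L, class R>
binary_expr<L, R> add(const L& l, const R& r) { return {l, r, '+'}; }

template <class L, class R>
binary_expr<L, R> mul(const L& l, const R& r) { return {l, r, '*'}; }

// Walk the tree once and emit the text of an element-wise OpenCL C kernel.
template <class Expr>
std::string make_kernel(const std::string& result, const Expr& e) {
    std::set<std::string> names;  // deduplicated, sorted input buffers
    e.inputs(names);

    std::string src = "__kernel void generated(";
    for (const std::string& n : names)
        src += "__global const float* " + n + ", ";
    src += "__global float* " + result + ") {\n";
    src += "    size_t i = get_global_id(0);\n";
    src += "    " + result + "[i] = " + e.code() + ";\n";
    src += "}\n";
    return src;
}
```

Calling `make_kernel("r", add(mul(buffer_ref{"a"}, buffer_ref{"b"}),
buffer_ref{"c"}))` yields a kernel whose body is
`r[i] = ((a[i] * b[i]) + c[i]);`. The whole expression is needed before
anything can be emitted, which is why this lives above the SIMD layer.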