Subject: Re: [boost] Accelerating algorithms with SIMD - Segmented iterators and alternatives
From: Mathias Gaunard (mathias.gaunard_at_[hidden])
Date: 2010-10-11 18:14:00
On 11/10/2010 17:54, DE wrote:
> hi there
> if you focus on simd-aware - and more specifically x86 simd -
> implementation _DON'T_ read further
> consider OpenCL as a general way to speed up computations (which uses
> simd as one of backends as well as gpu shader units etc.)
> i think it will be much more general and useful
OpenCL is a possible implementation, but we chose to call the
various SIMD instructions ourselves to have more control over the
toolchain and the end result.
OpenCL will however be our main backend for GPU targets which we will be
supporting in the future. Or maybe we will just target it through Clang
NT2 (also known as the crazy frenchman library), upon which this effort
is based, only supports x86 (SSE, ..., SSE4, AVX) and PowerPC (AltiVec).
ARM (NEON, VFP) is being added.
An effort has been made in its design so that instructions can
register themselves for a particular (type, cardinal) pair; all of them
are ranked according to a category to select the best candidate. It heavily
uses meta-programming, including rewritten bits of MPL to augment its
It also uses expression templates with Proto to detect certain patterns,
such as fused multiply-add, for which x86 is introducing new
Therefore it is very tunable and extensible.
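
To give a rough idea of what "registering for a (type, cardinal) pair and ranking by category" can look like, here is a minimal sketch using plain overloading on a tag hierarchy. This is not NT2 code: pack, the *_tag types, best_available, add_impl and add are all hypothetical names made up for the illustration, and the bodies are scalar loops standing in for the real intrinsics.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical category tags. Inheritance encodes the ranking:
// a more derived tag is a more specific (preferred) category.
struct scalar_tag {};
struct sse_tag : scalar_tag {};
struct avx_tag : sse_tag {};

// Assumption for this sketch: the build target supports everything up to AVX.
typedef avx_tag best_available;

// Hypothetical stand-in for a SIMD register of N values of type T.
template<class T, std::size_t N>
struct pack { T data[N]; };

// Generic fallback, "registered" for every (type, cardinal) pair.
template<class T, std::size_t N>
std::string add_impl(const pack<T, N>& a, const pack<T, N>& b,
                     pack<T, N>& out, scalar_tag)
{
    for (std::size_t i = 0; i != N; ++i) out.data[i] = a.data[i] + b.data[i];
    return "scalar";
}

// "Registered" for (float, 4) in the SSE category.
// A real version would call an SSE intrinsic instead of looping.
inline std::string add_impl(const pack<float, 4>& a, const pack<float, 4>& b,
                            pack<float, 4>& out, sse_tag)
{
    for (std::size_t i = 0; i != 4; ++i) out.data[i] = a.data[i] + b.data[i];
    return "sse";
}

// "Registered" for (float, 8) in the AVX category.
inline std::string add_impl(const pack<float, 8>& a, const pack<float, 8>& b,
                            pack<float, 8>& out, avx_tag)
{
    for (std::size_t i = 0; i != 8; ++i) out.data[i] = a.data[i] + b.data[i];
    return "avx";
}

// Entry point: overload resolution picks the candidate whose tag is the
// closest base of best_available among those matching (type, cardinal).
template<class T, std::size_t N>
std::string add(const pack<T, N>& a, const pack<T, N>& b, pack<T, N>& out)
{
    return add_impl(a, b, out, best_available());
}

// Small usage example.
inline void example()
{
    pack<float, 4> a = {{1, 2, 3, 4}}, b = {{10, 20, 30, 40}}, r;
    assert(add(a, b, r) == "sse");      // (float, 4) has an SSE candidate
    assert(r.data[2] == 33.0f);

    pack<double, 2> c = {{1.5, 2.5}}, d = {{0.5, 0.5}}, s;
    assert(add(c, d, s) == "scalar");   // nothing better registered here
    assert(s.data[1] == 3.0);
}
```

The point of the sketch is only that adding a new instruction means adding one overload, with no change to the entry point. The actual library relies on meta-programming rather than on a fixed inheritance chain.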
The bit I wanted to discuss here, however, is not the implementation,
but rather the interface that the library provides to the programmer.
We aim to provide an interface in modern C++ that integrates well with
the standard library and Boost in order to allow developers to make use
of SIMD in an easy and fairly high-level fashion, potentially using
meta-programming to write an algorithm with parametric register and
OpenCL is not an interface that satisfies those criteria.
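
As an illustration of the kind of interface I have in mind, here is a minimal sketch of an algorithm whose register width is a template parameter. It is not the proposed API: pack, load, operator+ and sum are hypothetical names for this example only, and pack is a plain array standing in for a real SIMD register.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a SIMD register of N values of type T.
template<class T, std::size_t N>
struct pack { T lane[N]; };

// Load N consecutive values starting at p.
template<class T, std::size_t N>
pack<T, N> load(const T* p)
{
    pack<T, N> r;
    for (std::size_t i = 0; i != N; ++i) r.lane[i] = p[i];
    return r;
}

// Lane-wise addition; a real backend would map this to one instruction.
template<class T, std::size_t N>
pack<T, N> operator+(const pack<T, N>& a, const pack<T, N>& b)
{
    pack<T, N> r;
    for (std::size_t i = 0; i != N; ++i) r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

// Sum of [first, last), written once for any cardinal N.
template<std::size_t N, class T>
T sum(const T* first, const T* last)
{
    pack<T, N> acc = pack<T, N>();  // value-initialized: all lanes zero

    // Main loop: consume N elements per iteration.
    for (; static_cast<std::size_t>(last - first) >= N; first += N)
        acc = acc + load<T, N>(first);

    // Horizontal reduction of the accumulator lanes.
    T result = T();
    for (std::size_t i = 0; i != N; ++i) result += acc.lane[i];

    // Scalar epilogue for the elements that do not fill a whole pack.
    for (; first != last; ++first) result += *first;
    return result;
}

// Small usage example.
inline void example()
{
    int v[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    assert(sum<4>(v, v + 10) == 55);  // two packs of 4, then 2 leftovers
    assert(sum<8>(v, v + 10) == 55);  // one pack of 8, then 2 leftovers
}
```

The caller picks the cardinal, or has it chosen by meta-programming from the target, and the algorithm body stays the same.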