Subject: Re: [boost] Accelerating algorithms with SIMD - Segmented iterators and alternatives
From: Smith, Jacob N (jacob.n.smith_at_[hidden])
Date: 2010-10-11 19:12:45


> -----Original Message-----
> From: boost-bounces_at_[hidden] [mailto:boost-
> bounces_at_[hidden]] On Behalf Of Simonson, Lucanus J
> Sent: Monday, October 11, 2010 3:34 PM
> To: boost_at_[hidden]
> Subject: Re: [boost] Accelerating algorithms with SIMD - Segmented
> iterators and alternatives
>
> Mathias Gaunard wrote:
> > On 11/10/2010 17:54, DE wrote:
> >> hi there
> >> if you focus on simd-aware - and more specifically x86 simd -
> >> implementation _DON'T_ read further
> >> consider OpenCL as a general way to speed up computations (which uses
> >> simd as one of backends as well as gpu shader units etc.)
> >> i think it will be much more general and useful
> >
> > OpenCL is a possible implementation, albeit we choose to call the
> > various SIMD instructions ourselves to have more control on the
> > toolchain and the end result.
> > OpenCL will however be our main backend for GPU targets which we will
> > be supporting in the future. Or maybe we will just target it through
> > Clang
> > and LLVM.
> >
> > NT2 (also known as the crazy frenchman library), upon which this effort
> > is based, only supports x86 (SSE, ..., SSE4, AVX) and PowerPC
> > (AltiVec).
> > ARM (NEON, VFP) is being added.
> > An effort has been made in its design so that instructions could
> > register themselves for a particular (type, cardinal) pair, all of
> > which ranked according to a category to select the best candidate. It
> > heavily
> > uses meta-programming, including rewritten bits of MPL to augment its
> > compile-time performance.
> > It also uses expression templates with Proto to detect certain
> > patterns,
> > such as fused multiply-add, for which x86 is introducing new
> > instructions soon.
> > Therefore it is very tunable and extensible.
> >
> > The bit I wanted to discuss here, however, is not the implementation,
> > but rather the interface that the library provides to the programmer.
> >
> > We aim to provide an interface in modern C++ that integrates well with
> > the standard library and Boost in order to allow developers to make
> > use
> > of SIMD in an easy and fairly high-level fashion, potentially using
> > meta-programming to write an algorithm with parametric register and
> > cache sizes.
> > OpenCL is not an interface that satisfies those criteria.
>
> Have you seen the Ct thing coming from Intel? I just learned about it
> last week.
>
> http://software.intel.com/en-us/articles/intel-array-building-blocks/
>
> It looks like a container library for vector processing in C++ with a
> JIT compilation runtime environment as part of the library. As a guy
> who appreciates compilers, languages and libraries there is a lot to
> like there. This area is evolving very fast.

I assuredly missed a presentation or a discussion elsewhere, but why is a segmented-iterator/SIMD-oriented-iterator approach (or Ct) better or worse than a well-tuned valarray? Admittedly, valarray has a notably sparse interface, providing (less than) the bare bones needed for data-parallel programming, and on top of that it relies on a member-function strategy. Still, it seems that "fixing" valarray might be a more robust starting point than immediately striking out on a different strategy.

> Regards,
> Luke