Subject: Re: [boost] Accelerating algorithms with SIMD - Segmented iterators and alternatives
From: Mathias Gaunard (mathias.gaunard_at_[hidden])
Date: 2010-10-13 16:43:17


[Repost: first one doesn't appear to have made it through]

Just trying to refocus this thread on what I meant it to be about.

We have a SIMD abstraction layer that we would like to eventually submit
to Boost.
For appropriate values of N, simd::pack<T, N> represents a SIMD
register, which allows you to do the same operation on N elements in
parallel with a single CPU instruction. pack<T, N> provides the same
operators as T, and can also detect a sequence of operations that the
CPU implements as a single instruction. It falls back to a scalar loop
if the architecture has no SIMD register of sizeof(T)*N bytes.
simd::pack<T, N> is also a fusion sequence and a range.
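
To make the fallback behaviour concrete, here is a minimal sketch of a
pack-like type that only implements the loop path. The name pack_sketch
and its members are hypothetical and are not the library's actual
interface; a real implementation would map the operators to SIMD
instructions where the hardware allows it.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for simd::pack<T, N>. Only the loop fallback is
// shown; no SIMD instructions are used here.
template <typename T, std::size_t N>
struct pack_sketch {
    std::array<T, N> lanes{};

    // Element-wise addition: one loop iteration per lane.
    friend pack_sketch operator+(pack_sketch const& a, pack_sketch const& b) {
        pack_sketch r;
        for (std::size_t i = 0; i < N; ++i) r.lanes[i] = a.lanes[i] + b.lanes[i];
        return r;
    }

    // Element-wise multiplication, same pattern.
    friend pack_sketch operator*(pack_sketch const& a, pack_sketch const& b) {
        pack_sketch r;
        for (std::size_t i = 0; i < N; ++i) r.lanes[i] = a.lanes[i] * b.lanes[i];
        return r;
    }

    // begin()/end() let the pack be traversed as a range of its lanes.
    T const* begin() const { return lanes.data(); }
    T const* end() const { return lanes.data() + N; }
};
```

With this, an expression such as a + b * b applies to every lane of the
operands, written exactly as it would be for scalar T.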

The library also provides a series of useful functions, like summing a
pack or reordering its elements.
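
As an illustration of what such functions do, the sketch below writes a
horizontal sum and a lane reordering over std::array, used here as a
stand-in for a pack. The names sum_lanes and reorder_lanes are
hypothetical, not the library's; real versions would presumably use
dedicated reduction and shuffle instructions where available.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: adds up all lanes of one pack-like value.
template <typename T, std::size_t N>
T sum_lanes(std::array<T, N> const& p) {
    T total = T();
    for (std::size_t i = 0; i < N; ++i) total += p[i];
    return total;
}

// Hypothetical helper: reorders lanes so that result[i] == p[order[i]].
// Every entry of `order` is assumed to be less than N.
template <typename T, std::size_t N>
std::array<T, N> reorder_lanes(std::array<T, N> const& p,
                               std::array<std::size_t, N> const& order) {
    std::array<T, N> r{};
    for (std::size_t i = 0; i < N; ++i) r[i] = p[order[i]];
    return r;
}
```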

What I would like to know is how people think we could integrate this
system with iterators and ranges, so that existing algorithms could be
adapted to process N elements at a time instead of one and thereby get
a potential speed gain.

As I said, we currently have an iterator adapter that turns a range of
properly aligned Ts, contiguous at least in chunks of N elements, into
a range of pack<T, N>s.
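
A rough sketch of the shape such an adapter could take is given below.
It is not the adapter described above: pack_iterator_sketch and
sum_by_packs are hypothetical names, std::array stands in for
pack<T, N>, the element copy stands in for an aligned SIMD load, and
alignment is simply assumed rather than checked.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Hypothetical adapter: walks a contiguous buffer of T in steps of N and
// yields each chunk as a std::array<T, N> standing in for pack<T, N>.
template <typename T, std::size_t N>
class pack_iterator_sketch {
public:
    explicit pack_iterator_sketch(T const* p) : p_(p) {}

    std::array<T, N> operator*() const {
        std::array<T, N> chunk{};
        // A real adapter would issue an aligned SIMD load here.
        std::copy_n(p_, N, chunk.begin());
        return chunk;
    }
    pack_iterator_sketch& operator++() { p_ += N; return *this; }
    bool operator!=(pack_iterator_sketch const& o) const { return p_ != o.p_; }

private:
    T const* p_;
};

// Sums `count` elements N at a time. Assumes count is a multiple of N.
template <std::size_t N, typename T>
T sum_by_packs(T const* data, std::size_t count) {
    std::array<T, N> acc{};  // one running total per lane
    pack_iterator_sketch<T, N> it(data), last(data + count);
    for (; it != last; ++it) {
        std::array<T, N> chunk = *it;
        for (std::size_t i = 0; i < N; ++i) acc[i] += chunk[i];
    }
    T total = T();  // final horizontal reduction of the lane totals
    for (std::size_t i = 0; i < N; ++i) total += acc[i];
    return total;
}
```

The algorithm advances one pack per iteration and only reduces across
lanes once at the end, which is where the potential gain over an
element-by-element loop would come from.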

Are there other utilities we could provide to help with the usage of SIMD?
I was thinking of also supporting non-aligned ranges by padding them
with some values, but that cannot be done efficiently with the standard
iterator model, which led to some discussion of an alternative iterator
model that seems to be going nowhere.
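
For reference, one possible reading of "padding" is sketched below: a
chunk that only partly overlaps the range is filled with a pad value in
the lanes that fall outside it. The function load_padded and its
parameters are hypothetical, and std::array again stands in for a pack.
This shows only the padded load itself, not how it would be exposed
through iterators, which is the hard part.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: builds one chunk of N lanes in which lanes
// [lead, lead + count) hold first[0 .. count) and every other lane
// holds `pad`. Requires lead + count <= N.
template <std::size_t N, typename T>
std::array<T, N> load_padded(T const* first, std::size_t lead,
                             std::size_t count, T pad) {
    std::array<T, N> chunk;
    for (std::size_t i = 0; i < N; ++i) {
        bool const inside = (i >= lead) && (i < lead + count);
        chunk[i] = inside ? first[i - lead] : pad;
    }
    return chunk;
}
```

A non-zero lead models a range that starts in the middle of an aligned
chunk; a count smaller than N - lead models a range that ends before
the chunk does. Padding with the identity of the operation (0 for a
sum, for instance) leaves the result of a reduction unchanged.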

Any suggestions for features or tools the SIMD library could provide to
make the life of the developer easier would be appreciated.

With regard to alignment, we also provide a memory allocator that
returns correctly aligned memory, and functions to get the next or
previous aligned address from a particular address.
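
The address computations can be sketched as follows. The names
align_down, align_up and is_aligned are hypothetical rather than the
library's, the alignment is assumed to be a power of two, and "next" is
taken to mean "at or after the given address".

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: largest multiple of `alignment` that is <= addr.
// `alignment` must be a power of two.
inline std::uintptr_t align_down(std::uintptr_t addr, std::size_t alignment) {
    return addr & ~(static_cast<std::uintptr_t>(alignment) - 1);
}

// Hypothetical helper: smallest multiple of `alignment` that is >= addr.
// Overflow near the top of the address space is not handled.
inline std::uintptr_t align_up(std::uintptr_t addr, std::size_t alignment) {
    return align_down(addr + (alignment - 1), alignment);
}

// Hypothetical helper: true if pointer p sits on an `alignment` boundary.
inline bool is_aligned(void const* p, std::size_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}
```

For a 16-byte register, for example, align_up(13, 16) gives 16 and
align_down(47, 16) gives 32.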

