Subject: Re: [boost] [gsoc] boost.simd news from the front.
From: Joel falcou (joel.falcou_at_[hidden])
Date: 2011-06-10 17:33:37
On 10/06/11 15:16, David A. Greene wrote:
> I don't think this presentation makes the case for this library. That
> said, I am very glad you and others are thinking about these problems.
> Almost everything the compiler needs to vectorize well that it does not
> get from most language syntax can be summed up by two concepts: aliasing
> and alignment.
No. How can a compiler vectorize a function that lives in another binary .o?
Who is going to vectorize cos and its ilk?
> I don't see how pack<> addresses the aliasing problem in any way that is
> not similar to simply grabbing local copies of global data or
> parameters. Various C++ "restrict" extensions already address the
> latter. We desperately need something much better than "restrict" in
> standard C++. Manycore is the future and parallel processing is the new
If you had read the slides, you would have seen that pack is the
messenger of the whole SIMD range system, which fits right into a
*higher level of abstraction* and is not some piggybacking on the compiler.
> pack<> does address alignment, but it's overkill. It's also
> pessimistic. One does not always need aligned data to vectorize, so the
> conditions placed on pack<> are too restrictive. Furthermore, the
> alignment information pack<> does convey will likely get lost in the
> depths of the compiler, leading to suboptimal code generation unless
> that alignment information is available elsewhere (and it often is).
Well, my benchmarks disagree with this. See my old post from a year
ago on the same subject. If getting 95% of peak performance
is pessimistic, then sorry.
> I think a far more useful design of this library would be providing
> standard ways to assert certain conditions. For example:
No. Ranges that accept SIMD operations are a perfect high-level feature.
We are writing a library, not an extension for compilers.
> What's under the operators on pack<>? Is it assembly code?
No, as naked assembly prevents proper inlining and other register-based
compiler optimisations. We use whatever intrinsic is available for the
current compiler/architecture at hand.
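
As a rough illustration of operators built on intrinsics rather than inline
assembly, here is a minimal sketch. It is not the actual Boost.SIMD source:
`float4`, `load`, `store` and the `SKETCH_USE_SSE` macro are hypothetical names,
and a plain scalar fallback is assumed when SSE is not available.

```cpp
#include <cassert>

// Assumption: detect SSE through common compiler macros; otherwise fall back to scalar code.
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#  include <xmmintrin.h>   // SSE intrinsics: _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps
#  define SKETCH_USE_SSE 1
#endif

// Hypothetical stand-in for a pack of four floats (not the real pack<> class).
struct float4 {
#ifdef SKETCH_USE_SSE
    __m128 reg;      // one 128-bit SSE register
#else
    float reg[4];    // scalar fallback
#endif
};

// Load four floats from memory (unaligned load, so no alignment requirement here).
inline float4 load(const float* p) {
    float4 r;
#ifdef SKETCH_USE_SSE
    r.reg = _mm_loadu_ps(p);
#else
    for (int i = 0; i < 4; ++i) r.reg[i] = p[i];
#endif
    return r;
}

// Store four floats back to memory.
inline void store(float* p, const float4& a) {
#ifdef SKETCH_USE_SSE
    _mm_storeu_ps(p, a.reg);
#else
    for (int i = 0; i < 4; ++i) p[i] = a.reg[i];
#endif
}

// The operator is an ordinary inline function wrapping an intrinsic,
// so the compiler can still inline it and allocate registers freely.
inline float4 operator+(const float4& a, const float4& b) {
    float4 r;
#ifdef SKETCH_USE_SSE
    r.reg = _mm_add_ps(a.reg, b.reg);
#else
    for (int i = 0; i < 4; ++i) r.reg[i] = a.reg[i] + b.reg[i];
#endif
    return r;
}

// Small self-check: adds two groups of four floats lane by lane.
inline void demo_add() {
    const float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    const float b[4] = {10.0f, 20.0f, 30.0f, 40.0f};
    float out[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    store(out, load(a) + load(b));
    assert(out[0] == 11.0f && out[3] == 44.0f);
}
```
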
> I wonder how pack<T> can know the best vector length. That is highly,
> highly code- and implementation-dependent.
No. On SSEx machines, SIMD vectors are 128 bits (16 bytes), which means
pack<T, 16/sizeof(T)> is optimal, so a simple meta-function finds it.
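
A minimal sketch of such a meta-function, assuming a fixed 128-bit register: a
register holds 16 bytes, so the lane count is 16 / sizeof(T). The name
`native_cardinal` is made up for illustration and is not claimed to be the
library's own.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical meta-function: how many elements of type T fit in one
// 128-bit (16-byte) SSE register.
template <class T>
struct native_cardinal
    : std::integral_constant<std::size_t, 16 / sizeof(T)> {};

// Computed entirely at compile time.
static_assert(native_cardinal<float>::value  == 4,  "4 floats per register");
static_assert(native_cardinal<double>::value == 2,  "2 doubles per register");
static_assert(native_cardinal<char>::value   == 16, "16 chars per register");

// Runtime view of the same values, e.g. for choosing a loop stride.
template <class T>
inline std::size_t lanes_for() {
    return native_cardinal<T>::value;
}

inline void demo_cardinal() {
    assert(lanes_for<float>() == 4);
    assert(lanes_for<double>() == 2);
}
```

On another architecture the constant 16 would presumably be replaced by that
target's register width in bytes.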
> How does simd::where define pack<> elements of the result where the
> condition is false? Often the best solution is to leave them undefined
> but your example seems to require maintaining current values.
This makes no sense. False is [0 ... 0], True is [~0 ... ~0]. Period.
SIMD is all about being branchless, so everything is computed over the whole
vector. It seems to me you didn't get that pack is NOT a data container
but a layer above SIMD registers that then gets hidden under the concept of
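
To make the mask convention concrete, here is a scalar emulation of a
branchless select over four 32-bit lanes. It is only a sketch: `lanes4`,
`less_than` and `where_sketch` are hypothetical names, not the simd::where
implementation, and on real hardware each loop would be a single vector
instruction.

```cpp
#include <cassert>
#include <cstdint>

// Four 32-bit lanes, standing in for one 128-bit register.
struct lanes4 {
    std::uint32_t v[4];
};

// Lane-wise comparison: each lane of the result is all ones (~0) where
// a < b holds and all zeros otherwise. (The ?: is part of the scalar
// emulation; a SIMD compare instruction produces this mask directly.)
inline lanes4 less_than(const lanes4& a, const lanes4& b) {
    lanes4 m;
    for (int i = 0; i < 4; ++i)
        m.v[i] = (a.v[i] < b.v[i]) ? ~std::uint32_t(0) : std::uint32_t(0);
    return m;
}

// Branchless select: both inputs are fully computed, then blended with
// bitwise operations. Lanes with a true mask take a, the others take b.
inline lanes4 where_sketch(const lanes4& mask, const lanes4& a, const lanes4& b) {
    lanes4 r;
    for (int i = 0; i < 4; ++i)
        r.v[i] = (mask.v[i] & a.v[i]) | (~mask.v[i] & b.v[i]);
    return r;
}

// Usage: a lane-wise minimum built from compare + select.
inline void demo_where() {
    const lanes4 a = {{1, 5, 3, 9}};
    const lanes4 b = {{4, 2, 8, 0}};
    const lanes4 m = where_sketch(less_than(a, b), a, b);
    assert(m.v[0] == 1 && m.v[1] == 2 && m.v[2] == 3 && m.v[3] == 0);
}
```
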
> How portable is Boost.simd? By portable I mean, how easy is it to move
> the code from one machine to another get the same level of performance?
It works with gcc and msvc, on SSE and Altivec, and we have started looking
at ARM NEON. Most of these reach the same level of performance.
> I don't mean to be too discouraging. But a library to do this kind of
> stuff seems archaic to me. It was archaic when Intel introduced MMX.
> If possible, I would like to see this evolve into a library to convey
> information to the compiler.
I'll keep my archaic stuff that gives me a x4-x8
speed-up rather than wait for a compiler-based solution nobody has been
able to give me since 1999 ...
We already had this discussion two years ago, so I am not keen to go all
over it again, as it clearly seems you are just retelling the same FUD that