Subject: Re: [boost] [gsoc] boost.simd news from the front.
From: Joel falcou (joel.falcou_at_[hidden])
Date: 2011-06-10 18:35:17
On 10/06/11 17:09, David A. Greene wrote:
> It's not a high level of abstraction. It's a very low level one. Users
> are barely willing to restructure loops to enable vectorization. Many
> will be unwilling to rewrite them completely. On the other hand, the
> data show that they are quite willing to add directives here and there.
If ranges are not higher level than a for loop, I think we can stop
discussing right here.
> On what code? It's quite easy to achieve that on something like a
> DGEMM. DGEMM is also an embarrassingly vectorizable code.
Give me one example of non-EP code which needs to be, and can be, vectorized.
> That's effectively assembly code.
> No. On SSEx machines, a vector of 32-bit floats can have 1, 2, 3 or 4
No, SSE2 __m128 contains 4 floats. Period.
> Consider AVX. This is _not_ an easy problem to solve. It is not always
> the right answer to vectorize using the fully available vector length.
AVX has 256-bit registers, each of which fits 8 floats. Again, what did I miss?
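
The lane counts quoted above are plain arithmetic: register width divided by
element width. A minimal sketch, assuming a 32-bit float; the helper name
lanes is hypothetical and is not part of Boost.SIMD:

```cpp
#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t

// Hypothetical helper (not a Boost.SIMD API): number of elements of type T
// that fit in one SIMD register of the given width in bits.
template <typename T>
constexpr std::size_t lanes(std::size_t register_bits)
{
    return register_bits / (sizeof(T) * CHAR_BIT);
}

// Assumes float is 32 bits wide, which holds on the x86 targets discussed here.
static_assert(lanes<float>(128) == 4, "SSE: 4 floats per 128-bit register");
static_assert(lanes<float>(256) == 8, "AVX: 8 floats per 256-bit register");
```

The same helper gives lanes<double>(128) == 2 on platforms where double is
64 bits wide.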
> I know what a pack<> is. Perhaps I wasn't clear. If I have an
> operation (say, negation) under where() in which the even condition
> elements are true and the odd condition elements are false, what is the
> produced result for the odd elements of the result vector?
where is ?:. It requires three arguments. I am tempted to say RTFM.
a = c ? b; is not valid code, so neither is where(c,a);
The more this goes on, the more it looks like you didn't read the slides.
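
To make the "where is ?:" point concrete, here is a scalar sketch of a
three-argument, lane-wise select. The names pack4, mask4, where_sketch and
negate are hypothetical stand-ins for illustration, not the Boost.SIMD
implementation, and 4 lanes are assumed:

```cpp
// Hypothetical 4-lane stand-ins for a SIMD pack and its condition mask.
struct pack4 { float v[4]; };
struct mask4 { bool  v[4]; };

// One lane of the select: exactly the scalar ternary operator.
constexpr float select_lane(bool c, float a, float b) { return c ? a : b; }

// Lane-wise c ? a : b. The third argument supplies every lane whose
// condition is false, which is why a two-argument form is ill-defined.
constexpr pack4 where_sketch(mask4 c, pack4 a, pack4 b)
{
    return pack4{{ select_lane(c.v[0], a.v[0], b.v[0]),
                   select_lane(c.v[1], a.v[1], b.v[1]),
                   select_lane(c.v[2], a.v[2], b.v[2]),
                   select_lane(c.v[3], a.v[3], b.v[3]) }};
}

// Lane-wise negation, used as the "true" branch in the usage note.
constexpr pack4 negate(pack4 x)
{
    return pack4{{ -x.v[0], -x.v[1], -x.v[2], -x.v[3] }};
}
```

With x = {1, 2, 3, 4} and a condition that is true on lanes 0 and 2,
where_sketch(c, negate(x), x) leaves lanes 1 and 3 equal to x, because the
caller passed x as the third argument.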
> What happens if you move the code from Nehalem to Barcelona? How about
> from an NVIDIA GPU to Nehalem?
Where did I say this stuff targeted GPUs? This is a friggin strawman.
We address in-CPU vectorization; that is the scope of the
library. Period, again. We don't claim to solve arbitrary data-parallelism
problems and we never did. You are again recycling the same non-argument
as in your last intervention on this very topic last year.
> Compilers have been doing this since the '70's. gcc is not an adequate
> compiler in this respect, but it is slowly getting there.
MSVC does not, neither does xlC, neither does clang ... so which compilers
take random crap C code and vectorize it automagically?
> It's not FUD. It's my experience.
It really is FUD and strawmen.