Subject: Re: [boost] [gsoc] boost.simd news from the front.
From: Simonson, Lucanus J (lucanus.j.simonson_at_[hidden])
Date: 2011-06-10 20:27:32


David A. Greene wrote:
> For writing new code, I contend that a good compiler, with a few
> directives here and there, can accomplish the same result as this
> library and with less programmer effort.

That's great, why don't you write one and then there will be one. Let me know when you're done. ;)

>> Give me one example of non-EP code which needs and can be vectorized.
>
> Many codes are not embarrassingly vectorizable. Compilers have to
> jump through major loop restructuring and other things to expose the
> parallelism. Often users have to do it for the compiler, just as they
> would for this library. Users don't like to manually restructure
> their code, but they are ok with putting directives in the code to
> tell the compiler what to do.
>
> A simple example:
>
> void foo(float *a, float *b, int n)
> {
>     for (int i = 0; i < n; ++i)
>         a[i] = b[i];
> }
>
> This is not obviously parallel but with some simple help the user can
> get the compiler to vectorize it.

No, the compiler will substitute a couple of moves and a long jump to __memcpy assembly code. Memcpy is vectorized by hand. The compiler does not need help of any kind to vectorize this loop, but this is a special case. Also, memcpy is clearly EP. All simple filters that are EP are effectively some operation done in the middle of a memcpy; sometimes the source and destination are the same. You have the same vector loads and stores, and the only difference is that you actually do something with the data while it is in a register. The compiler should never vectorize memcpy when it has the better option of substituting the hand-coded assembly for memcpy. If you somehow tricked it into generating vector code instead of __memcpy you would get worse performance.
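
To make the comparison concrete, here is a minimal sketch of the three cases I mean. The function names (copy_loop, copy_memcpy, scale) are made up for illustration and do not come from any library. Whether a given compiler actually replaces the first loop with a memcpy call depends on the optimization level and on what it can prove about aliasing between a and b, so treat that as an assumption and not a guarantee.

```cpp
#include <cstddef>
#include <cstring>

// Element-by-element copy, same shape as the quoted foo().
// (Hypothetical name, for illustration only.)
void copy_loop(float* a, const float* b, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        a[i] = b[i];
}

// The same copy expressed as a library call. This is valid only when
// the two ranges do not overlap, which is assumed here.
void copy_memcpy(float* a, const float* b, std::size_t n) {
    std::memcpy(a, b, n * sizeof(float));
}

// A simple "filter": the same loads and stores as the copy, with one
// multiply done on the data in between.
void scale(float* a, const float* b, std::size_t n, float k) {
    for (std::size_t i = 0; i < n; ++i)
        a[i] = k * b[i];
}
```

The first two produce the same result for non-overlapping buffers. The third has the same memory access pattern with one arithmetic operation in the middle, and that is the sense in which a simple filter is a memcpy that does something.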

>> MSVC does not, nor does xlC ... nor clang ... so which compilers
>> take random crap C code and vectorize it automagically?
>
> Intel and PGI.

Even the best vectorizing compilers for C++ are still terrible. I think we should use "good" in an absolute rather than relative sense.

Regards,
Luke


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk