Subject: Re: [boost] [gsoc] boost.simd news from the front.
From: David A. Greene (greened_at_[hidden])
Date: 2011-06-11 12:54:16


Joel falcou <joel.falcou_at_[hidden]> writes:

> On 11/06/11 10:45, David A. Greene wrote:
>
>> Vectorizing compilers exist today. They've existed since the 1970s.
>
> I still think you're mixing up SIMD ISAs and vector machines ...

They are the same, though existing SIMD architectures are less powerful.

> As I see it, vector machines as in
> http://en.wikipedia.org/wiki/SIMD#Chronology have existed since 197x. BUT,
> what you purposely pretend not to understand is that we don't care about
> this type of machine and focus on this one,
> http://en.wikipedia.org/wiki/SIMD#Hardware, which has existed since 1994 or
> so and for which the use cases, idioms and techniques are completely
> different.

Not completely. They are different in the sense that the SIMD hardware
doesn't provide as many facilities as the old vector machines. But the
principles are exactly the same. Vector codegen is harder on the SIMD
machines and it's for this reason that I think boost.simd may not always
generate the best code. It's really, really not always obvious what
instructions should implement an arbitrary expression.

> So yeah, autovectorization on huge CRAY systems is done automatically,
> with results I really don't know about and don't care about, as that
> is not our target audience, but I concede it may be good or w/e.

The very same compiler vectorizes very well for x86.

> Now, strictly speaking of SIMD ISAs in the x86/PPC family, no,
> auto-vectorizing is not that good and still requires manual input,
> function libraries and so forth. So in this case, our claims hold and
> boost.simd has to be seen as a set of enabling tools for helping people
> write generic code that can be vectorized.

There are very good autovectorizers for x86. I'm not very familiar with
PPC so I can't comment on that.

> Learn that Template Meta-Programming as we use it there doesn't prevent
> compilers from doing stuff after our code generation process. They
> usually do (inlining or loop unrolling etc ...) and we let them do so
> as they wish, because they fill the gaps we cannot access with our
> library.

But the resulting vector code may run suboptimally. It will probably
run fine 80% of the time, but for that other 20% it may be very bad
indeed. This is where allowing the compiler to do vector codegen is
critical.

> Did you ever get to the last slides, where we have the Range-based
> function code in which no SIMD stuff leaks and which is yet able to
> eat up a SIMDRange?

Yes, I think it's very neat!

> I fail to see how this is *not* a correct way of designing
> code in C++: relying on a range abstraction and providing a Range with
> the proper properties. This kind of code is largely sufficient and
> generic enough to handle most classical use cases with a high level of
> performance. Dealing with microarchitecture never went that far as
> manual code goes, and I don't really get your obsession with that.

My concern with it comes from experience tuning a vectorizing compiler
to many different microarchitectures.

If the range abstraction is correct and the underlying libraries provide
hints to the compiler (the latter is a big missing piece, I admit), the
compiler ought to be able to generate the same or better vector code
from the high-level algorithm call. If it doesn't, it's because the
compiler is not up to the task or the generic library somehow hides
information from the compiler. I contend that it is often easier to add
directives to the library than it is to use boost.simd to generate
vector code, and the former will often perform better.
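
To make the "directives in the library" idea concrete, here is a
minimal sketch. It assumes a compiler that understands the OpenMP
`#pragma omp simd` directive (which post-dates this message; vendor
pragmas such as ivdep played a similar role earlier). The name
transform_n is hypothetical and is not part of boost.simd or the
standard library.

```cpp
#include <cstddef>

// Hypothetical generic algorithm. The directive tells the compiler that
// the iterations are independent, so it is free to pick the vector
// instructions itself for whatever target it is compiling for.
// A compiler that does not know the pragma simply ignores it and emits
// the scalar loop, so the result is the same either way.
template <class T, class F>
void transform_n(const T* in, T* out, std::size_t n, F f) {
    #pragma omp simd
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = f(in[i]);
    }
}
```

The caller writes no SIMD-specific code at all; whether the loop is
actually vectorized well is then up to the compiler.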

If one doesn't have a good vectorizing compiler, the point is moot
because the directives probably don't even exist. In that case
boost.simd is the way to go.

My pithy quip to colleagues is that std::vector<> ought to vectorize.
:) It likely doesn't today, at least not with the default allocator.
But that is largely a problem with the library implementation not
specifying the lack of aliasing issues to the compiler and/or the
higher-level standard algorithms not including directives to tell the
compiler to ignore possible dependencies.
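
As a rough illustration of the aliasing point, the sketch below shows
one way a library could pass that information along. It assumes the
non-standard `__restrict` qualifier, which GCC, Clang and MSVC accept
as an extension; the function names are made up for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical inner kernel. __restrict (a compiler extension, not
// standard C++) promises that out, a and b do not overlap, so the
// compiler need not assume a store to out[i] can change a[] or b[].
inline void scaled_add_raw(float* __restrict out,
                           const float* __restrict a,
                           const float* __restrict b,
                           float k, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + k * b[i];
    }
}

// Hypothetical wrapper: the result vector is freshly allocated, so the
// no-overlap promise made to the kernel actually holds.
inline std::vector<float> scaled_add(const std::vector<float>& a,
                                     const std::vector<float>& b,
                                     float k) {
    const std::size_t n = std::min(a.size(), b.size());
    std::vector<float> out(n);
    scaled_add_raw(out.data(), a.data(), b.data(), k, n);
    return out;
}
```

Without the qualifier the compiler typically has to emit a runtime
overlap check or fall back to scalar code.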

Does that mean that boost.simd is a waste of time? Not at all! There
will always be cases the compiler, for whatever reason, cannot handle. In
those cases, boost.simd is the far superior way to approach the problem
over hand-written vector code.

                                -Dave

