Subject: Re: [boost] Going forward with Boost.SIMD
From: Niall Douglas (ndouglas_at_[hidden])
Date: 2013-04-23 13:19:07
> A GPU is an accelerator for large regular computations, and it requires
> memory to be sent to it and received back. It's also programmed with a very
> constrained programming model that cannot express all kinds of operations
> efficiently.
>
> A CPU, on the other hand, is a very flexible processor and all memory is
> already there. You can make it do a lot of complex computations (irregular,
> sparse or iterative), do dynamic scheduling and work stealing, and have
> fine-grained control over all the components and how they work together.
:)
You're actually wrong on that, and it's one of the first big surprises
anyone who sits on ISO committees experiences: the change in scope of
definitions. When you're coming at things from the level of international
engineering standards, a computer's CPU is not defined as anything
approximating what any of us use on a regular basis. It includes large NUMA
clusters and Cray supercomputers, none of which do SIMD anything like how a
PC does. It *also* includes tiny embedded 8-bit CPUs, the kind you find in
watches, inlined in wiring, that sort of thing. Some of those tiny CPUs,
believe it or not, do SIMD and have done SIMD for donkey's years, but in a
very primitive way: some of them, for example, work on SIMD integers that
are 3 x 8 bit = 24 bits or even 3 x 9 bit = 27 bits wide, not 32 bits, that
sort of thing. Yet international engineering standards must *always* target
the conservative majority, and PCs, or even CPUs designed more recently than
the 1990s, are always in a minority in that frame of reference.
Don't get me wrong: you could standardize desktop-class SIMD on its own. But
generally you need to hear noise complaining about the costs of the lack of
standardization, and I'm not aware of much regarding SIMD on CPUs (it's
different on GPUs, where hedge funds and oil/gas frackers regularly battle
lack of interop).
> However, SIMD has been here for 25 years and is still in the roadmap of
> future processors. Across all this time it has mostly stayed the same.
>
> On the other hand GPU computing is relatively new and is evolving a lot.
> It's also quite trendy and buzzword-y, and is in reality not as fast and
> versatile as marketing makes it out to be.
> A lot of people seem to be intent on standardizing GPU technology rather
> than SIMD technology; that's quite a shame.
Thing is, had Intel decided Larrabee was worth pushing to the mass market -
and it was a close thing - PC-based SIMD would look completely different now
and we wouldn't be using SSE/NEON/AVX, which is really an explicit prefetch
opcode set for directly programming the ALUs and bypassing the out-of-order
logic, not true SIMD (these are Intel's own words, not mine).
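For concreteness, here's roughly what that "explicit" style looks like when
you program it yourself through intrinsics (just a minimal sketch, assuming
an x86 target with SSE; add_floats and the loop shape are only illustrative,
not anything out of Boost.SIMD):

    // Minimal illustrative sketch: explicit SSE, x86 only.
    #include <xmmintrin.h>  // SSE intrinsics
    #include <cstddef>

    // Add two float arrays four lanes at a time. The programmer, not the
    // compiler, picks the vector width and issues the loads/stores explicitly.
    void add_floats(const float* a, const float* b, float* out, std::size_t n)
    {
        std::size_t i = 0;
        for (; i + 4 <= n; i += 4) {
            __m128 va = _mm_loadu_ps(a + i);   // load 4 unaligned floats
            __m128 vb = _mm_loadu_ps(b + i);
            _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
        }
        for (; i < n; ++i)                     // scalar tail
            out[i] = a[i] + b[i];
    }

Nothing in that is portable: the same loop has to be rewritten for NEON, AVX
and whatever comes next, which is rather the point.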
As it is, convergence will simply take longer. Some of those new ARM dense
cluster servers actually look awfully like Larrabee: 1000+ NUMA ARM
Cortex-A9s in a single rack, and their density appears to be growing
exponentially for now. Given all this change going on, I'd still wait and
see.
Niall