
Subject: Re: [boost] Back to Boost.SIMD - Some performances ...
From: Stefan Seefeld (seefeld_at_[hidden])
Date: 2009-03-26 16:00:16

Joel Falcou wrote:
> Michael Fawcett wrote:
>> Joel, how does the extension detection mechanism work? Is there a
>> small runtime penalty for each function as it detects which path would
>> be optimal, or can you define at compile-time what extensions are
>> available (e.g. if you are compiling for a fixed hardware platform,
>> like a console).
> I have an #ifdef/#elif structure that detects which extensions have been
> set up on the compiler, and I match this with platform detection to
> know where to jump and how to overload some functions or class
> definitions.
> I tried the runtime way and it was fugly slow. So I'm back to a
> compile-time detection as performance was critical.
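The #ifdef/#elif structure described above can be sketched as follows; this is a minimal illustration using the usual compiler-defined macros (`__AVX2__`, `__SSE2__`, `__ARM_NEON`), and the `simd_tag` name and its members are hypothetical, not Boost.SIMD's actual interface:

```cpp
#include <cstddef>

// Compile-time extension detection: exactly one branch survives
// preprocessing, so there is zero runtime cost.
#if defined(__AVX2__)
  struct simd_tag { static constexpr const char* name = "avx2";   static constexpr std::size_t width = 32; };
#elif defined(__SSE2__)
  struct simd_tag { static constexpr const char* name = "sse2";   static constexpr std::size_t width = 16; };
#elif defined(__ARM_NEON)
  struct simd_tag { static constexpr const char* name = "neon";   static constexpr std::size_t width = 16; };
#else
  struct simd_tag { static constexpr const char* name = "scalar"; static constexpr std::size_t width = sizeof(double); };
#endif

// Functions and class templates can then be overloaded or specialized
// on the detected tag, with the choice fixed at compile time.
constexpr std::size_t simd_width() { return simd_tag::width; }
```
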
Actually, I would expect this to be a mix of runtime and compile-time
decisions. While there are certainly things that can be decided at
compile time (architecture, available extensions, data types), there are
also parameters that are only available at runtime, such as alignment,
problem size, etc.
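A small sketch of such runtime-only checks, assuming a hypothetical dispatch predicate (the 16-byte alignment and 64-element threshold are illustrative assumptions, not measured values):

```cpp
#include <cstddef>
#include <cstdint>

// Alignment can only be tested at runtime, once the actual pointer exists.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Take the vector path only when the data is aligned for vector loads and
// the problem is large enough to amortize the dispatch overhead.
inline bool use_vector_path(const float* data, std::size_t n,
                            std::size_t alignment = 16,
                            std::size_t min_size  = 64) {
    return n >= min_size && is_aligned(data, alignment);
}
```
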

In Sourcery VSIPL++ we use
a dispatch mechanism that allows programmers to chain extension
'evaluators' in a type-list. This type-list is walked over once
by the compiler to eliminate unavailable matches, and the resulting list
is then walked at runtime to find a match based on the above runtime
parameters. This is also where we parametrize which problem sizes we
want to dispatch to a given backend (for example, when the performance
gain outweighs the data I/O penalty).
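The evaluator chain described above can be sketched like this. It is a loose illustration, not the actual Sourcery VSIPL++ API (the names `ct_valid`, `rt_valid`, and `dispatch` are invented), and it uses C++17 `if constexpr` for brevity, which postdates this discussion:

```cpp
#include <cstddef>

// Each evaluator states at compile time whether it is available on this
// build (ct_valid) and at runtime whether it accepts this problem
// (rt_valid). The first runtime match wins.
struct simd_eval {
    static constexpr bool ct_valid = true;  // e.g. guarded by #ifdef __SSE2__
    static bool rt_valid(std::size_t n) { return n >= 64; }  // worth the I/O cost
    static int  exec(std::size_t)       { return 1; }        // tag: SIMD path
};

struct scalar_eval {
    static constexpr bool ct_valid = true;  // always-available fallback
    static bool rt_valid(std::size_t)   { return true; }
    static int  exec(std::size_t)       { return 0; }        // tag: scalar path
};

// Walk the type-list: compile-time rejects cost nothing (the branch is
// discarded), runtime checks run in order until one evaluator accepts.
template <typename First, typename... Rest>
int dispatch(std::size_t n) {
    if constexpr (First::ct_valid) {
        if (First::rt_valid(n)) return First::exec(n);
    }
    if constexpr (sizeof...(Rest) > 0) return dispatch<Rest...>(n);
    else return -1;  // no evaluator matched
}
```

A large problem falls through to the SIMD evaluator, a small one to the scalar fallback, and evaluators whose `ct_valid` is false never generate any runtime code.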

Obviously, all this wouldn't make sense at a very fine-grained level.
But for typical BLAS-level or signal-processing operations (matrix
multiply, FFT, etc.) this works like a charm.

(We target all sorts of hardware, from clusters through Cell processors
down to GPUs.)


       ...I still have a suitcase in Berlin...

Boost list run by bdawes at, gregod at, cpdaniel at, john at