Subject: Re: [boost] Back to Boost.SIMD - Some performances ...
From: Stefan Seefeld (seefeld_at_[hidden])
Date: 2009-03-26 16:00:16


Joel Falcou wrote:
> Michael Fawcett wrote:
>> Joel, how does the extension detection mechanism work? Is there a
>> small runtime penalty for each function as it detects which path would
>> be optimal, or can you define at compile-time what extensions are
>> available (e.g. if you are compiling for a fixed hardware platform,
>> like a console)?
> I have an #ifdef/#elif structure that detects which extensions have been
> set up on the compiler, and I match this with a platform detection to
> know where to jump and how to overload some function or class
> definitions.
>
> I tried the runtime way and it was fugly slow. So I'm back to a
> compile-time detection as performance was critical.
>
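For readers following along, compile-time detection of this kind might look roughly like the sketch below. It is not Boost.SIMD code: the tag types, the SKETCH_HAS_SSE2 macro and add_arrays are hypothetical names made up for illustration. Only the predefined compiler macros (__SSE2__, _M_X64, _M_IX86_FP) and the SSE intrinsics are real.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical tag types naming a code path; not Boost.SIMD identifiers.
struct scalar_tag {};
struct sse2_tag {};

// Compile-time detection from macros the compiler predefines
// (GCC/Clang: __SSE2__; MSVC: _M_X64 or _M_IX86_FP >= 2).
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
  #include <emmintrin.h>
  #define SKETCH_HAS_SSE2 1
  typedef sse2_tag native_tag;
#else
  #define SKETCH_HAS_SSE2 0
  typedef scalar_tag native_tag;
#endif

// Portable fallback, always available.
inline void add_arrays(const float* a, const float* b, float* out,
                       std::size_t n, scalar_tag)
{
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

#if SKETCH_HAS_SSE2
// SSE path: four floats per step, unaligned loads and stores,
// with the scalar overload handling the remaining tail.
inline void add_arrays(const float* a, const float* b, float* out,
                       std::size_t n, sse2_tag)
{
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)
        _mm_storeu_ps(out + i,
                      _mm_add_ps(_mm_loadu_ps(a + i), _mm_loadu_ps(b + i)));
    add_arrays(a + i, b + i, out + i, n - i, scalar_tag());
}
#endif

// Entry point: the overload is selected by the compiler, so no
// runtime test of CPU features happens on the call path.
inline void add_arrays(const float* a, const float* b, float* out,
                       std::size_t n)
{
    assert(a && b && out);
    add_arrays(a, b, out, n, native_tag());
}
```

Because native_tag is fixed when the translation unit is compiled, a binary built with SSE2 enabled will not fall back to the scalar path on a machine without it. That is the trade-off against runtime detection.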
Actually, I would expect this to be a mix of runtime and compile-time
decisions. While there are certainly things that can be decided at
compile-time (architecture, available extensions, data types), there are
also parameters that are only available at runtime, such as alignment,
problem size, etc.

In Sourcery VSIPL++ (http://www.codesourcery.com/vsiplplusplus/) we use
a dispatch mechanism that allows programmers to chain extension
'evaluators' in a type-list. The compiler walks this type-list once to
eliminate unavailable matches, and the resulting list is walked again
at runtime to find a match based on the above runtime parameters. This
is also where we parametrize the sizes for which we want to dispatch to a
given backend (for example, only if the performance gain outweighs the
data I/O penalty, etc.).
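To make the two-stage walk concrete, here is a minimal sketch of the idea in C++11. It is not the Sourcery VSIPL++ API: ct_valid, rt_valid, exec, dispatcher, backend_id, the three backends, the min_size threshold and scale2 are all hypothetical names and values chosen for illustration. The "aligned" kernel is a plain loop standing in for a real vectorized one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class backend_id { none, offload, aligned, generic };

// Backend that is unavailable in this build; only ct_valid is ever looked at.
struct offload_backend {
    static constexpr bool ct_valid = false;
};

// Backend with runtime preconditions: 16-byte alignment and a minimum size.
struct aligned_backend {
    static constexpr bool ct_valid = true;
    static constexpr std::size_t min_size = 8;  // assumed threshold
    static bool rt_valid(const float* in, float* out, std::size_t n) {
        return n >= min_size
            && reinterpret_cast<std::uintptr_t>(in) % 16 == 0
            && reinterpret_cast<std::uintptr_t>(out) % 16 == 0;
    }
    static backend_id exec(const float* in, float* out, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) out[i] = 2.0f * in[i];
        return backend_id::aligned;
    }
};

// Catch-all backend that accepts any input.
struct generic_backend {
    static constexpr bool ct_valid = true;
    static bool rt_valid(const float*, float*, std::size_t) { return true; }
    static backend_id exec(const float* in, float* out, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) out[i] = 2.0f * in[i];
        return backend_id::generic;
    }
};

// Compile-time stage: a backend with ct_valid == false is skipped entirely,
// so its rt_valid/exec are never referenced.
template <bool CtValid, typename E, typename Next>
struct try_backend {
    static backend_id run(const float* in, float* out, std::size_t n) {
        // Runtime stage: take the first backend whose preconditions hold.
        return E::rt_valid(in, out, n) ? E::exec(in, out, n)
                                       : Next::run(in, out, n);
    }
};
template <typename E, typename Next>
struct try_backend<false, E, Next> {
    static backend_id run(const float* in, float* out, std::size_t n) {
        return Next::run(in, out, n);
    }
};

template <typename... Es> struct dispatcher;
template <> struct dispatcher<> {
    static backend_id run(const float*, float*, std::size_t) {
        return backend_id::none;  // end of list, nothing matched
    }
};
template <typename E, typename... Rest>
struct dispatcher<E, Rest...>
    : try_backend<E::ct_valid, E, dispatcher<Rest...>> {};

// Doubles each element and reports which backend handled the call.
inline backend_id scale2(const float* in, float* out, std::size_t n) {
    backend_id used =
        dispatcher<offload_backend, aligned_backend, generic_backend>::run(in, out, n);
    assert(used != backend_id::none);  // generic_backend always matches
    return used;
}
```

The order of the list encodes preference. The per-call cost is a handful of predicate checks, which is negligible for a matrix multiply or FFT but not for a single element-wise operation.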

Obviously, all this wouldn't make sense at a very fine-grained level.
But for typical BLAS-level or signal-processing operations (matrix
multiply, FFT, etc.) this works like a charm.

(We target all sorts of hardware, from clusters through Cell processors
down to GPUs.)

Regards,
       Stefan

-- 
       ...ich hab' noch einen Koffer in Berlin...
