Subject: Re: [boost] [xint] Fourth release, requesting preliminary review again
From: Simonson, Lucanus J (lucanus.j.simonson_at_[hidden])
Date: 2010-06-11 18:00:52
Chad Nelson wrote:
> On 06/11/2010 01:53 AM, Joel Falcou wrote:
>>> At Thu, 10 Jun 2010 18:16:40 -0400,
>>> Chad Nelson wrote:
>>>> Eventually I'd like to code the lower-level operations in
>>>> assembly, but that isn't going to happen for the first public
>>>> release. And even after I do, I'd still want to maintain a pure
>>>> C++ version, for maximum portability.
>> You can have both portability and SIMD stuff in clean, easy-to-read
>> C++. It's not that hard, just long and tiring.
> Do you have references on how to do it? If it can be done, I'd very
> much like to do it for XInt.
You can wrap the intrinsics in templates and create SIMD data types in C++ that use SSE when it is available and emulate it in software when it is not. Doing so would be a Boost library unto itself. Such a library has been proposed many times and has generally not been well received, because it favors whichever instruction set it chooses to target and is therefore not general. It would also be a maintenance headache, because each new revision of the instruction set would require a matching revision of the library.

For complicated reasons that become obvious once you study several vector instruction sets side by side, it is not practical (or really feasible, for that matter) to target multiple instruction sets with such a library. You write your algorithms differently to take advantage of different instructions. A compromise is a slippery slope toward writing a cross compiler. There is a school of thought that anything that can be done in the compiler can be done better in a library. If we can't implement a good vectorizing compiler as a compiler, it isn't reasonable to think we can do so as a library. What it boils down to is that you have to write a different wrapper for each instruction set, and you end up implementing your algorithms multiple times anyway.

I can understand using some assembly to access special concurrency-related instructions in threading libraries submitted to Boost. It is pretty easy to case out the different target platforms, and the number of places you need to do it is small. Writing large numbers of non-trivial algorithms in machine-specific code is another matter altogether. This is sufficiently hard that the best people in the world collaborate to produce GMP so that they can share it, and the hardware vendors typically pitch in with this sort of thing to help themselves help the customer use new hardware features. Even when it is an open source effort, the people contributing are often paid.
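
To make the first point concrete, here is a minimal sketch of a wrapped SIMD data type: it uses SSE2 intrinsics when the compiler advertises them and falls back to a plain array otherwise. The names u32x4, load, store, add_lanes and the SKETCH_HAS_SSE2 macro are hypothetical, invented for this illustration; they are not part of XInt or of any proposed Boost library. Note that the lane-wise add wraps modulo 2^32 and propagates no carries between lanes, so this is not by itself a big-integer addition.

```cpp
#include <cstddef>
#include <cstdint>

// Detect SSE2 at compile time (GCC/Clang define __SSE2__; MSVC uses _M_X64 / _M_IX86_FP).
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#  include <emmintrin.h>
#  define SKETCH_HAS_SSE2 1
#else
#  define SKETCH_HAS_SSE2 0
#endif

// Hypothetical wrapper: four unsigned 32-bit lanes.
struct u32x4 {
#if SKETCH_HAS_SSE2
    __m128i v;            // hardware vector register
#else
    std::uint32_t v[4];   // software emulation
#endif
};

// Load four lanes from (possibly unaligned) memory.
inline u32x4 load(const std::uint32_t* p) {
    u32x4 r;
#if SKETCH_HAS_SSE2
    r.v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(p));
#else
    for (int i = 0; i < 4; ++i) r.v[i] = p[i];
#endif
    return r;
}

// Store four lanes to (possibly unaligned) memory.
inline void store(std::uint32_t* p, const u32x4& a) {
#if SKETCH_HAS_SSE2
    _mm_storeu_si128(reinterpret_cast<__m128i*>(p), a.v);
#else
    for (int i = 0; i < 4; ++i) p[i] = a.v[i];
#endif
}

// Lane-wise addition, wrapping modulo 2^32 in each lane.
inline u32x4 operator+(const u32x4& a, const u32x4& b) {
    u32x4 r;
#if SKETCH_HAS_SSE2
    r.v = _mm_add_epi32(a.v, b.v);
#else
    for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] + b.v[i];
#endif
    return r;
}

// Element-wise out[i] = a[i] + b[i]: four lanes at a time, then a scalar tail.
inline void add_lanes(const std::uint32_t* a, const std::uint32_t* b,
                      std::uint32_t* out, std::size_t n) {
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) store(out + i, load(a + i) + load(b + i));
    for (; i < n; ++i) out[i] = a[i] + b[i];
}
```

Even in this tiny case every operation needs one branch per instruction set, which is the "different wrapper for each instruction set" cost described above.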