From: David Abrahams (abrahams_at_[hidden])
Date: 2001-03-13 18:19:43


----- Original Message -----
From: "Dean Calver" <deano_at_[hidden]>

> I'm very interested in vectors and matrices but I think that it would be
> nice to specify the interface in ways that will allow vendors to implement
> platform specific versions, most processors today support vector operations
> up to 4 floats or 8 integers as a basic type. Most of the numerics libraries
> I have seen seem to ignore the hardware support that exists in processors
> today.

It should be possible to structure any library so that specializations can
be written to take advantage of hardware acceleration.
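
As a minimal sketch of that structure (the mathvector name is taken from the
quoted text below; its layout and the vector_ops helper are assumptions made
up for illustration, not an existing Boost interface), a generic
implementation can sit behind a class template that a vendor specializes for
the sizes the hardware supports:

```cpp
#include <cstddef>

// Hypothetical fixed-size vector. Only the name comes from the thread;
// the plain-array layout is an assumption.
template <class T, std::size_t N>
struct mathvector {
    T data[N];
};

// Hypothetical customization point: the primary template is the portable
// fallback, written as an ordinary element-wise loop.
template <class T, std::size_t N>
struct vector_ops {
    static mathvector<T, N> add(const mathvector<T, N>& a,
                                const mathvector<T, N>& b) {
        mathvector<T, N> r;
        for (std::size_t i = 0; i < N; ++i)
            r.data[i] = a.data[i] + b.data[i];
        return r;
    }
};

// Full specialization for 4 floats. A vendor could replace this body with
// platform intrinsics; here it is plain unrolled C++ so the sketch stays
// portable.
template <>
struct vector_ops<float, 4> {
    static mathvector<float, 4> add(const mathvector<float, 4>& a,
                                    const mathvector<float, 4>& b) {
        mathvector<float, 4> r = {{ a.data[0] + b.data[0],
                                    a.data[1] + b.data[1],
                                    a.data[2] + b.data[2],
                                    a.data[3] + b.data[3] }};
        return r;
    }
};

// User-facing operator: dispatches to whichever vector_ops is selected.
template <class T, std::size_t N>
mathvector<T, N> operator+(const mathvector<T, N>& a,
                           const mathvector<T, N>& b) {
    return vector_ops<T, N>::add(a, b);
}
```

User code writes a + b either way; which implementation runs is decided at
compile time by the element type and size.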

> I should note that the comments I'm making are for 'small' vector/matrix
> operations (max 8x8 matrices) and are likely complete rubbish for proper
> numerics :-)

Not necessarily. Many of the problems I will have to solve involve matrices
of "blocks" (small matrices).

> Some issues I (already) foresee if we were to try are
> 1) returning references.
> If a processor has a built in type, say a 128bit 4 float vector then we
> should NOT return references.

Probably not.

> Good: mathvector<float,4> something();
> Bad: mathvector<float,4>& something();
> but if there is no hardware type, should we use a reference to avoid
> copying a structure around?

You can't generally use the "bad" form anyway unless the mathvector<> is
already stored somewhere. If something(a, b) returns a + b you are out of
luck. Things like Blitz++ use reference-counting to avoid copies. Probably
we'd not want to reference-count anything as small as 4 floats regardless of
hardware support.
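
A small sketch of the distinction, using a hypothetical mathvector layout and
made-up function names (sum, add_into) purely for illustration: a freshly
computed result has to come back by value, while a reference can only be
returned when the object it refers to already lives somewhere, such as the
caller's own storage.

```cpp
#include <cstddef>

// Hypothetical fixed-size vector; the layout is an assumption.
template <class T, std::size_t N>
struct mathvector {
    T data[N];
};

// The sum is a new object, so it must be returned by value. Returning a
// reference to the local r would leave the caller with a dangling reference.
template <class T, std::size_t N>
mathvector<T, N> sum(const mathvector<T, N>& a, const mathvector<T, N>& b) {
    mathvector<T, N> r;
    for (std::size_t i = 0; i < N; ++i)
        r.data[i] = a.data[i] + b.data[i];
    return r;
}

// Returning a reference is fine here because dst is already stored by the
// caller and outlives the call.
template <class T, std::size_t N>
mathvector<T, N>& add_into(mathvector<T, N>& dst, const mathvector<T, N>& src) {
    for (std::size_t i = 0; i < N; ++i)
        dst.data[i] += src.data[i];
    return dst;
}
```

For something as small as 4 floats the by-value return copies only 16 bytes,
which is part of why reference-counting would probably not pay for itself at
this size.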

> 2) partials.
> Most vector units have a fixed size (e.g. 4 floats) but like to have
> partials type treated specially.
> mathvector<float,4> a;
> float b = 10.f;
> a = a * b; // likely to be slow as it has to move to/from memory across registers etc
> partial1_mathvector<float,4> c = 10.f; // there has to be many special partials
> a = a * c; // likely to be faster register to register operation

I'm thoroughly lost in the above code.

> 3) multiple units.
> Many processors have multiple units that can be used simultaneously, using
> an STL memory allocator like system would allow this when speed is really
> important (not portable probably) something like this.
> mathvector<float,4, VECTOR_UNIT0> va,vb;
> mathvector<int,8, MMI> ia,ib;
> mathvector<float,4, FPU> fa,fb;
> va = va * vb;
> ia = ia * ib;
> fa = fa * fb;

I hope I never have reason to write code like that ;-)
-D

