Current Status
My work on FAST has taken me into the realm of how to represent sequences with cursors and property maps, and how to dispatch algorithms.
Larry Meehan reports that accounts have been set up at:
(2004-11-9)
We'll be using Boost.Build version 2 (BBv2) for all building/testing. I've invested a great deal of time recently in trying to grok BBv2, and am working closely with Vladimir Prus (the primary maintainer) to ensure that its documentation is comprehensible, which means going through a massive review/edit cycle.
A project has just been started to rewrite Boost.Build in Python, hopefully with a Scons substrate. The rewrite should yield many advantages, not least freeing Boost.Build developers from the shackles of the odd language built into Boost.Jam, and much smarter target updating logic. Scons is a wonderful build system, and several projects hosted at OSL have apparently started using it. That said, it is very low-level; we want the high-level and platform-/compiler-neutral functionality of Boost.Build.
(2005-1-25)
Right now we are using Docutils and reStructuredText for documentation. We have an automated system called litre (“literate reStructuredText”) for extracting and testing C++ examples. Serious consideration is being given to the idea of moving to quickbook, not least because we expect the codebase to be more understandable and maintainable. Translating litre to quickbook will require generating some Python bindings, though, as some scripting language integration is crucial.
(2005-1-25)
Iterating between generic interface design and low-level experiments to characterize performance impact of interface design decisions.
Cursors have types that represent their positions. That is to say, in a heterogeneous sequence a cursor has a different type from each of its neighbors: there is no single type that can be used to represent cursors for all positions in the sequence. A tuple of different types is a good example of such a sequence.
Where a homogeneous representation of a cursor's position exists (e.g. a pointer or integer for a fixed-size array), the algorithm can be implemented much more efficiently at compile time, once the sequence length is known, by moving a homogeneous cursor each time the sequence is subdivided.
It should be possible to generalize the support for homogeneous sequences into something that will unroll dynamically-sized sequences as well as fixed-size ones.
This is the cursor/property map equivalent to the segmented iterators described in [Austern98].
[Austern98] | Matthew H. Austern, “Segmented Iterators and Hierarchical Algorithms,” Selected Papers from the International Seminar on Generic Programming, Lecture Notes in Computer Science, Vol. 1766, 1998, pp. 80–90. ISBN 3-540-41090-2. http://lafstern.org/matt/segmented.pdf |
We don't want to unroll the largest homogeneous sequences completely. Instead it would be better to subdivide them into unrolled chunks and iterate over those chunks at runtime. This optimization can be implemented by imposing a segmented view over the fixed-size sequence; it is essentially matrix blocking, applied in the small.
We can start by deciding the maximal amount of loop unrolling that's appropriate for various fixed-sized data structures. We can also decide loop unrolling for some regular variable-sized sequences.
row-/column-major orientations
In which we define concepts such as Ring, Field, LinearOperator, LinearAlgebra, TransposableLinearOperator, AbelianGroup, HilbertSpace, BanachSpace, VectorSpace, and R-Module.
(2005-1-27)
Traditional mathematical concepts are defined in terms of calculations on pure numbers that exhibit no rounding error, but the number types we use every day in numerical linear algebra (e.g., float and double) don't behave quite that well [High02]. In Section 7.1, subsection Equality of Jeremy Siek's preliminary documentation for his early prototype of this project, the notation
a =ε b
was used to mean “|a - b| < ε where ε is some appropriate small number for the situation (like machine epsilon).” The problem is that it's too fuzzy. In particular, according to Andrew Lumsdaine, ordinary floating-point numbers don't actually model Field when that notation is used to describe the concept.
One approach to this issue might be to expel the notion of imprecision from the concept taxonomy. Concepts like Field would require true equality, and we'd deal with the imprecision of floating-point by saying that if an algorithm requires one of its arguments to model Field and you pass a double (which isn't quite a model of Field), then naturally the algorithm doesn't produce the promised result. Instead, if you pass an approximation of a Field to the algorithm, it produces some approximation of the specified result.
That approach is unsatisfying because the error bounds of an algorithm used with real-life floating-point datatypes can be calculated, and we'd like our algorithm specifications to be able to make some promises about the magnitude of those errors. Naturally, if you have violated an algorithm's requirements by passing a float where it expects a pure Field, the algorithm can't make any promises at all about the result! Looked at from the other side, if the algorithm can make some guarantees about the result it produces for some input, then whatever the specification says, the input must clearly satisfy some real, underlying requirement.
Only by keeping floating-point types in the concept taxonomy can we sensibly make guarantees about the precision of algorithms operating on those types. We assert that float and double model a concept called FieldWithError[1], of which Field is a refinement that requires perfect precision. Similar “-WithError” counterparts exist for all the basic algebraic concepts. Just as algorithms like std::binary_search require Forward Iterators but make stronger efficiency guarantees when passed Random Access Iterators, numerical algorithms can require their arguments to model the imprecise “-WithError” concepts and make stronger precision guarantees when operating on models of precise algebraic concepts.
This approach has the added benefit of allowing algorithms to be specialized based on refinement. For example, most L/U factorization algorithms involve pivoting steps designed to reduce the magnitude of errors induced by floating-point operations. However, when the element type models a precise algebraic concept (e.g. an infinite-precision rational number type), those pivoting steps are not required. A similar effect occurs in simulations where matrices with the same sparse structure are factored repeatedly: in calculating the sparse structure of the result, a boolean “fill” type that requires no pivoting can be used.
Andrew Lumsdaine notes (2005-1-28) that
“Another simpler example of where things can be sped up in infinite precision case is in just adding up a list of numbers. To do this with high accuracy with floats you want to sort, normalize, etc. With infinite precision, you can just add them up.”
and
“We should probably also distinguish infinite precision from infinite length. I.e., integers can be added without error, but not if they overflow. So perhaps a Bounded concept as well. A float therefore models FinitePrecision and Bounded”
[1] | Pick a different name if you like. |
[High02] | Nicholas J. Higham, Accuracy and Stability of Numerical Algorithms, Second edition, SIAM, 2002, xxx+680 pp, ISBN 0-89871-521-0. http://www.ma.man.ac.uk/~higham/asna/ |
Unlike previous incarnations of MTL, we do not plan to use a handle-body implementation for matrices and vectors.
Except for views and adapters, which explicitly do not own data, copy constructors should copy (no "handles"). Rationale: this models the well-understood behavior of mathematical primitives. Stack-based and heap-based objects have consistent behavior. As an upshot of both these facts, there is less chance of confusing bugs.
Assignment operators should always copy. Views and adapters copy over their target elements when assigned. Rationale: ditto.
Efficiency issues can be handled using library implementations of move semantics. "Perfect" move semantics are possible in most modern compilers today, and with recent developments in the core working group that capability will become mandated (http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#291) and even automatic (http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#391). None of this was available when Jeremy wrote his paper.
Issues of views and reference binding (see http://www.osl.iu.edu/research/mtl/reference/html/MTL_Object_Model.html) can be dealt with by returning const views from adapter functions. For example:
template <class MatrixType> const transpose_view<MatrixType> transpose(MatrixType& m);
consider:
typedef transpose_view<matrix<> > t;
typedef transpose_view<matrix<> const> tc;
The library supplies t with const member functions and free functions accepting t const& that can mutate t's referent matrix.
The library only supplies tc with const member functions and free functions accepting tc const& that cannot mutate tc's referent matrix.
Enough support so that vectors model VectorSpace and vectors + matrices model Linear Algebra.
Support operator notation for implemented algorithms.
Add Storage and corresponding Shape aspects.
Note
Triangular can be seen as a special case of banded.
Applies to banded and triangular shapes
Applies to triangular shape
Applies to banded shape
Applies to diagonal orientation
Note
probably involves blocked view of dense matrix
New data structure modeling Linear Algebra when combined with Vector. Blocking should be exploited for fast Matrix Vector product
Note
Fast addition may be too hard to do.
New data structure modeling Linear Algebra when combined with Vector. Blocking should be exploited for fast Matrix Vector product
Note
Fast addition may be too hard to do.
Note
Don't worry about making all combinations fast
Is there special data structure work?
Incorporate parallelism in conjunction with parallel BGL