
Boost : 
From: Leland Brown (lelandbrown_at_[hidden])
Date: 20060609 02:50:32
John Phillips <phillips <at> delos.mps.ohiostate.edu> writes:
> The basis for chosing the units is
> instead attempting to maximize numeric stability, and so choosing units
> that keep things as close to order 1 as possible.
BTW, this post will be only of interest to those with a mathematics or
numerical analysis background. Others can safely skip it without missing
anything.
This is a very interesting issue to me! I'm not sure what you have in mind as
far as keeping things close to order 1 improving numerical stability, but let
me explain what comes to mind for me. First, there may be a possibility of
overflow or underflow with values significantly far from order 1, especially
with single precision floating point values. In my application, I use double
precision for everything, so this doesn't concern me.
The second issue relates to things like inversion of illconditioned matrices,
where a matrix may be nearly singular due to a large difference in the orders
of magnitude of its eigenvalues. Clearly this can cause severe numerical
problems, and keeping things near order 1 helps prevent this.
In some cases FWIW I think this problem can be resolved in another way, or even
seen as a symptom of a different problem, which can be addressed  as I'll
explain. And the discipline imposed by stronglytyped dimensional analysis may
force the implementer of the algorithm to correct the problem. (Of course, you
can read this as "annoy the implementer because his mathematically correct
algorithm won't compile," depending on your perspective :) )
Consider, for example, a function that inverts a matrix A using an SVD, in
order to avoid numerical problems. Let's assume A is a symmetric, positive
definite matrix composed of values having different physical dimensions  such
as a covariance matrix of a set of parameters having different dimensions. I
don't use SVD (so forgive me if I display my ignorance); but as I understand,
it would essentially find the eigenvalues and eigenvectors, and invert the
eigenvalues; but for any very small eigenvalues it would replace the very large
reciprocal value with a zero instead. If we're solving Av=b, then loosly
speaking the numerical problem is that many values of v solve the equation,
some with very large coordinate values. Geometrically, the SVD can be
described as choosing from among these values of v the one closest to the
origin.
What has always bugged me about this description is that "closest" requires a
distance metric, like d=sqrt(x*x+y*y), but this is meaningless in a physical
sense if the components x and y have different physical dimensions. Sure, if
you have numerical values in some particular units system, you can do the
computation and get a number. But what doesn't sit right with me is that if
you change your units, you can get a different "closest" point! So even
without roundoff errors and the like, the answer that comes out of the
algorithm will change depending on what physical units are chosen. To me, it
seems that a physical problem should have a particular solution regardless of
the units chosen (aside from issues of computational accuracy).
Likewise, the eigenvalues of this matrix will have some sort of hybrid
dimensions just like the distance metric above. And the eigenvectors can vary
based on choice of units, because even the notion of two vectors being
orthogonal is dependent on the relationship among the units of the parameters
defining the matrix. Because of this, as you pointed out in another post, a
stronglytyped dimensions library would not even be able to compile the SVD
(unless all the elements of the matrix are of the same dimensionality).
And how do I set the threshold of what's a "very small" eigenvalue for the
SVD? Perhaps .000001? In what units, since the matrix elements are of
different dimensions? Again, the meaning of my threshold changes depending on
the relationship among my parameter units.
I used to think all this was just the price you pay for using an SVD. But with
the way I've constructed my matrix, it can be decomposed as:
A = DCD
where D is a diagonal matrix of dimensioned quantities, and C is a
dimensionless symmetric matrix. It's easy, in fact, to find such a diagonal
matrix D, with no a priori knowledge of the magnitude of the values, by taking
the square roots of the diagonal elements of A. Then C is just the correlation
matrix corresponding to the covariance A, and its elements tend to be close to
order 1. Now the matrix to be inverted (C) is likely to avoid numerical
problems caused by large differences in magnitude, and the solution will be
independent of the units chosen  since they will just become multipliers on
the elements of D. The SVD/eigenvalue computation can be done without
violating the strong typing, and the choice of units no longer impacts the
numerical stability of the matrix inversion.
There may be a small penalty (such as calculating the square roots), but as
compared to an SVD algorithm I think it's almost negligible. I expect a
similar technique could be found for cases of nondefinite matrices or non
square matrices, and other types of matrix calculations. In parts of the
algorithm that are not "mixing" dimensions (as the distance formula above
does)  where stronglytyped computations would compile  you're always adding
apples to apples, so the choice of units won't influence numerical problems
like summing values of very different magnitudes. Most matrix multiplications
should probably be typesafe, for instance, and thus independent of units as
far as numerical stability.
These ideas would never have occurred to me when I was just manipulating
matrices as set of numbers. It wasn't until I started to think of them as
stronglytyped physical quantities that this perspective emerged. I think
that's also part of the value of using a dimensional analysis library.
I hope what I've suggested here makes some sense. There may be reasons it
doesn't apply in certain situations. But if anyone finds it helpful, it's
something you can try.
Regards,
 Leland
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk