From: Matt Calabrese (rivorus_at_[hidden])
Date: 2005-10-10 15:12:53


I was working on a physical quantities library last year but stopped when I
saw that there were already a few others working on their own. Now that it
seems the others who were working on their implementations have stopped, I
have picked up work on it again (metamath has made this work much easier). I
was going to wait until I had an uploadable implementation before I started
talking about it, but since others are expressing interest right now, I'll
try to finish up more quickly and present some concepts here. I've developed
solutions to some problems that I noticed other implementations don't seem
to address, and I am currently adding optimizations not previously talked
about.

Some things that set my implementation apart from others are that the
concepts of scalars, vectors, and points are central to the system -- in a
way that I haven't seen anyone talk about. As well, I use expression
templates to [optionally] optimize expressions at a high level in a variety
of ways. Finally, I take a completely nodular approach to conversions.

Just so it's a little more clear before I upload a version: the reason that
vectors, points, and scalars are so essential is that whenever we work with
units, the logical differences between them exist, even when working in only
one dimension, though we generally just don't acknowledge it. In code, it's
essential to express these differences.

For instance, I have three different main templates for working with units
(instantiated with a unit type, a value type, and an operations policy for
the provided value type, i.e. vector and point operations, all of which have
appropriate defaults):

quantity (which is a scalar quantity)
quantity_vector
quantity_point

A quick example of why explicitly separating these types is important is the
common case of temperature. First, consider the temperature 2 degrees
celsius. In order to convert this temperature to fahrenheit, we multiply by
9/5 and then add 32. Applying this operation, we get the result
35.6 degrees fahrenheit, as expected. On the other hand, consider a
temperature difference. Let's say at the beginning of the day, the
temperature was 8 degrees celsius. At the end of the day, the temperature
was 10 degrees celsius. The change in temperature from the beginning of the
day to the end of the day is 2 degrees celsius. Now, to convert that change
in temperature to fahrenheit, we multiply by 9/5 only. Note that unlike the
previous example, we do not add 32. The result is 3.6 degrees fahrenheit,
which is obviously very different from 35.6 degrees fahrenheit, even though
both come from a conversion of 2 degrees celsius.
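
To show roughly how this plays out in code, here is a minimal, self-contained
sketch of the idea. The celsius/fahrenheit tags, the stripped-down stand-ins
for quantity_point and quantity_vector, and the convert_to_fahrenheit
overloads are all made up for illustration; they are not the library's actual
interface:

#include <iostream>

struct celsius {};
struct fahrenheit {};

// Stripped-down stand-ins for the real quantity_point / quantity_vector.
template< class Unit > struct quantity_point { double value; };
template< class Unit > struct quantity_vector { double value; };

// Converting an absolute temperature (a point) needs the scale change
// AND the origin shift.
quantity_point< fahrenheit > convert_to_fahrenheit( quantity_point< celsius > p )
{
  quantity_point< fahrenheit > result = { p.value * 9.0 / 5.0 + 32.0 };
  return result;
}

// Converting a temperature difference (a vector) needs only the scale change.
quantity_vector< fahrenheit > convert_to_fahrenheit( quantity_vector< celsius > v )
{
  quantity_vector< fahrenheit > result = { v.value * 9.0 / 5.0 };
  return result;
}

int main()
{
  quantity_point< celsius > absolute = { 2.0 };     // "it is 2 degrees celsius"
  quantity_vector< celsius > difference = { 2.0 };  // "it warmed by 2 degrees"

  std::cout << convert_to_fahrenheit( absolute ).value << "\n";   // 35.6
  std::cout << convert_to_fahrenheit( difference ).value << "\n"; // 3.6
}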

Taking it to a more abstract level, would it ever make sense to add two
temperatures together? For instance, we subtracted 8 degrees from 10 degrees
to get the change in temperature of the day. However, what meaning would we
get out of adding the temperature at the beginning of the day to the
temperature at the end of the day? In general such an operation does not
make any sense.

I believe this is the case for the simple reason that absolute temperatures,
such as the temperature at a particular time of the day, are conceptually
points, while the difference between two temperatures is a vector (just as a
point minus a point yields a vector in geometry). The reason why the shift
of 32 degrees is only applied to absolute temperatures is that you can think
of the two scales as having two different origins in the scaled
n-dimensional space. To convert a location expressed relative to one origin
into a location expressed relative to the other, you have to take into
account the translation needed to get from one origin to the other (in this
case 32). With vectors, however, this isn't the case, since a difference
between two points in the same space is unaffected by where the origin sits.
All that is required is the scale change.

Also note that, just as with points and vectors, it generally doesn't make
sense to add together two absolute temperatures, whereas it does make sense
to subtract one from the other. As well, it even makes sense to add the
result of the subtraction to an absolute temperature, and the same goes for
all other standard point and vector operations. Finally, just as in
geometry, there are a few places where the rules about adding points can be
bent, such as with barycentric combinations. With temperatures, a
barycentric combination could be used to find the average temperature of the
day. Just as with temperatures, the geometric representation applies to all
other unit types. If users find it difficult to understand, most can get
away with just using scalar quantities all of the time, though more advanced
users would recognize the benefits of using points and vectors.
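
To make the barycentric example concrete with the numbers from above, the
day's average can be written so that no two points are ever added outright:

average = morning + 0.5 * ( evening - morning )
        = 8 + 0.5 * ( 10 - 8 ) = 9 degrees celsius

Here evening - morning is a vector, scaling it by 0.5 is still a vector, and
adding that vector back to the point morning yields a point again, so every
intermediate step is well-defined under the point/vector rules.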

Indeed, the more you apply this logic, the more it appears as though
vectors, points, and scalars provide a perfect model for working with units,
and I even believe they are entirely essential to a proper design.

What I'm currently working on is finishing up the expression templates that
optimize operations. This is done primarily through the rearrangement of
expressions, through examining subexpressions of a given full expression,
and through combining divisors via multiplication prior to evaluating a
ratio (all of which will eventually be individually toggleable, though right
now I'm concentrating on completing the functionality as a whole).

Expression reorganization can optimize an overall expression quite a bit,
which is why I decided to make it a part of the library. As a simple
example, take the following expression:

meters1 + feet1 + meters2 + feet2

Without expression rearrangement, a conversion could potentially take place
at every operation. For instance, without rearrangement, the following logic
could occur (obviously it would all be implicit; I am just using made-up
functions for clarity here):

convert_to_feet( convert_to_meters( ( convert_to_feet( meters1 ) + feet1 ) )
+ meters2 ) + feet2

However, with expression rearrangement, we can group like terms:

feet1 + feet2 + convert_to_feet( meters1 + meters2 )

Note that the latter example (using rearrangement) only requires 1
conversion as opposed to 3. At first you may ask why not arrange the terms
manually; however, keep in mind that when working with generic code and
objects passed to functions, you may not know which unit types will be used
at the time of writing the generic code (and they can change depending on
how you use it). From my tests, the expression rearrangement I perform is
resolved entirely at compile-time in VC++ 7.1 (I haven't yet tested other
compilers). This, like the other optimizations regarding expression
templates, will be toggleable once I complete them (currently I have it
working for additive expressions and partially for multiplicative
expressions).
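
To give a rough picture of what the grouping buys, here is a deliberately
simplified sketch; unlike the real implementation, which resolves the
rearrangement at compile-time, this toy version just accumulates the two
groups of terms in named members, and all of the names and the conversion
factor are invented for the example:

#include <iostream>

struct meters_tag {};
struct feet_tag {};

const double meters_to_feet_factor = 3.28084; // illustrative factor only

template< class Unit > struct quantity { double value; };

// Deferred sum that keeps the feet terms and the meters terms separate,
// so the meters are converted to feet exactly once, at evaluation.
struct sum_expr
{
  double feet_part;
  double meters_part;

  quantity< feet_tag > evaluate_as_feet() const
  {
    quantity< feet_tag > result =
      { feet_part + meters_part * meters_to_feet_factor }; // single conversion
    return result;
  }
};

sum_expr make_sum() { sum_expr e = { 0.0, 0.0 }; return e; }

sum_expr operator +( sum_expr e, quantity< feet_tag > q )
{ e.feet_part += q.value; return e; }

sum_expr operator +( sum_expr e, quantity< meters_tag > q )
{ e.meters_part += q.value; return e; }

sum_expr operator +( quantity< meters_tag > a, quantity< feet_tag > b )
{ return make_sum() + a + b; }

sum_expr operator +( quantity< feet_tag > a, quantity< meters_tag > b )
{ return make_sum() + a + b; }

int main()
{
  quantity< meters_tag > meters1 = { 1.0 }, meters2 = { 2.0 };
  quantity< feet_tag > feet1 = { 3.0 }, feet2 = { 4.0 };

  // Like terms end up grouped, so only one meters-to-feet conversion occurs.
  sum_expr e = meters1 + feet1 + meters2 + feet2;
  std::cout << e.evaluate_as_feet().value << " feet\n";
}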

I also use a nodular approach to representing unit conversions. By this, I
mean that the "natural unit type" for length can be "meters." When the
"centimeters" unit type is created, it is represented as a conversion
between "meters" and "centimeters" directly. Likewise, when the "feet" unit
is created, its conversion to "meters" is represented directly. However, an
"inches" type can be expressed as a conversion to "feet." If you imagine
this type of system, it forms a tree-like structure with unit types as
nodes. Since the conversions are generally all known at compile-time and are
expressed using the compile-time mathematical types of MPL and metamath,
they are still combined and optimized together at compile-time.
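
A toy illustration of the node idea follows; the unit structs, the ratio_c
rational (which, unlike the real compile-time math, does no reduction), and
the to_natural metafunction are all invented for the example and are not the
library's machinery:

#include <iostream>

// Deliberately simplified compile-time rational (no reduction).
template< long N, long D > struct ratio_c { enum { num = N, den = D }; };

template< class A, class B > struct ratio_multiply
{
  typedef ratio_c< A::num * B::num, A::den * B::den > type;
};

// Each unit records its conversion to a parent node, not to the natural unit.
struct meters      { typedef meters parent; typedef ratio_c< 1, 1 >        to_parent; };
struct centimeters { typedef meters parent; typedef ratio_c< 1, 100 >      to_parent; };
struct feet        { typedef meters parent; typedef ratio_c< 3048, 10000 > to_parent; };
struct inches      { typedef feet   parent; typedef ratio_c< 1, 12 >       to_parent; };

// Walk the tree toward the natural unit, multiplying the factors together
// entirely at compile-time.
template< class Unit > struct to_natural
{
  typedef typename ratio_multiply<
      typename Unit::to_parent
    , typename to_natural< typename Unit::parent >::type
    >::type type;
};

template<> struct to_natural< meters > { typedef ratio_c< 1, 1 > type; };

int main()
{
  typedef to_natural< inches >::type inches_to_meters; // 3048 / 120000 = 0.0254
  std::cout << inches_to_meters::num << " / " << inches_to_meters::den << "\n";
}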

This is beneficial compared to a system which describes all conversions as
going directly back to the natural unit type of a classification of units,
for several reasons. For one, the conversions can often be expressed more
simply between closely related types (the conversion from inches to feet is
clearer than the one from inches to meters). In addition, the nodular
approach can prevent loss of precision, since conversion between normally
unrelated types often has to be an approximation. Finally, it is beneficial
when working with types such as money, where the conversion between
"pennies" and "dollars" is known at compile-time and the conversion between
two denominations of another country's currency is known at compile-time,
yet the conversion between American dollars and another unit of money may
change at runtime. Since a nodular approach to conversion is used,
conversion between related currencies goes through a compile-time
coefficient, whereas between two currencies of different countries the
conversion is created at runtime and can change at runtime. The downside of
the runtime conversion isn't paid unless you are actually converting between
two unrelated currencies, unlike a system where a nodular approach to
representing conversions is not used.
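
As a rough sketch of what I mean by mixing the two kinds of nodes (the names
and the rates here are invented, and the pennies-to-dollars node would really
be a compile-time coefficient rather than a function):

#include <iostream>

// Fixed node: the pennies -> dollars factor never changes.
struct pennies_to_dollars { static double factor() { return 0.01; } };

// Runtime node: the dollars -> other-currency rate can change while running.
struct dollars_to_other
{
  static double & factor() { static double rate = 0.9; return rate; }
};

int main()
{
  double pennies = 250.0;

  // Following the nodes: pennies -> dollars -> other currency.
  std::cout << pennies * pennies_to_dollars::factor()
                       * dollars_to_other::factor() << "\n"; // 2.25

  dollars_to_other::factor() = 0.95; // exchange rate changed at runtime

  std::cout << pennies * pennies_to_dollars::factor()
                       * dollars_to_other::factor() << "\n"; // 2.375
}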

Implementation-wise, I am using MPL maps to represent classifications and
unit types, unlike Andy Little's library if I remember correctly, providing
more compile-time-efficient type comparisons. I also provide a variety of
simple ways of creating new unit types and classification types using
compile-time metafunctions and variadic template argument lists. The only
thing I am worried about is that I am using a fairly large number of
advanced language features, and I'm currently not concerned about trying to
get it to work on non-compliant compilers.
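
For a rough idea of the sort of thing I mean, here is a generic MPL example
(the dimension tags and the acceleration_classification map are invented and
are not the library's actual representation):

#include <boost/mpl/map.hpp>
#include <boost/mpl/pair.hpp>
#include <boost/mpl/int.hpp>
#include <boost/mpl/at.hpp>
#include <boost/static_assert.hpp>

namespace mpl = boost::mpl;

struct length_dim {};
struct time_dim {};

// acceleration = length^1 * time^-2, stored as a map from base dimension
// to exponent.
typedef mpl::map<
    mpl::pair< length_dim, mpl::int_< 1 > >
  , mpl::pair< time_dim, mpl::int_< -2 > >
  > acceleration_classification;

// Looking up an exponent is an associative lookup rather than a linear scan
// over a sequence of types.
typedef mpl::at< acceleration_classification, time_dim >::type time_exponent;

BOOST_STATIC_ASSERT( time_exponent::value == -2 );

int main() {}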

Aside from everything I mentioned, there are also a few other interesting
things I'd like to talk about later, though I really wanted to get a working
implementation uploaded before getting into a big discussion about it. I'll
try to get expression rearrangement done by the end of the week and upload a
version for people to play with. Feedback on what I've said so far would be
welcome.

