
From: Michael Kenniston (Msk_at_[hidden])
Date: 2001-07-02 00:41:36
As the original instigator of this thread, it's probably time for me
to chime in (though it's also been fun to just sit back and watch).
Many thanks to all for the feedback on the Physical Quantity library
proposal. As I'd hoped and expected, you raised many issues that
had never even occurred to me. I've tried to merge some of the comments
together and respond to them grouped by topic.
Just to set the context, I've tried to be consistent throughout the
following in my use of the terms "units" and "dimensions" (feet and
meters are units; area and volume are dimensions). I also assume
that the basic units like meter are defined as inline functions like
"meter()" so that the optimizer has a fighting chance of doing what
you want it to.
 EXISTING PRACTICE 
Paul Baxter wrote:
> A possible solution that unfortunately isn't (currently) Boost license
> compatible may be found at
> http://www.fnal.gov/fermitools/abstracts/siunits/abstract.html
Greg Colvin wrote:
> What exactly is the problem with the license at
> ftp://ftp.fnal.gov/pub/siunits/README?
> It was my understanding that the author, Walter Brown, was
> intending to see this work into Boost and ultimately the
> next C++ standard.
I was not aware of the SIunits library (ironic, since my wife and I
routinely attend concerts and talks at Fermilab). That library is
quite comprehensive, including parameterization of numeric
representation type, parameterization of calibration (i.e. the
number used to represent a base unit internally), and about five
different models of the structure of the universe (e.g. includes a
model where time is measured in meters). I have emailed them to
find out their current status and plans with respect to boost and
am awaiting a reply.
As I read it, the stumbling block on the license is whether the
license file has to be included with executable distributions.
(Well, that and the "simple to read and understand" requirement).
If it is required, then SIunits isn't boostible, but I can't
tell from the license text whether that's what they meant.
I'll ask about that, too.
In the meantime, I'll continue to play with my (much less featureful)
implementation, since I need something simpler for a CUJ article
anyway to act as an understandable example. After fumbling down lots
of blind alleys, I was able to come up with code that would compile and
run under both g++ and MSVC. All it took was elimination of
out-of-class member definitions, elimination of template friend
declarations, transformation of all compile-time arithmetic into enum
definitions, and the addition of a couple of ugly helper classes. :( This in itself
might be a worthwhile contribution, since the SIunits documentation
claims only severely degraded functionality under MSVC.
Paul Baxter wrote:
> Actually they have several other packages [at Fermilab] that look
> extremely useful.
<snip>
> So much so that there is almost an uncanny parallel with Boost. It's a
> shame efforts aren't shared between the two projects to provide
> high-quality license free class libraries.
If we can get some collaboration going for one library, maybe it
will set a precedent.
Deane_Yang wrote:
> I have implemented template base classes representing
> units very successfully.
Would you be willing to share your code? Taking the best of three
different implementations should produce something pretty decent.
Deane_Yang wrote:
> Well, at this point it becomes embarrassingly little, especially if
> you use operators.hpp (My original implementations predated my
> knowledge of boost and do not use operators.hpp.). I can probably
> post the examples some time. They should be real short.
We won't mind. Short examples are easier to understand. :)
Besides, at this point we could all still have radically different
implementation ideas without realizing it. (E.g. the code snippet
posted by Corwin Joy was structured very differently from what I
had in mind.) Sometime in the next week I'll try to upload my
stuff as well.
Deane_Yang wrote:
> 1) Unit classes are easily implemented in terms of operators.hpp
Well, yes and no. Things get a little strange, for example
"length / time" is allowed, but "length /= time" is not,
because the type of "length / time" is "speed". Multiplication
has a similar odd property of the result type being different
from either of the operand types.
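
To make the "/" versus "/=" point concrete, here is a minimal sketch. It is not the proposed library: the `quantity` template, its two integer exponents, and the names `duration_t`, `speed_t` and `average_speed` are all assumptions made up for illustration.

```cpp
#include <type_traits>

// Hypothetical sketch: the exponents of length (L) and time (T) are part of the type.
template <int L, int T>
struct quantity {
    double value;  // magnitude expressed in the base units (meter, second)
    explicit quantity(double v) : value(v) {}
};

// Addition keeps the dimension, so a matching "+=" is easy to write.
template <int L, int T>
quantity<L, T>& operator+=(quantity<L, T>& a, quantity<L, T> b) {
    a.value += b.value;
    return a;
}

// Scaling by a plain number also keeps the dimension.
template <int L, int T>
quantity<L, T> operator*(double s, quantity<L, T> q) {
    return quantity<L, T>(s * q.value);
}

// Multiplication and division change the dimension, so the result type
// differs from both operand types.
template <int L1, int T1, int L2, int T2>
quantity<L1 + L2, T1 + T2> operator*(quantity<L1, T1> a, quantity<L2, T2> b) {
    return quantity<L1 + L2, T1 + T2>(a.value * b.value);
}
template <int L1, int T1, int L2, int T2>
quantity<L1 - L2, T1 - T2> operator/(quantity<L1, T1> a, quantity<L2, T2> b) {
    return quantity<L1 - L2, T1 - T2>(a.value / b.value);
}

typedef quantity<1, 0>  length_t;
typedef quantity<0, 1>  duration_t;  // named to avoid clashing with ::time_t
typedef quantity<1, -1> speed_t;

inline length_t   meter()  { return length_t(1.0); }
inline duration_t second() { return duration_t(1.0); }

static_assert(std::is_same<decltype(meter() / second()), speed_t>::value,
              "length / time has the type speed");

inline speed_t average_speed(length_t distance, duration_t elapsed) {
    // "distance /= elapsed" has no sensible meaning here: a length_t object
    // cannot turn into a speed_t, so only the binary form is provided.
    return distance / elapsed;
}
```

In this sketch "+=" exists because the left operand keeps its type, while "/=" between two dimensioned quantities is simply never declared. Helpers that generate binary operators from compound assignment therefore fit the additive operators much better than the multiplicative ones.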
 CASTING AWAY DIMENSIONS 
John Max Skaller wrote:
> At some stage,
> you need to 'cast away' dimensions, do a pure computation,
> then add the dimensions back.
> sin ( undim(x) ) metres()
I disagree, and in fact I claim that there should be no 'cast away'
operation defined at all. If you /really/ want to take the sine of
the number of meters or feet in the physical quantity "x", that's easy:
sin( x / meter() ) * meter()
or
sin( x / foot() ) * foot()
That seems like a strange thing to do, since the answer will be
completely different depending on whether you use meters or feet, but
 - you can do it, and
 - the syntax makes it clear exactly what you're doing, and
 - there should be no runtime cost; the optimizer should
   collapse this to a single call to sin(), with a
   multiply and divide if needed.
More realistically, for computing the sine of actual angles, you
could either define an overload:
inline double sin( plane_angle_t a ) { return std::sin( a / radian() ); }
or just do it in the argument list:
sin( a / radian() )
The syntax may look unusual, but it makes a lot of sense (at least
to me :). To wit: if you want to know how many 4's are in 12, you
divide 12 by 4. If you want to know how many radians are in a, you
divide a by (one) radian. As long as the optimizer is smart enough
to recognize that 1.0 is a multiplicative identity, it keeps the
expression clear at no runtime cost.
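
As a rough illustration of "divide by the unit to get the number", the sketch below assumes a stand-alone `plane_angle_t` that stores radians, wrapped in a made-up namespace `pq`. The representation and the `degree()` helper are assumptions for the example, not part of the proposal.

```cpp
#include <cmath>

namespace pq {

// Hypothetical angle type that stores its magnitude in radians.
struct plane_angle_t {
    double rad;
};

// Units are inline functions, so the optimizer can see the constants.
inline plane_angle_t radian() { return plane_angle_t{1.0}; }
inline plane_angle_t degree() { return plane_angle_t{3.14159265358979323846 / 180.0}; }

// number * unit builds a quantity.
inline plane_angle_t operator*(double s, plane_angle_t a) {
    return plane_angle_t{s * a.rad};
}

// quantity / unit answers "how many of this unit?" as a plain number.
inline double operator/(plane_angle_t a, plane_angle_t b) {
    return a.rad / b.rad;
}

// The overload hands std::sin the number of radians, nothing else.
inline double sin(plane_angle_t a) {
    return std::sin(a / radian());
}

}  // namespace pq
```

With these definitions `pq::sin(90.0 * pq::degree())` goes through radians automatically, and `x / pq::degree()` reads the same angle back as a count of degrees. There is no overload taking a length, so trying to take the sine of one would fail to compile.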
Peter Schmitteckert wrote:
> The general rule is, that the result must not depend on the unit
> chosen,
Exactly.
Kevin Lynch wrote:
> The main use I actually see for such functionality [undim] is to
> improve efficiency inside calculations.
Ah, but the beauty of templates is that you don't have to make such
sacrifices for efficiency. A proper design, together with a smart optimizer,
will generate the (fast) object code you really want from the (clear)
source code that you also really want. It remains to be proven that
we can actually achieve it, but certainly the goal is for the library
operations to be exactly as efficient (at runtime) as the underlying
fundamental types.
For interfacing with existing routines that expect plain numbers, you
can either do the necessary conversion at the point of call, or better
yet define wrapper functions that do the conversion correctly every time.
k.hagan wrote:
> We could do without unit casts if multiplication by,
> for example, the following constant could be optimised away
> at compile-time.
> length_t const length( 1 * meter() );
> I haven't peered far enough into the proposal to see if a compiler
> has a reasonable chance of spotting this.
The idea is that by defining "meter" as an inline function rather
than as a variable, this can be optimized properly. I still need
to brush up on my assembler to verify that this actually happens
as intended.
Corwin Joy wrote:
> 2. I often have to cast and format out reports in different
> units of measure depending on how the end users
> prefer to view their results.
No problem, just divide by the unit you want to use.
 LINEAR VS AFFINE QUANTITIES 
Deane_Yang wrote:
> There are basically two types of units [linear and affine]
Yes! I was already aware of this but never had the right terminology
to express it. The physical quantity library deals only with linear
quantities. Does yours also do affine?
By the way, another good example is MFC's CTime (affine) vs.
CTimeSpan (linear) in MSVC. A bad example is time_t in posix,
which confuses the issue by using the same type for both
affine and linear quantities.
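
A small sketch of the distinction, using two made-up types (`span_t` for the linear quantity, `instant_t` for the affine one). The names and the choice of seconds stored in a double are assumptions for illustration only.

```cpp
// Hypothetical types; both store seconds as a double.
struct span_t    { double seconds; };              // linear: a difference
struct instant_t { double seconds_since_epoch; };  // affine: a position

// Linear quantities can be added to each other and scaled.
inline span_t operator+(span_t a, span_t b) { return span_t{a.seconds + b.seconds}; }
inline span_t operator*(double k, span_t a) { return span_t{k * a.seconds}; }

// An affine quantity can be shifted by a linear one...
inline instant_t operator+(instant_t p, span_t d) {
    return instant_t{p.seconds_since_epoch + d.seconds};
}

// ...and the difference of two affine quantities is linear.
inline span_t operator-(instant_t a, instant_t b) {
    return span_t{a.seconds_since_epoch - b.seconds_since_epoch};
}

// Deliberately missing: instant_t + instant_t and k * instant_t.
// Neither has a meaning that is independent of where the epoch sits.
```

The same split shows up in later C++ as `std::chrono::time_point` (affine) versus `std::chrono::duration` (linear).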
Deane_Yang wrote:
> I see implementations of affine units in various specific cases,
> like iterators, but I think it's worth identifying the common
> abstract notion that guides all of them.
Absolutely. I think your description of the difference would be
a valuable addition to any documentation.
 DIMENSIONLESS QUANTITIES 
Deane_Yang wrote:
> It is obvious that the ratio of two quantities with the same units
> must be unitless. What is not always so obvious is that some "units"
> are really unitless quantities. The two I know about are percentage
> (which is obviously a ratio) and radians.
Another one is steradians (solid angle). I've been struggling to
decide the best way to handle these. Although they are technically
dimensionless, it might be very convenient in practice to treat
plane_angle and solid_angle as dimensions like mass and time. Then you could
handle radians and degrees exactly the same way as feet and meters,
and you wouldn't accidentally confuse plane angles with solid angles.
Quantities like revolutions per minute would have dimensions
plane_angle / time.
I think percentage could be handled more simply, by just defining
inline double percent() { return 0.01; }
Then you could express 12.5% as "12.5 * percent()", which is analogous
to expressing 12.5 m as "12.5 * meter()". If you want to express a
number as percent, you can say "x / percent()", just like if you want to
express a mass in kilograms you can say "x / kilogram()".
 TRIG FUNCTIONS 
Deane_Yang wrote:
> Any standard transcendental function, such as sin, cos <snip>
> MUST take only unitless quantities and return unitless quantities.
> If not, there is an error in the formula.
Yes. The only exception would be if plane_angle is treated as a
dimension.
Kevin Lynch wrote:
> If you want to take the sin(x) where [x] = length, then you
> are almost always doing something wrong
<snip>
> If I had my way, such a library would specifically generate
> compile-time errors if you tried to apply a special function or
> logarithm to a quantity with dimensions; in such a case, the
> programmer is undoubtedly "wrong" in what they are trying to do, and
> the library should provide you no assistance in trying to "fix" the
> problem. If you really really think you need to, you should have to
> do so explicitly, it should be made ugly and horrible so as to
> indicate the seriousness of the breach
If I have my way, you'll get your way.
 POWERS AND ROOTS 
John Max Skaller wrote:
> Some operations seems to be missing.
> Suppose I write:
> pow(x,y)
> and x,y have units. What is the result?
Yes, a pow() function would be useful; I just hadn't gotten around to it
in the initial prototype. Perhaps it should actually be "pow< N >( pq )",
where N is an integer and pq is a physical quantity. The exponent has
to be a template parameter because it affects the type of the result:
"pow< 2 >( some_length )" is an area but "pow< 3 >( some_length )" is a
volume. Similarly, "pow< -1 >( some_time )" is a frequency. I see no
obvious meaning for pow(x,y) if /both/ x and y have units, but if
someone else does I'm sure they'll let me know.
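
One way "pow< N >( pq )" might look, as a sketch only. The two-exponent `quantity` template, the namespace `pq` and the typedef names are assumptions invented for this example.

```cpp
#include <type_traits>

namespace pq {

// Hypothetical quantity carrying integer exponents of length (L) and time (T).
template <int L, int T>
struct quantity {
    double value;
    explicit quantity(double v) : value(v) {}
};

typedef quantity<1, 0>  length_t;
typedef quantity<2, 0>  area_t;
typedef quantity<3, 0>  volume_t;
typedef quantity<0, 1>  duration_t;
typedef quantity<0, -1> frequency_t;

// The exponent N is a template parameter because it decides the result type:
// every dimension exponent is multiplied by N.
template <int N, int L, int T>
quantity<L * N, T * N> pow(quantity<L, T> q) {
    double r = 1.0;
    for (int i = 0; i < (N < 0 ? -N : N); ++i) r *= q.value;  // |N| multiplications
    return quantity<L * N, T * N>(N < 0 ? 1.0 / r : r);       // negative N: reciprocal
}

static_assert(std::is_same<decltype(pow<2>(length_t(1.0))), area_t>::value,
              "length squared is an area");
static_assert(std::is_same<decltype(pow<3>(length_t(1.0))), volume_t>::value,
              "length cubed is a volume");
static_assert(std::is_same<decltype(pow<-1>(duration_t(1.0))), frequency_t>::value,
              "reciprocal time is a frequency");

}  // namespace pq
```

Only N is spelled out at the call site, as in `pq::pow<2>(side)`; L and T are deduced from the argument.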
John Max Skaller wrote:
> A more pragmatic example is Pythagoras:
> sqrt(x * x + y * y)
Yes, sqrt( pq ) seems reasonable too, and Pythagoras is a great
motivating example to include in the documentation.
John Max Skaller wrote:
> More generally, root(n,x) yields division
> of the dimension by n (and an error if it isn't
> integral - unless you want to support fractal dimensions :).
Right. (I assume you meant "fractional dimensions". This late at
night my mind can't bend far enough to encompass fractal dimensions.)
 EXP AND LOG FUNCTIONS 
John Max Skaller wrote:
> So log(n,x) also yields division. (Guess?)
Sorry, I lost you on that one. Do you want to take the base N log
of a physical quantity? The sqrt of an area I can handle, but
what's the logarithm of a kilogram?
Deane_Yang wrote:
> Any standard transcendental function, such as <snip> exp, log,
> MUST take only unitless quantities and return unitless quantities.
> If not, there is an error in the formula.
Yes.
Corwin Joy wrote:
> I'm not sure I agree with sin & cos, but I definitely *disagree* for
> logs.
> It seems to me that the log of a unit has well defined units.
but Deane_Yang wrote:
> I've never seen exp() and log() used to compute areas like this.
> And I've never seen log(meters) used anywhere in physics or anywhere
> else.
<snip>
> There are lots of
> logarithms and exponentials used in discounting with interest rates
> and in option pricing models.
> If you look at it all carefully, you'll find that the arguments to
> exp() and log() are always unitless and so are the exp()'s and log()'s
> themselves. In fact, this is the setting where my unit
> classes have been most useful.
Ok, I can see Corwin's point that mathematically such things are
possible and can be well-defined, but given the complexity that they
would add to any library, I'm inclined to agree with Deane and would
want to see compelling evidence that such things are needed in practice
before trying to add them.
Kevin Lynch wrote:
> However, I've never seen an expression with logarithmic units.
And with a little luck, I never will either.
John Max Skaller wrote:
> Sure you have: how about 100 decibels? :)
Deane_Yang wrote:
> There are logarithmic units such as decibels and pH. I'm not familiar
> with how these are used in formulas. I think these are sufficiently
> strange so that they should be treated specially.
Ugh. So much for my luck today. However, I'm not convinced that
a decibel is really a "unit" in the usual sense. It certainly
doesn't behave like a unit, since you cannot add or multiply them
the way you can with other units. Dividing a cubic meter by a
kilogram makes some kind of sense, but multiplying an ampere by
a decibel? Perhaps "decibel" should be its own class or function
or something, included in the library but completely separate
from the rest of it. The same for pH, earthquakes (Richter scale),
etc.
Bill Seymour wrote:
> I bring this up in connection with the SI Units presentation
> that the Boost group got in Copenhagen, in which units like
> "meter" are not dimensions, but multipliers;
This sounds interesting, and since I wasn't in Copenhagen at the
time :) I'd love to hear about this presentation. Can anyone
provide a pointer to more information about it?
Bill Seymour wrote:
> and it occurs
> to me that units like dBm and dBk need to be addends rather
> than multipliers. Does the SI Units system being worked on
> have a way to deal with that?
Mine doesn't; I'm not sure yet about Fermilab's SIunits. It would be
nice if the library could help with dB, pH, etc. but I need to
think about this some more to see how it might work.
Toon Knapen wrote:
> In acoustics the multiplier (in your case 10) for the log
> function is '20' for some specific quantities.
> Also in acoustics one has dBA, dBB and dBC, which are dB's
> that are filtered. The idea behind this is that one type
> of dB scale is more sensitive in the ranges heard by the human
> ear, whereas the other .... Taking these into account
> is even much more difficult as the filter needs to be known
> to the mechanism.
Maybe we need to make a distinction between "units", which
behave like mathematical units ("1"), and measurement scales,
which simply define a mapping between labels and quantities.
Every unit has an associated well-behaved linear scale, but
some scales are weird and need special handling.
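
To show what treating a decibel as a "scale" rather than a "unit" might mean, here is a hedged sketch. The `log_scale` struct and both helper functions are made-up names; the factors 10 (power-like quantities) and 20 (amplitude-like quantities) follow the usual decibel definitions. Filtered scales such as dBA are not covered.

```cpp
#include <cmath>

// Hypothetical logarithmic measurement scale: it maps a dimensionless ratio
// (value / reference) to a level and back. It is not a unit that can be
// multiplied into other units.
struct log_scale {
    double factor;  // 10 for power-like ratios, 20 for amplitude-like ratios

    double to_level(double ratio) const { return factor * std::log10(ratio); }
    double to_ratio(double level) const { return std::pow(10.0, level / factor); }
};

inline log_scale power_decibel()     { return log_scale{10.0}; }
inline log_scale amplitude_decibel() { return log_scale{20.0}; }
```

Adding two levels on the same scale corresponds to multiplying the underlying ratios, which is why such scales behave like addends rather than multipliers. The ratio itself can come from an ordinary dimension-checked division such as "power / reference_power".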
 FRACTIONAL EXPONENTS 
Kevin Lynch wrote:
> > (and an error if it isn't
> > integral - unless you want to support fractal dimensions :).
>
> This would be a mistake. Non-integer dimensions come up all over the
> place in physics (particularly astrophysics), although they are
> usually rational fractional powers (3/2, 9/7, etc...). Typically, you
> have an expression that goes like alpha T^n, where n = 7/2 (or
> something), and the units of alpha are the same as T, but raised to
> some power like 5/2. Consider also an expression written as
>
> sqrt(x)*sqrt(y)
>
> where x and y have units of length. This type of thing should not
> fail. Yes, you could require the user to rewrite it, but consider
> that such an expression might be embedded into some larger equation,
> and it is not immediately obvious that it needs to be rewritten. Or,
> as is often the case in numerical work, you have to torture an
> equation so as to ensure that you don't incur overflow inside a
> calculation when the outermost result should be representable. This
> can lead to the same unit situation.
Now there's a whole new can of worms. Would it work to allow just
rational dimensions? Then all the power and root stuff could be put
into a single function "pow< Num, Den >( pq )". Then
inline sqrt( pq ) { return pow< 1, 2 >( pq ); }
and
inline cube( pq ) { return pow< 3, 1 >( pq ); }
(Yes, that's a very inefficient way to implement cube; it's just an
example.)
Of course this would add a whole slew of new template parameters to the
types, but if the denominators are defaulted to 1 most people should
never notice. I hope.
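
A sketch of how rational exponents could work, under some loud assumptions: it tracks a single dimension (length) for brevity, lives in a made-up namespace `pq`, and leans on `std::ratio` from later C++ standards to keep fractions reduced, where a 2001 compiler would need a hand-written numerator/denominator pair.

```cpp
#include <cmath>
#include <ratio>
#include <type_traits>

namespace pq {

// Hypothetical one-dimensional quantity: LengthExp is a reduced std::ratio.
template <class LengthExp>
struct quantity {
    double value;
    explicit quantity(double v) : value(v) {}
};

typedef quantity<std::ratio<1> > length_t;
typedef quantity<std::ratio<2> > area_t;

// Multiplying quantities adds exponents; "::type" yields the reduced fraction.
template <class E1, class E2>
quantity<typename std::ratio_add<E1, E2>::type>
operator*(quantity<E1> a, quantity<E2> b) {
    return quantity<typename std::ratio_add<E1, E2>::type>(a.value * b.value);
}

// pow< Num, Den > multiplies the exponent by Num/Den.
template <int Num, int Den, class E>
quantity<typename std::ratio_multiply<E, std::ratio<Num, Den> >::type>
pow(quantity<E> q) {
    typedef typename std::ratio_multiply<E, std::ratio<Num, Den> >::type exp_t;
    return quantity<exp_t>(std::pow(q.value, static_cast<double>(Num) / Den));
}

template <class E>
quantity<typename std::ratio_multiply<E, std::ratio<1, 2> >::type>
sqrt(quantity<E> q) {
    return pq::pow<1, 2>(q);
}

// sqrt(area) is a length, and sqrt(x)*sqrt(y) with x, y lengths is a length too.
static_assert(std::is_same<decltype(pq::sqrt(area_t(1.0))), length_t>::value,
              "sqrt of an area is a length");
static_assert(std::is_same<decltype(pq::sqrt(length_t(1.0)) * pq::sqrt(length_t(1.0))),
                           length_t>::value,
              "length^(1/2) * length^(1/2) is a length");

}  // namespace pq
```

The intermediate type of `pq::sqrt(x)` has exponent 1/2, which is exactly the fractional dimension from Kevin's example. It only needs to exist inside the expression; the product comes back out as an ordinary length.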
 DYNAMIC CHECKING 
Corwin Joy wrote:
> I think that having a nice templated units library
> would be cool. One thing that seems to be missing
> from this discussion, however, is that the templates
> should have virtual base classes which expose operator
> + and * such that developers can
> use the units classes for either compile time checks or runtime
> checks.
I was actually planning to add an additional "any_quantity"
class (with conversions between that and the templatized
quantities), whose dimensions would be represented and
checked at runtime. I.e. the set of all possible values of an
"any_quantity" would be the union of the set of all possible
values of all possible template instantiations of the
templatized "quantity" type. I agree that in certain situations
such a thing would be useful and even necessary.
However, I'm confused about your reference to virtual base classes.
Did you mean "base class with virtual methods"? Even then, I
think it would work better to keep the templatized version
completely separate to avoid the runtime space and time costs of
polymorphism.
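
Roughly what I mean by a separate runtime-checked type, as a sketch. The field layout, the helper names `to_any` and `to_static`, and the choice of `std::domain_error` are all assumptions for illustration.

```cpp
#include <stdexcept>

// Compile-time checked quantity (hypothetical, two dimensions only).
template <int L, int T>
struct quantity {
    double value;
    explicit quantity(double v) : value(v) {}
};

// Runtime-checked counterpart: the exponents travel with the value.
struct any_quantity {
    double value;
    int length_exp;
    int time_exp;
};

inline bool same_dimensions(const any_quantity& a, const any_quantity& b) {
    return a.length_exp == b.length_exp && a.time_exp == b.time_exp;
}

// Addition has to check dimensions at runtime and can fail.
inline any_quantity operator+(const any_quantity& a, const any_quantity& b) {
    if (!same_dimensions(a, b)) throw std::domain_error("dimension mismatch in addition");
    return any_quantity{a.value + b.value, a.length_exp, a.time_exp};
}

// Division never fails; it just subtracts exponents.
inline any_quantity operator/(const any_quantity& a, const any_quantity& b) {
    return any_quantity{a.value / b.value,
                        a.length_exp - b.length_exp,
                        a.time_exp - b.time_exp};
}

// Static -> dynamic always succeeds.
template <int L, int T>
any_quantity to_any(quantity<L, T> q) {
    return any_quantity{q.value, L, T};
}

// Dynamic -> static is checked once, at the boundary.
template <int L, int T>
quantity<L, T> to_static(const any_quantity& a) {
    if (a.length_exp != L || a.time_exp != T)
        throw std::domain_error("dimension mismatch in conversion");
    return quantity<L, T>(a.value);
}
```

The templatized `quantity` stays a bare double with no virtual functions, so code that knows its dimensions at compile time pays nothing for the existence of `any_quantity`.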
Corwin Joy wrote:
> 1. I'm reading the values and units from a database and I
> don't always know what possible units are allowed in advance.
This brings up another whole aspect, which is dimensions and
units that are not even known at compile time. Allowing these
would make it awfully hard to meet strict efficiency constraints.
Corwin Joy wrote:
> 3. The unit conversion ratios for me will often change at runtime,
> potentially minute-by-minute
> e.g. how much is a Mexican Peso worth in US $ today, (exchange rate)
This brings up /another/ whole aspect, which is non-constant
conversion factors. I understand why this could be useful, but I
think allowing such things would significantly enlarge the scope of
the problem, and make it correspondingly harder to solve cleanly.
Deane_Yang wrote:
> I've thought about this, too. But allowing dynamic types to be defined
> at runtime makes things way too complicated for me.
If we define a single supertype that subsumes all the individual
templatized types, we can get the desired effect fairly simply.
Deane_Yang wrote:
> I concede that foreign currencies are a bit
> of a headache to implement using template unit classes.
I certainly won't argue with that.
 MISC REMARKS 
Deane_Yang wrote:
> I've been reading and learning a lot from the boost list.
Me too.
Kevin Lynch wrote:
> You wound me good sir! We physicists would never traffic in
> "kludges" :)
Perish the thought! Software engineers never use kludges either;
we call them "workarounds" and "optimizations".
 FEATURE LIST 
There is now quite a shopping list of possible capabilities of a
dimensions-and-units library:
 - full compile-time dimension and unit checking
 - implementation imposes no runtime overhead
 - include powers and roots
 - parameterize numeric representation
 - parameterize calibration
 - include multiple models of the universe
 - handle fractional dimensions
 - handle logarithmic/exponential dimensions
 - include radian and steradian as units
 - include (and distinguish between) affine and linear measurements
 - include logarithmic units like decibel
 - fully functional under widely-used but non-compliant compilers like
   MSVC
 - dynamic: dimensions of a value determined at runtime, e.g. for input
 - dynamic: set of possible dimensions determined at runtime
 - dynamic: conversion factors vary at runtime, e.g. currencies
 - track accuracy (the "plus-or-minus" amount) through calculations
 - check for quantities that are different but have the same dimensions
On the one hand, it's not at all clear that all this stuff should be
included in a boost library, lest it suffer from "kitchen-sink syndrome".
A boost library does not have to be all things to all programmers.
On the other hand, I'm inclined to believe that those who would actually
use the library (who I originally believed to be physicists and
engineers, but apparently currency traders want a piece of this action
too) know more about what they need than I do, and I'd give great weight
to their opinions.
I do believe that three subtly different philosophies of what a
"quantity" library could do are emerging:
1) Dimension-checking. Convert everything to powers of the seven
fundamental SI units (plus perhaps radians and steradians) and make
sure all calculations are dimensionally consistent.
This won't catch all errors, but it's pretty simple to describe
and verify, and it will catch enough problems to be useful. It's
what I originally had in mind, although of course it's turning
out not to be /quite/ as simple as it first appeared.
2) Type-checking. Keep track of what is actually being measured,
including distinguishing types that are conceptually different but have
the same SI dimensions (e.g. frequency/hertz and activity/becquerel,
or absorbed dose/gray and dose equivalent/sievert). Radians and
steradians might actually fall under this category, too.
This is harder, since somehow you have to define what conversions
are acceptable, and you have to deal with all the types that are
generated implicitly inside expressions. At this point I'm not
sure how to even define the requirements precisely.
3) Unit-checking. Keep track of the actual units, like yen and
euros, so everything is totally dynamic. This is way beyond
where I want to go right now, and it may well be mutually
exclusive with the efficiency constraints wanted/needed by the
astrophysicists and the "wires-and-pliers" crowd.
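
For philosophy (2), one conceivable approach is an extra "kind" tag on top of the dimensions. The sketch below is only that: the tag structs, the `per_second` template and `kind_cast` are invented names, and it sidesteps the hard question of what type an arbitrary product or quotient of tagged quantities should have.

```cpp
#include <type_traits>

// Hypothetical tags for two quantities that share the SI dimension 1/s.
struct frequency_tag {};  // measured in hertz
struct activity_tag {};   // measured in becquerel

template <class Kind>
struct per_second {
    double value;
    explicit per_second(double v) : value(v) {}
};

typedef per_second<frequency_tag> frequency_t;
typedef per_second<activity_tag>  activity_t;

inline frequency_t hertz()     { return frequency_t(1.0); }
inline activity_t  becquerel() { return activity_t(1.0); }

template <class Kind>
per_second<Kind> operator*(double s, per_second<Kind> q) {
    return per_second<Kind>(s * q.value);
}

// Addition only compiles when both operands are the same kind.
template <class Kind>
per_second<Kind> operator+(per_second<Kind> a, per_second<Kind> b) {
    return per_second<Kind>(a.value + b.value);
}

// Crossing kinds must be requested by name.
template <class To, class From>
per_second<To> kind_cast(per_second<From> q) {
    return per_second<To>(q.value);
}

static_assert(!std::is_same<frequency_t, activity_t>::value,
              "same dimension, different types");
```

Here "50.0 * hertz() + 10.0 * becquerel()" is rejected at compile time even though a pure dimension check would accept it; `kind_cast<activity_tag>( f )` is the explicit escape hatch.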
In a completely different context David Abrahams wrote:
> be careful not to overdesign this library. Simpler is usually better.
Yes. My inclination at this point is to (try to) strike a balance
between simplicity and completeness by getting a fairly flexible but
relatively simple, straightforward library (by either boostifying
existing code or writing new code) that will be useful and efficient
for the majority of common scientific and engineering calculations,
and then, maybe, go back and look at the more arcane stuff.
  Michael Kenniston mkenniston_at_[hidden]