From: Andy Little (andy_at_[hidden])
Date: 2006-06-10 12:58:35


Hi Janek,

"Janek Kozicki" wrote
> Hello,
>
> I'm posting this review a bit after the deadline, and I'm very sorry about
> that. This is caused by the fact that I didn't know that all mailing list
> members can write a review. When I learned about that, I had only about 2 days
> left, but still I wanted to perform a deep analysis of what is in pqs.
>
> ---------------------------------------------------
> * Do you think the library should be accepted as a Boost library?
>
> Yes, I think it should. The scientific community and industry are in
> desperate need of such a library; rejecting it just because it's not perfect
> would be a very bad decision. It's not uncommon for boost libraries to evolve
> once included, and I believe that this will be the case here. Moreover,
> including it will greatly increase the amount of feedback received, and as a
> result the library will reach the "perfect" state much faster. After
> performing the checks (described below) I'm sure that the library is ready
> for inclusion.

Great. Thanks for the yes vote.

> Note: below I often refer to yade ( http://yade.berlios.de/ ) which is a
> framework for performing scientific simulations. Yade is my current project,
> and a part of my PhD.

Yes, I know yade, or at least I have watched some sample videos. It's a great project.

> I have done tests of using pqs inside yade, performed
> speed benchmarks and unit tests.

OK. I haven't done anything but very minor performance tests on pqs, so
that will be interesting.

> ---------------------------------------------------
> * What is your evaluation of the documentation?
>
> Before trying pqs I've read almost the whole documentation (skipped a few pages).
> So those comments and suggestions below were written before I actually tried
> using pqs:
>
>
> As stated by other reviewers the introduction is a bit off-topic. The summary
> given by the review manager fits here best. In "Definition of terms" a UML
> diagram would be nice, or at least some of those diagrams that are
> automatically generated by doxygen. Later I discovered that in the html
> version there is a diagram, but not in the pdf version. I wonder if any other
> reviewer would complain less if he saw this diagram in the pdf. Downloading
> the pdf was much easier than unpacking the library and opening index.html
> from it ;)

Yes it was a bit too easy!

> I'd prefer a different name for coherent_unit and incoherent_unit, for
> example
> si_unit and non_si_unit.

OK.

> Several terms ( 'dimensional analysis', 'dimensionally_equivalent',
> 'dimensional_math', 'dimensionless', 'dimensionful', 'physical_quantity',
> 'physical_quantity_system', 'rational_number', 'unit_output_symbol',
> 'unit_symbol' ) do not appear on the Entity Relationship Diagram. Therefore
> their meaning is unclear to me: classes, methods or concepts? If they are
> concepts then underscore ("_") shouldn't be used in their names. Method names
> can be written in the ERD diagram inside the rectangle of a given class,
> separated by a horizontal line (UML notation), so that it would be possible
> to look up methods in the diagram.

OK. I need to study UML for diagrams

> Also a reference-style listing of all classes and methods in the library
> would be nice (this kind of documentation is usually auto-generated by
> doxygen). This listing would have references to the correct places in the
> synopsis. It would also be *very* useful if, at the end of the doc, there was
> a referenced list of all examples in the manual, and also a list of all
> pictures.

OK.

> In the Rationale:Clarity section the example with PQ_Box should be written
> out in full, 'ready to compile' (because it is the first longer example that
> appears in the manual).

OK.

> Also IMHO the Rationale section should be put before Definition of terms.
> Usually the definition of terms is put at the beginning, but usually it is
> much shorter. A casual reader who doesn't know what he should read will read
> from top to bottom - and in this respect it's more important that he reads
> the Rationale first. Because if he reads Definitions first - he may lose his
> enthusiasm due to the fact that definitions are _always_ boring (no matter
> how hard you try to make them interesting).

IMO the optimal layout for documentation is a hierarchical (or star) structure
rather than a serial one. In using Quickbook I traded convenience for
flexibility. OTOH I guess the serial style works for PDFs.

> I prefer "per": m_per_s, etc...

OK. I think the majority do too.

> In pdf page 3, missing space: "The keyword typeofis used".
>
> In pdf page 5, missing space: "distinguish between different but
> dimensionally_equivalentabstract_quantities."

QuickBook (1.1?) is to blame for that one AFAICS, by concatenating adjacent links.
(Since then I have discovered how to add explicit spaces.)

> In pdf page 18 there is a reference to "Example 3a", but all examples are
> unlabelled, so I can only detect from the context of the sentence which
> example is actually being referred to. It would be useful to label all
> examples (and make a full list of all examples at the end).

That shows the vintage of the example. It's copy-pasted from Example 3a in the
pqs 2 series docs.
Yes, all examples should be numbered. In fact stuff such as sections,
paragraphs, diagrams etc. should be numbered.

> In pdf page 23 a comment is too long to fit on the page.
>
> In pdf page 28 - mistaken underline.
>
> In pdf page 32 - I prefer the synopsis written in valid C++; for example a
> simple copy'n'paste from the file t1_quantity.hpp would work. In fact I had
> difficulty understanding the line "if IsDimensionless<abstract_quantity>", and
> the lines following it, until I guessed that it's some kind of C++ pseudocode.

I could make it clearer and maybe try to follow conventions too. Maybe I should
do it like GIL:

http://opensource.adobe.com/structPoint2DConcept.html

> In pdf page 51 - mistaken underline.

I think that might be a problem in the Quickbook --> pdf chain.

> Word "useage" hurts my eyes. Heh, and vim's spellchecker underlines it as a
> bad word ;)

OK, bad spelling.

> ---------------------------------------------------
> * What is your evaluation of the implementation?
>
> More constants, including mathematical ones, would be nice.

The selection included is just for demonstration. All the constants in SP961
should be included:

http://physics.nist.gov/cuu/pdf/chart1.pdf

> I don't like the separation of "three_d" and "two_d" - where quaternions
> should
> be put then - "four_d" ?

My rationale is that quaternions exist in a 3D
space. OTOH complex would be in two_d because it exists in a 2D space.

> I've gone through similar problems in my own design
> (just browse older revisions of yade in the subversion repository
> http://tinyurl.com/lgk8t ), and finally I found it most convenient to put
> vector (both 2d and 3d), quaternion and matrix inside "math". It was simply
> too cumbersome to type all those lengthy namespace names everywhere (and I
> used various variants, including the 2d / 3d distinction). Therefore I
> strongly suggest simply using namespace "math" and putting vector2d,
> vector3d, quaternion and matrix inside it. Extra namespaces increase
> fragmentation.

OK.

> Also please kindly consider the following examples - which name is more
> descriptive and shorter?
>
> three_d::vect
> math::vector3d
>
> OK, my favourite name is one letter longer, but still - "math" says a lot
> more than "three_d", so this name is more descriptive, IMHO.

Maybe even lose math:: ?

> Please be verbose; please rename vect to vector3d and vector2d when putting
> them into a single namespace. Also I prefer the name quaternion over quat.

OK.

> I'm sorry about elaborating so much about this topic - that's because it's
> close to my heart. In fact I can adapt to any library layout which Andy
> chooses, but what do others think?

IMO it makes sense to put 3D entities in a 3D space and 2D entities in a 2D
space, but it's not critical to me either way.

> I believe that in matrices there is no need to support both row-column and
> column-row notation explicitly, but rather to keep internal compatibility
> throughout the whole library. Based on that I'd strongly suggest renaming
> rc_matrix to matrix, and never implementing cr_matrix. There is an
> interesting reasoning about this problem inside the panda3d software manual
> (given time I'll find it later).

It's certainly less work sticking to one type of matrix, so I don't object to
that. There may be performance reasons to choose one or the other type?

> It's nice to see a timer included. (but missing from the docs?)

It was a bit of fun, intended for simple performance timing, but it should be
documented, yes.

> It would be nice if t1_quantity was renamed to some descriptive word.
> Perhaps a discussion on the mailing list would help to discover a better word.

Perhaps just quantity1, quantity2, quantity3 or quantitya, quantityb, quantityc.

> I did not delve as much as I would have wanted into the implementation of
> t1_quantity, due to lack of time. I noticed that there are two similar files,
> length.hpp and length_.hpp, but there is no force_.hpp, or any other file
> with _ at the end. This puzzles me a bit.

Oops...length_.hpp should have been removed from the package.

> ---------------------------------------------------
> * What is your evaluation of the design?
>
> The design as shown on the Entity Relationship Diagram
> ( /libs/pqs/doc/html/pqs/pqs_erm.html ) is so clear that I can hardly imagine
> making a better design. It must be noted however that I've never written my
> own units library.

I need to redo it using the correct symbols maybe, and see whether it can be
used for systems other than SI or not.

> I'm not sure if fixing the number of dimensions to exactly 7 is a good
> decision; perhaps it could be changed at a later date, after the library is
> included in boost. Those extra dimensions would be non-si of course, but
> could be useful for some people (storage capacity of a harddrive comes to
> mind, or a unit of information measured in libraries of congress ;).

I think the number of dimensions could be configurable. This is dependent on PQS
becoming superseded too though, because PQS is only useful in SI IMO.
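
For what it is worth, here is a rough, untested sketch of what a configurable
dimension count could look like. It is not the current pqs code and the names
are made up; the base dimensions are held as a compile-time list of exponents
rather than a fixed seven (written with C++11 variadic templates, but the same
idea is possible with MPL type lists):

// not pqs: dimensions as a compile-time list of exponents, so the number
// of base dimensions is whatever the list says rather than a fixed seven
template <int... Exponents>
struct dimension {};

// multiplying two quantities adds the exponents element-wise,
// however many base dimensions the system declares
template <int... A, int... B>
dimension<(A + B)...> mul(dimension<A...>, dimension<B...>)
{
    return dimension<(A + B)...>();
}

// an eight-dimension system: the SI seven plus, say, information
typedef dimension<1,0,0,0,0,0,0,0> length_d;
typedef dimension<0,0,1,0,0,0,0,0> time_d;
typedef dimension<0,0,0,0,0,0,0,1> information_d;

int main()
{
    length_d l;
    time_d t;
    mul(l, t);    // yields dimension<1,0,1,0,0,0,0,0>
    return 0;
}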

> It's good that Andy plans to implement vector, matrix and quaternion
> operations that will perform the necessary unit checking during computation
> (all at compile time!). My concern however is that the Finite Element Method
> uses large matrices (for example with 1000 columns), and I'm sure that users
> will prefer to stick with ublas and all the functionality provided by
> libraries they are already using. Instead of a more general (and perhaps
> impossible) solution that would allow the use of pqs with any other math
> library, I recommend investigating tight integration with
> boost::numeric::ublas. I'm sure that the people responsible for that library
> will gladly accept the idea, and accept any patches related to this.
> Basically this would make boost::ublas a 'sort of' part of pqs.
> Another solution would be to reimplement such huge matrices in pqs, but how
> far could Andy go?

I don't know if it's possible. Another option is to cast the quantities to
numerics, IOW exit the type checking. I will have to ask on the ublas list.
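
Just to sketch what I mean by exiting the type checking at the boundary
(rough and untested, and the quantity type and unit tags below are stand-ins,
not pqs names): strip the units going in, let ublas do the heavy work on plain
doubles, and put the units back on the result by hand.

#include <boost/numeric/ublas/matrix.hpp>
#include <boost/numeric/ublas/vector.hpp>

namespace ublas = boost::numeric::ublas;

template <class Unit>
struct quantity { double value; };   // stand-in for a checked quantity

struct newton {};                    // stand-in unit tags
struct metre  {};

int main()
{
    // the stiffness matrix holds plain numbers once the units are stripped
    ublas::matrix<double> k(2, 2);
    k(0,0) =  2.0; k(0,1) = -1.0;
    k(1,0) = -1.0; k(1,1) =  2.0;

    quantity<metre> u0 = {0.01}, u1 = {0.02};   // displacements

    ublas::vector<double> u(2);
    u(0) = u0.value; u(1) = u1.value;           // strip the units...

    ublas::vector<double> f = prod(k, u);       // ...do the work in ublas...

    quantity<newton> f0 = {f(0)}, f1 = {f(1)};  // ...and re-attach them by hand
    (void)f0; (void)f1;
    return 0;
}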

> ---------------------------------------------------
> * With what compiler did you try the library?
>
> gcc version 3.3.5 (Debian 1:3.3.5-13)
>
> I have compared compilation times with and without pqs. The whole of yade
> consists of over 20000 lines of C++ and it compiles in 10 minutes on my AMD
> 4400+ (when I'm too lazy to use distcc, which can reduce this time to 3
> minutes - ideal when I'm in a hurry). Because pqs is just in one small module
> (about 100 lines of code) I measured the time for compiling only this module.
> I took 5 measurements and computed the average:
>
> without pqs: 3.8 seconds
> with pqs : 4.8 seconds
>
> That gives about a 21% increase. It would mean that if I used pqs in the
> whole of yade, 10 minutes could possibly become 12 (or 4 minutes with
> distcc). Not much difference, I think. Full recompiles do not happen often,
> anyway.

It is interesting to see the comparison. I spent some time trying to keep
compile times down.

> ---------------------------------------------------
> * Did you try to use the library? Did you have any problems?
>
> First I compiled some of the examples without any problems at all.
> When working with the first example I discovered that it would be nice if the
> include path for headers was shorter, like the following:
>
> #include <boost/units/length.hpp> // to get length
> #include <boost/units/io/length.hpp> // or sth similar to get IO for length.
> (input.hpp is OK too)

I think that is better than the current arrangement.

> When working with the examples I decided to add quaternions to pqs, because
> in yade quaternion calculations play a crucial role. I used three_d::vect as
> a base source file. I was really pleasantly surprised by how easy adding them
> turned out to be! I was afraid that some unexpected obstacle would stop me on
> my way (which would cause this review to be sent after the deadline of 9 June
> (oops, it's after the deadline now!)). I've sent this quaternion
> implementation to Andy privately; I hope that it will be included.

Yes, thanks! I am currently digesting it!

> It lacks many commonly used methods, but the base foundation for it is
> prepared, which I consider a success of the pqs design. I should stress again
> that I did virtually nothing to make those quaternions support units - those
> units were supported right away. The only thing I didn't know (but didn't
> consider important - as I spent less than 2h to complete the task) was how to
> define the return type of a squared unit (used by the method
> squared_magnitude() ).

This currently just uses the binary_operation function as elsewhere, but it may
be simpler with Boost Typeof.
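
As a rough illustration of the kind of deduction I mean, written with C++11
decltype (Boost Typeof would play the same role on older compilers). vec3 here
is a made-up stand-in, not three_d::vect:

#include <iostream>
#include <utility>

template <class T>
struct vec3
{
    T x, y, z;

    // the squared magnitude has whatever type T*T has, so a vector of
    // lengths gives an area-typed result without naming that unit here
    auto squared_magnitude() const
        -> decltype(std::declval<T>() * std::declval<T>())
    {
        return x * x + y * y + z * z;
    }
};

int main()
{
    vec3<double> v = {1.0, 2.0, 3.0};
    std::cout << v.squared_magnitude() << '\n';   // prints 14
}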

> Then I started to adapt yade to pqs. I decided to use units only in one
> module - the one where the actual calculation loop for the selected
> simulation resides (the most crucial part of the whole simulation). Yade is
> capable of modelling various models, so I picked the most casual one - a
> simple DEM simulation: spheres falling onto a flat plane below:
> http://tinyurl.com/hjy9g . Pqs was used only in the simulation loop, so I
> expected extra slowdown due to converting data to/from physical quantities at
> the start/end of each loop. However it was simpler than converting all the
> data structures in yade. During the process I needed a unit to express spring
> stiffness (Hooke's law F=kx, where k is [N/m] and x is [m]). But because Fred
> Bertsch has just now posted an email that "review ends" I decided not to add
> a new unit N_div_m, but rather I used Pa multiplied during calculations by a
> 1 meter constant (so maybe adding new units should be made simpler). During
> the process of adapting yade to use pqs I was impressed overall by how easy
> it was. As an indicator: the first time my modified code compiled without
> errors, it ran correctly. I was really surprised, and this shows the power
> of units.

That is good to hear.
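
(And if I follow the workaround correctly, it is dimensionally sound anyway:
Pa * m = (N/m^2) * m = N/m, which is exactly the unit of a spring stiffness k
in F = k*x. A named unit for it (N_div_m, or rather N_per_m) would just have
made it more convenient.)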

> Especially when I recall how much time I spent more than a year ago debugging
> this code! This gives me a clear picture of how much time can be saved with
> the pqs library. How often has your code, dear reader, run correctly after
> the first compile? ;)

If it could be achieved with no drop in performance then it will be near
perfect!

> The first thing I did after successfully compiling was to perform a unit test
> simulation and compare the simulation state. I did this by running the
> simulation for 1000 iterations (all the spheres dropped down to the flat
> plane below and formed a pile), and saving the spheres' coordinates and
> orientations every 50th iteration. I used type double for the whole
> calculation, and boost::lexical_cast<std::string>(double) to save coordinates
> with 16 decimal places.
>
> So I had 20 files from each run, which I could compare between the pqs and
> non-pqs versions. Comparing them showed that at the 100th iteration about 20
> spheres (out of 1000) had coordinates differing in the last decimal place. At
> the 150th iteration all the spheres' coordinates differed on average in the
> last 2 decimal places. At the 200th it was the last 3 decimal places. The
> difference was growing over time. At the 500th it was the last 8 decimal
> places. Due to the chaotic nature of this simulation, every error will
> accumulate and grow. So I can conclude that in the simulation loop with pqs
> there appeared a small chance of having a different last digit than expected.
> And this error accumulated over time. I'm not sure if this error is caused by
> pqs, or maybe by the extra assignments and conversions. Personally I think
> that it's because the order in which the components of the equation were
> added and multiplied was changed.

OK. I don't know what is causing this, except that pqs changes division and
multiplication in some cases. I also use the Boost Numeric converter. I guess I
will need to look into the actual code to determine this. I had better admit
that the implementation is not very pretty and should be totally reworked.
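
Your last guess sounds plausible to me. Just to show how little it takes (this
has nothing to do with pqs itself), merely re-associating an addition changes
the last digits, and a chaotic simulation will then amplify that:

#include <cstdio>

int main()
{
    double left  = (0.1 + 0.2) + 0.3;   // evaluated one way...
    double right = 0.1 + (0.2 + 0.3);   // ...and re-associated

    std::printf("%.17g\n%.17g\n", left, right);
    // with IEEE 754 doubles this typically prints
    //   0.60000000000000009
    //   0.59999999999999998
}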

> After performing the above checks (a kind of unit test) I did a speed test to
> benchmark pqs. The results are not in favor of pqs, but I should note that
> they are skewed due to extra constructor calls in the pqs version. Those
> constructor calls would not be there if I converted the whole of yade's data
> structures to pqs. So perhaps I should not make those results public? Since I
> have no time now to consider whether to make them public or not, here they
> are:
>
> without pqs (three runs):
> Computation time: 370.939 seconds
> Computation time: 382.615 seconds
> Computation time: 360.418 seconds
>
> with pqs - and extra constructor calls, which would be avoided if I
> converted the whole data structure! (three runs):
> Computation time: 415.724 seconds
> Computation time: 423.181 seconds
> Computation time: 421.799 seconds
>
> that makes a difference of roughly 11% - 13%

OK. Bear in mind that I have spent no time optimising PQS for performance.
To get both dimensional analysis checking and the best performance, it may be
necessary to write code against typedefs, so that quantity types can be swapped
in to do the dimensional analysis and then swapped out in favour of plain
floating point types in the finished code. I have written some code this way.
Within certain rules (no operations that scale a value between units, for
example, which could be set up to flag an error) it should be the most
effective method.

All these usages are possible, including the default style, the SI_units style
preferred by Jesper Schmidt, and the style described above... using another
'view'/frontend (or call it what you will) on the underlying type.
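
Very roughly, and with made-up names rather than anything from pqs, the
typedef-swapping style I mean looks like this: compile with -DCHECKED_UNITS
while developing to get the dimensional analysis, and without it to get plain
doubles in the finished build.

#include <cstdio>

#ifdef CHECKED_UNITS
// minimal stand-in checked types (not pqs): a value tagged with metre/newton
// exponents, so that k * x yields a force or fails to compile
template <int M, int N> struct q { double v; };

template <int M1, int N1, int M2, int N2>
q<M1 + M2, N1 + N2> operator*(q<M1, N1> a, q<M2, N2> b)
{
    q<M1 + M2, N1 + N2> r = { a.v * b.v };
    return r;
}

typedef q<1, 0>  length_type;      // metres
typedef q<0, 1>  force_type;       // newtons
typedef q<-1, 1> stiffness_type;   // newtons per metre
inline double raw(force_type f) { return f.v; }
#else
// finished build: the same simulation code compiles against plain doubles
typedef double length_type;
typedef double force_type;
typedef double stiffness_type;
inline double raw(force_type f) { return f; }
#endif

// the simulation code itself is written once, against the typedefs
inline force_type spring_force(stiffness_type k, length_type x)
{
    return k * x;   // Hooke's law; dimensions checked in the checked build only
}

int main()
{
    stiffness_type k = { 1000.0 };   // 1000 N/m
    length_type    x = { 0.01 };     // 1 cm
    std::printf("%g N\n", raw(spring_force(k, x)));
}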

> Previously when working with this code I've seen a speed difference of ~2%
> just because of one extra constructor call in this loop. While I performed
> this benchmark with good intentions I still cannot say whether pqs is slower
> or not. This question will be answered when I fully convert yade to pqs. It's
> possible that I will discover something new with the kcachegrind profiler -
> but due to the lack of time (review deadline) I cannot post profiler results
> now. If anybody wants to see the code I used, it's here:
> http://tinyurl.com/r83tr , while the original code is here:
> http://tinyurl.com/ouhbc . A quick glance at the code shows 21 extra
> assignments of double, and 9 extra constructor calls.
>
> Also I should note here that during my work with yade, every time I made a
> design decision which favored clean design over calculation speed I was later
> rewarded. The reward was that the clean design allowed unexpected
> optimizations in different pieces of code, which in the end resulted in
> faster calculation speed. I have reasons to believe that this will also be
> the case with pqs.

Yes. High performance hasn't had much work done on it. I suspect that some of my
attempts at code enhancement might have caused slowdowns. I should test
different implementations, which I haven't done.

> During this adaptation of yade to pqs I encountered only one small obstacle:
> operator *= for vect was missing. I had to write { vect1 = vect1 *
> some_scalar; };
>
> Also vect has a function magnitude(); please add squared_magnitude(). I tried
> to add it myself, but got lost in defining a squared return type.

OK. vect is quite new and unfinished; it needs a lot of work and tests, of course.
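
The missing operator looks easy enough to add. A sketch (again with a made-up
vec3 rather than the real vect): *= has to take a plain dimensionless scalar,
because the result must stay the same type as the left-hand side.

template <class T>
struct vec3
{
    T x, y, z;

    // scaling by a dimensionless factor keeps the unit, so the
    // compound assignment is well defined
    vec3& operator*=(double s)
    {
        x *= s; y *= s; z *= s;
        return *this;
    }
};

int main()
{
    vec3<double> v = {1.0, 2.0, 3.0};
    v *= 2.0;          // instead of v = v * some_scalar
    return 0;
}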

> ---------------------------------------------------
> * How much effort did you put into your evaluation? A glance? A quick
> reading? In-depth study?
>
> As much as a human can, given just 48 hours. Sounds a bit funny - that's
> because earlier I was not aware that any boost mailing list member can
> perform a review. So maybe it even qualifies as an in-depth study, but I'm
> not sure.

Maybe. It's a very impressive review all round.

> ---------------------------------------------------
> * Are you knowledgeable about the problem domain?
>
> Instead of claiming good (or bad) knowledge I should rather describe my
> background, which should let you decide whether I'm knowledgeable:
>
> Physics : I'm working at a technical university, doing a PhD which focuses
> on modelling the behavior of concrete. I'm familiar with the Finite Element
> Method, with the Discrete approach, and with various material models
> (viscoelastic, hypoplastic, cohesive, etc). I've had experience with
> modelling granular materials and cloth. Currently I'm developing (as
> mentioned earlier) a framework for performing physical simulations -
> http://yade.berlios.de/ . I have 16 English-language publications (9 of them
> are just conference materials, 6 of them are in scientific journals, and 1 is
> in the renowned scientific journal "Granular Matter"). No more details, as I
> feel now like I'm boasting ;)
>
> C++ : I have over 10 years' experience with programming in C++ (mostly
> guided by my brother Bronek, who is more active on this mailing list ;). As
> my achievements in this field I'd like to list the serialization library in
> yade. After thorough testing we decided that boost::serialization does not
> qualify for our needs - we needed not-on-the-fly serialization (an important
> design decision), and better support for various formats - xml easier to read
> by a human, binary, text, etc. We even developed a backend which allows
> serializing classes (their registered components) into a QT dialog window
> (well, currently unfinished, as containers are left out, but soon this will
> be fixed).
> Another would be the multimethods library in yade; I made it as the "dreaded
> exercise left for the reader" - the words Andrei Alexandrescu used at the end
> of his chapter about multimethods in "Modern C++ Design". In this
> implementation recursive templates allow a theoretically unlimited number of
> dimensions (at this moment only 3 are used), and up to 15 arguments in a
> multimethod call. This library focuses on speed, so a callback matrix is
> used. By looking at this source code (in the subversion repository of yade
> http://tinyurl.com/hxbr7 ) you can judge my familiarity with C++. (I'm more
> proud of the multimethods in terms of template programming, so look there
> first; serialization is still waiting for some cleaning ;)
>
>
> ---------------------------------------------------
> * What is your evaluation of the potential usefulness of the library?
>
> For people working in science and engineering this library is extremely
> useful. There is a bad need for such a library in the scientific community.
> I had a chance to see for myself how pqs reduces the chance of making a
> stupid mistake in calculations. Once my code compiled without errors it was
> correct! If I recall now how much time I spent debugging this code more than
> a year ago, I see that pqs can save huge amounts of time.

I am glad to hear it. Now all that's needed is the original performance.

> ---------------------------------------------------
> * Other random comment:
>
> I feel there is a need for another discussion about what to call this library
> in boost. Most people do not like "pqs"; some people (including me) prefer
> the name "units", but others have strong and reasonable arguments against the
> name "units" (but I'm not sure if anything else was proposed).

It might be good to see what emerges out of the review. Maybe an enhanced units
library. If so, pqs might keep its name for reference/comparison with the new
library.

Thanks for the very comprehensive review.

Oh and the vote too!

regards
Andy Little

