Boost logo

Boost :

From: Michael Stevens (Michael.Stevens_at_[hidden])
Date: 2004-07-05 11:09:05


On Saturday 03 July 2004 16:54,David Abrahams <dave_at_[hidden]>

> Jeremy and I have just completed a re-evaluation of uBlas based on
> what's in uBlas' own CVS repository, having not discovered that until
> recently (you should have that info on uBlas' Boost page!)  We have
> some major complaints with uBlas' design.  The list is long, and the
> issues run deep enough that we don't believe that uBlas is a suitable
> foundation for the work we want to do.

This is very interesting. Thanks for writing this up and posting. I can't
address all the issues. However those which I can I will endeavor to give you
and the Boost community my feedback on.

First of all I am currently making sure Boost is updated with the latest uBLAS
development version. This already has updated documentation which points to
all the correct locations for discussions, CVS and Wiki. This should make it
much easier for users to find the correct locations if the stable version in
Boost does not meet their needs.

To facilitate the development of solutions I will place many of the points
into the uBLAS bug tracker on SF.

> Interface Design
> ----------------
>
> * Not grounded in Generic Programming.  The concept taxonomies, to the
>   extent they exist, are weak, and poorly/incorrectly documented.
>   Aspects of the design that should be generic are not (e.g. only
>   certain storage containers are supported, rather than supplying
>   storage concept requirements).  No linear algebra concepts (vector
>   space, field, etc.)  The library is so out-of-conformance with our
>   expectations for Generic Programming that this one item by itself
>   is probably enough to make it unsuitable for us.

This is a big issue. I believe we need to split the documentation issues from
the design issues. Documentation is high on the priority list of many uBLAS
users and developers. Most new users start with a requirement for a C++
matrix and BLAS library with nice syntax, efficiency and documentation! The
need for clearly defined and documented concepts comes for most people a
little latter when extensibility becomes an issue.

The storage concept is not as bad as it appears. The concept although not
documented is sufficiently well defined that creating a new model is
unproblematic. At present the concepts are documented by their models :-)

> No linear algebra concepts (vector space, field, etc.)
I not sure if these concepts really need to be defined. They are already well
described in many mathematical text books and in Wikipedia. Almost all the
major reference works on BLAS and Matrices simply state elements are either
Real or Complex and are therefore implicitly a field.
Of course a C++ linear algebra library should be generic enough to work with
any type which is a numeric approximation to a field.

>
> * Redundant specification of element type in matrix/vector storage.
Not sure what is meant here, can you elaborate ?

> * size1 and size2 should be named num_rows and num_columns or
>   something memnonic
>
> * iterator1 and iterator2 should be named column_iterator and
>   row_iterator or something memnonic.
I think many uBLAS users have come to appreciate the brevity of naming. I
think the concensus in the Boost review was as long as they were used
consistenty these names are acceptable.
 
> * prod should be named operator*; this is a linear algebra library
>   after all.
Oh, there has been much ink spent on this! Joerg argues against operator* for
two reasons I think. a) operator* brings an expectation of commutativity and
this is not the case in linear algebra
b) The are many products (inner,outer,element) and there should be a clarity
of naming.
 
> * begin() and end() should never violate O(1) complexity expectations.
>
> * insert(i,x) and erase(i) names used inconsistently with standard library.
Agreed.
 
> * Matrix/Vector concept/class interfaces are way too "fat" and need to
>   be minimized (e.g. rbegin/rend *member* functions should be
>   eliminated).
Interesting. Certainly the fatness is a worry. I think rbegin/rend need to
stay as members as this is part of the Reversible Container concept.
 
> * The slice interface is wrong; stride should come last and be
>   optional; 2nd argument should be end and not size; then a separate
>   range interface could be eliminated.
For std::slice (valarray) compatibility stride should be last. Not sure why
this was not picked up on a long time ago!!
valarray (and uBLAS) use size (not end) for good reason however.
I also think it would be a mistake to drop range. slices can be degenerate
(zero stride) and this has a runtime cost which range does not need to share.

> * No support for unorderd sparse formats -- it can't be made to fit
>   into the uBlas framework.
Interesting. I guess this requires some thinking time for a sensible
discussion.

 
> Implementation
> --------------
>
> * Expressions that require temporaries are not supported by uBLAS
>   under release mode. They are supported under debug mode.  For
>   example, the following program compiles under debug mode, but not
>   under release mode.
>
>       std::cout << prod (prod(v, m), m) << std::endl;
>   }
>
>   The workaround to make it compile under release mode is to
>   explicitly insert the creation of a temporary:
>
>       std::cout << prod (vector<double>(prod(v, m)), m) << std::endl;
>
>   There should be no such surprises when moving from debug to
>   release.  Debug mode should use expression templates, too, as the
>   differences can cause other surprises.

This was changed by Joerg after the last stable release to Boost. I will check
there has been no regression.
The fix propagates an expression complexity that is statically checked. This
check explictly prevents the creation of nested products!! Therefore the
correct code is now that with the explicit temporary or with one of the
parameterized prod's.
The change is aimed at the user. They must clearly express the evaluation and
storage complexity of the expressions.

The change of semantics between debug/release builds is unfortunate. The
partial disabling of the expression templates in debug mode is currently the
default and a config macro. With current compiler technology this prevents
debuggers crashing and users scrolling through type names that are pages
long. In a perfect environment the default would be the opposite.

> * Should use iterator_adaptor. There is a ton of boilerplate iterator
>   code in the uBLAS that needs to be deleted.
Agreed 100%. A well used iterator_adaptor would shorten uBLAS definitions to
about 30% of their current size !!

> * Should use enable_if instead of CRTP to implement operators. uBLAS
>   avoids the ambiguity problem by only using operator* for
>   vector-scalar, matrix-scalar ops, but that's only a partial
>   solution.  Its expressions can't interact with objects from other
>   libraries (e.g. multi-array) because they require the intrusive CRTP
>   base class.
Interesting solution. Agreed, the current implementation is closed to other
objects because of the intrusive base class. Sadly uBLAS was designed before
enable_if. Any pointers to compile/runtime effects of the enable_if solution?

We would however lose the static type checking that argument matching with the
CRTP base class gives us. At this point I think we land in a deep
philosophical argument as to the strengths and weaknesses of generic
progreamming %-)

> * Certain operations, especially on sparse matrices and vectors, and
>   when dealing with matrix_row and matrix_column proxies have the
>   wrong complexity.  Functions such as begin() and end() are suppose
>   to be constant time. I see calls to find1 and find2, which look like
>   expensive functions (they each contain a loop).
Agreed. The implementation with findN is simple but expensive. I am not sure
why Joerg choose this route.

> Testing
> -------
> * There should be a readme describing the organization of the tests.
>
> * Tests should not print stuff that must be manually inspected for
>   correctness.
>
> * Test programs should instead either complete successfully
>   (with exit code 0) or not (and print why it failed).
Agreed. If anyone submits a patch it would be included very quickly!

> * Abstraction penalty tests need to compare the library with tuned
>   fortran BLAS and ATLAS, not naive 'C' implementation.
There was some very good work on this about a year ago. Sadly the results are
no longer accessable on the web. Since the loss to Intel of the KAI compiler
it is hard to see any library archiving really exceptional results in this
regard.

> Documentation
> -------------
>
> * In really bad shape.  Redundant boilerplate tables make the head
>   spin rather than providing useful information.
>
> * Needs to be user-centered, not implementation centered.
>
> * Need a good set of tutorials.
Agreed.

> Compilers
> ---------
>
> * Simple example doesn't compile with gcc 3.3. Got it to compile by
>   #if'ing out operator value_type() in vector_expression.hpp.
Strange gcc-3.3.3 is my primary compiler. Can you specify the problem more
exactly. Which gcc-3.3 subversion is causing the problem?

Thanks,
        Michael
___________________________________
Michael Stevens Systems Engineering

Navigation Systems, Estimation and
                 Bayesian Filtering
    http://bayesclasses.sf.net
___________________________________


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk