Boost logo

Ublas :

Subject: Re: [ublas] Patch/proposal for overloading operator * in ublas
From: Jesse Perla (jesseperla_at_[hidden])
Date: 2009-08-18 21:58:03


On Tue, Aug 18, 2009 at 11:12 AM, Rutger ter Borg <rutger_at_[hidden]>wrote:

> Karl Meerbergen wrote:
>
> after all these discussions... what about if I would start stopping
> reinventing the wheel, and use GLAS as a high-level "engine" for the
> bindings? I see BLAS/LAPACK support is there but not as complete as it
> could
> be. Could you bring me up to speed w.r.t. the expression engine of GLAS?
>

(The following is a plea to extend the existing ublas. I am trying to focus
on economics research and keep finding myself implementing basics... I
imagine many people doing scientific computing are in the same boat).

GLAS would be great, and is a possible front-end to UBLAS in the future.
But it isn't deployed or in boost yet.

The discussion on boost developer and here make it pretty clear that an
immediate, temporary fix would be extremely useful (and hopefully help focus
the efforts of other people on extending instead of replacing). I really
think that a little bit of effort on Matlab'ing UBLAS could go a long way.
Then the longer term approach can be evaluated.

As much as I love the bindings, some of these might be of higher importance
to pound through. It seems that the ublas interface hasn't changed much
since 2002ish, so it probably is time.

I have spent a lot of time using ublas in the last year and trying to get
Matlab programmers to use it. From my experience, the following is a list
of the things that would really help people out in order of importance (I
have tried to remove all features dependent on external libraries).

*1. Operator **

   - Operator * semantics relatively compatible with MATLAB. Including
   inner/outer product and nested products.

*2. Data Adaptors for C/Fortran*

   - There is code floating around for adaptors, but we really need to have
   a formal way of adapting a C or Fortran vectors for ublas operations
   (read/write).
   - Also, we need to have a better way to be able to pass the underlying
   UBLAS data to C or Fortran read/write routines. I know all of the
   difficulties with this, but it is essential for practical use. There is
   just too much Fortran code out there. I keep passing in:
   mymat.data().begin() to just get a double* and hope for the best. I know
   other people are in this boat.
   - For data updating operations, it is perfectly reasonable to force
   people to create an adaptor to get a pointer to the underlying data (which
   may copy the data to a temporary if the storage ordering isn't what people
   thought it would be).
   - I would guess that these exact same problems keep creeping up in the
   bindings, especially for the higher level ones. Maybe the code to do this
   is already built somewhere.

*3. Some temporary solution to the assignment of fixed matrices*.

   - This has been a huge stumbling block for people to use the library...
   much more than I would have guessed. Perhaps something like boost::assign
   could be used until C++0X uniform initialization
   - We should assume that everything is added in row major (and that the
   assignment would take care of the proper storage orderings)

using namespace boost::assign;
matrix<double> A(2,2);
A += 1, 2,
        2, 1;

   - I believe that the += and , overloads for this could work as the ,
   might build a vector, then the += could do assignment to the underlying
   data. It might be reasonable to only allow the assignment for certain types
   of matrices and all of them preallocated. I assume this is possible since
   Blitz does it.

*4. Matrix/Vector Stacking
*

   - One of the things matlab is very good at is stacking matrices and
   vectors.
   - e.g. Y = [A B]; or Y = [A; B]; in matlab to stack horizontally or
   vertically.
   - This pattern comes out all over the place in algorithms, and would make
   an enormous difference in legibility of code if it were implemented
   efficiently.
   - I imagine that the results would be a new matrix and vector expression
   that keeps track of stacking. It would need to implement the () and [] to
   go to the appropriate one it is storing... but with the expression
   templates, I can see this being very efficient.
   - What should the semantics be? It could overload an operator like , |
   and && but probably better to just have functions.

e.g.
matrix<double> A, B;
vector<double> a, b;

matrix<double> C = stack_vertical(A, B);
auto Cp = stack_vertical(A, B); //This is lazy. The Cp expression has done
no copying.
matrix<double> C = stack_horizontal(a, b);

//or with overloaded operators, something like:
matrix<double C = A & B; //Horizontal
matrix<double C = A & B |
                            B & A; //Stacks like [A B; B A] in matlab.

//or could have some kinds of tags to denote newlines:
matrix<double> C = stack(A, B);
auto Cp = stack(A, matrix::newline, B); //This is lazy. The Cp expression
has done no copying.
matrix<double> C= stack(A, matrix::newline, B); //This is lazy. The Cp
expression has done no copying.

*Interoperability with boost::multi_array*

   - I haven't heard many people talk about this, but it is crucial for a
   huge number of applications.
   - Matlab allows you to get 1 or 2D slices from their multidimensional
   array and then use normal matrix operations on it.
   - We need the ability to adapt boost::multi_array slices/references/etc.
   as ublas matrix and vector types with read/write operations (It is critical
   that this allows modifiable references).
   - This shouldn't be too hard since the multi_array manages all of the
   strides/etc. and has very well thought through iterators.
   - It would be amazing if there was a cast from the multi_array<double, 2>
   reference type to whatever matrix_expression of doubles we build, and
   multi_array<double, 1> to vector_expression.
   - This would be a HUGE value for a lot of work and showcase
   interoperability of boost, and suggest to people they shouldn't try to
   reinvent the wheel here
   - What would I like to be able to do:
   -

boost::multi_array<double, 3> covariances(boost::extents[2[[2][10]); //10
covariance matrices over time.

//Fill in these covariances over time...
boost::matrix<double> A, B;
....
covariances[boost::indices[range(0,2)][range(0,2)][0] ] = A * B; //Matrix
multiplication and assign to 0'th time period...

//And
boost::matrix<double> C;
cout << 2 * C * covariances[1]; //This gets the 1'th slice along the
multi_array....

*6. A couple of simple linear algebra operations*

   - While in general it is a bindings issue, many people want a couple of
   very basic linear algebra operations out of the box. It is perfectly
   reasonable to provide a few of them with decent semantics, even if they
   aren't in BLAS>
   - For the most part, I believe that there are existing ublas only
   implementations of these. In particular:
      - inverse A = inv(B). Various implementations are out there using the
      LU decomposition, etc. in UBLAS
      - determinant. existing using lu
      - cholesky().
      - kron
      - trace
      - mldivide (see the Kian Ming implementation)
      - A simple solve() using the LU decomposition.
      - I have collected a bunch of these, but Kian Ming's are probably much
      better.

*7. Move Semantics*

   - To enable matlab like access, we would really like to be able to write
   our own functions and do things like:

matrix<double> A_inv = inv(A);

   - For some inv() function. Of course, this is only efficient if done
   with rvalue references and move semantics, but my understanding is that
   boost::move will work with older compilers if we enabled some of the ublas
   constructors to be move aware. Am I wrong? If not, this could be extremely
   useful and enable much cleaner semantics on

It appears that Kian Ming has already done much of this work. Are there
people who are competent enough to merge these sorts of changes into a
future ublas version?