ublas: questions from a newcomer

I am interested in ublas for linear algebra. In addition to the simple operations supported by ublas I will need comprehensive access to lapack for the real work. I see in the wiki that some work has been done in this area. Is this numeric/binding stuff working, and can it be readily extended to the full set of lapack routines? Some kind of meta program to generate lapack interfaces from a parameter spec would be nice (ruby/perl? or just a description of what needs to be done) Then I could happily bring up the interfaces I need. Is there any way to generate vectore and matrices which do not own the data buffer which they provide access to - 'view' or 'sub' objects in other lin alg packages? This would be very valuable to me for interfacing existing code, without duplicating storage and doing unnecessary copies. I found it difficult to determine from the current documentation. I have one worry: Ublas, and most of the boost modules, seems to me not for the faint hearted. They are clearly powerful, but seem to expect a lot from a potential user. Are these facilities aimed at library writers rather than users whoes skill focus may lie in other domains? I am thinking of apl (and related J++ (not java)) which is very powerful, but has only minority appeal because it is too alien and unnatural for ordinary mortals. The advocates take great pride in how powerful a small and almost incomprehensible line of code can be (FFT in one short line etc.), and implicitly how clever they are to generate and understand it. I expect that a user level doc may eventually make ublas/lapack much more approachable by a non computer science mathematician or engineer. Forgive me if the above sounds critical and ungracious for free software, but I want to use boost and I am worried that it may be too demanding for its benefites to be justified for use by the group of people I work with. -- Stuart

----- Original Message ----- From: stuart macgregor To: boost-users@yahoogroups.com Sent: Friday, March 21, 2003 2:26 PM Subject: [Boost-Users] ublas: questions from a newcomer
I am interested in ublas for linear algebra. In addition to the simple operations supported by ublas I will need comprehensive access to lapack for the real work.
I see in the wiki that some work has been done in this area. Is this numeric/binding stuff working,
I believe, Toon and Kresimir have working code using the bindings.
and can it be readily extended to the full set of lapack routines?
Are you kidding? One can easily call the original LAPACK routines using the ublas accessors to internal data structures. But then one has to cope with the fortran signatures and there would be some 'impedance mismatch'. But I believe we're exactly talking about wrapping this (for some 1200 functions IIRC)?
Some kind of meta program to generate lapack interfaces from a parameter spec would be nice (ruby/perl? or just a description of what needs to be done) Then I could happily bring up the interfaces I need.
Interesting idea. I'll have to think about it.
Is there any way to generate vectore and matrices which do not own the data buffer which they provide access to - 'view' or 'sub' objects in other lin alg packages? This would be very valuable to me for interfacing existing code, without duplicating storage and doing unnecessary copies. I found it difficult to determine from the current documentation.
ublas has an undocumented (reference counted) storage container array_adaptor<> for this.purpose. It is a bit dangerous as it changes the usual assignment semantics in certain cases. I needed it to adapt ublas for use in CLAPACK.
I have one worry: Ublas, and most of the boost modules, seems to me not for the faint hearted. They are clearly powerful, but seem to expect a lot from a potential user.
Pretty unsure. But compiler diagnostics clearly are a pain sometimes.
Are these facilities aimed at library writers rather than users whoes skill focus may lie in other domains?
I believe they are made for C++ programmers which like to use the standard library ;-)
I am thinking of apl (and related J++ (not java)) which is very powerful, but has only minority appeal because it is too alien and unnatural for ordinary mortals. The advocates take great pride in how powerful a small and almost incomprehensible line of code can be (FFT in one short line etc.), and implicitly how clever they are to generate and understand it.
I expect that a user level doc may eventually make ublas/lapack much more approachable by a non computer science mathematician or engineer.
Definitely.
Forgive me if the above sounds critical and ungracious for free software, but I want to use boost and I am worried that it may be too demanding for its benefites to be justified for use by the group of people I work with.
Sorry, can't help here. Best, Joerg

stuart macgregor wrote:
I am interested in ublas for linear algebra. In addition to the simple operations supported by ublas I will need comprehensive access to lapack for the real work.
I see in the wiki that some work has been done in this area. Is this numeric/binding stuff working,
Atlas bindings work for me in the `real' code (but I am biased ;o) I am also using SuperLU bindings for sparse matrices, but they are not in boost-sandbox yet.
and can it be readily extended to the full set of lapack routines?
Some time ago Toon wrote in boost-ml: ``Note that the bindings are not complete but the 'framework' is there and it's just a matter of time to complete them.'' Bindings are mostly a free-time activity. Or, one can say that we try to generalize some more or less ad-hoc parts of our `regular' work. So `matter of time' probably means `some (considerable) time' :o(
Some kind of meta program to generate lapack interfaces from a parameter spec would be nice (ruby/perl?
Interesting idea.
or just a description of what needs to be done) Then I could happily bring up the interfaces I need.
As I said, bindings are free-time activity, so any help is welcomed. Regarding (c)lapack part of atlas bindings, there are interfaces for all (interesting) functions which atlas provides (gesv, getrf, getrs, getri, posv, potrf, potrs, potri; only lauum and trtri are missing), see http://math-atlas.sourceforge.net/psdoc/lapackqref.ps.gz Lapack bindings are Toon's realm, but I hope that my understanding is good enough to (try to) explain what needs to be done. As an example, I'll use function `getrf' (which is already there). As you know, there are four versions of each lapack function -- float, double, complex, complex double. AFAICS Toon decided to write bindings only for double and complex double versions, but other two can be easily added following the same pattern. I'll use only dgetrf here. Include files are in boost/numeric/bindings/lapack and the `traits.cpp' file is in libs/numeric/bindings/lapack (1) `lapack.h' contains declarations of lapack functions: As there are several conventions for calling Fortran from C/C++ (some compilers require trailing underscore in function names, some don't, see excelent page http://www.aei.mpg.de/~jthorn/c2f.html and, in particular, section `Name mangling'), first a macro LAPACK_DGETRF is defined, depending on naming convention: ============================================= #if defined (BIND_FORTRAN_LOWERCASE_UNDERSCORE) #define LAPACK_DGETRF dgetrf_ #elif defined(BIND_FORTRAN_LOWERCASE) #define LAPACK_DGETRF dgetrf #endif ============================================= (macros BIND_FORTRAN_... are defined in boost/numeric/bindings/traits/fortran.h). Now lapack function can be declared: ============================================= void LAPACK_DGETRF(const int* m, const int* n, double* a, const int* lda, int* ipiv, int* info) ; ============================================= All parameters are pointers. (2) `traits.hpp' contains class template `traits'. For each lapack there is a typedef of a function pointer, which corresponds to the function declaration in `lapack.h': ============================================= typedef void (*getrf_type)(const int* m, const int* n, bind_type* a, const int* lda, int* ipiv, int* info) ; ============================================= but instead of `double*' there is a `bind_type*', which depends on the template parameter `T' of the class `traits'. `bind_type' is defined using `value_traits' class (which is defined and specialized in boost/numeric/bindings/traits/value_traits.hpp). Class `traits' also contains a (static) function pointer: ============================================= static getrf_type getrf ; ============================================= (3) in `traits.cpp' the function pointer is initialized with (the address of) the lapack function: ============================================= template<> traits<double>::getrf_type traits<double>::getrf = LAPACK_DGETRF ; ============================================= `traits.cpp' must be separately compiled and linked with your program. But, in fact, here we have a problem -- see comment [4] below. (4) `lapack.hpp' contains the C++ interfaces (or bindings). All bindings (blas, lapack, atlas) are written in terms of traits classes which are interfaces between these functions and vector/matrix classes. Namely, idea is to make bindings independent of the concrete vector/matrix libraries as much as possible. Lapack functions require the starting address of the matrix, number of rows and columns and leading dimension. So, matrix_traits class, that is, its specializations provide this functions and some typedefs. For example, size1() (i.e. number of rows) in matrix_traits specialization for ublas::matrix traits is static int size1 (matrix_type& m) { return m.size1(); } and in specialization for Array2D from Roldan Pozo's TNT static int size1 (matrix_type& m) { return m.dim1(); } If you use ublas, you don't need to worry about matrix_traits, because they are already written -- you just have to include boost/numeric/bindings/traits/ublas_matrix.hpp for general matrices and matrix proxies and boost/numeric/bindings/traits/ublas_symmetric.hpp for symmetric matrices and symmetric adaptors. (Traits classes for Roldan Pozo's TNT are also there, but I didn't test them thoroughly.) Note that bindings don't know and need not know anything about traits specializations -- boost/numeric/bindings/traits/ublas_matrix.hpp etc. must be included in your program, not in bindings headers. To simplify the use of traits classes, Toon recently introduced some free accessor function, so instead of writting matrix_traits<matrix_type>::size1 (m) you can write simply matrix_size1 (m) Now, let's see how all this works in the interface for getrf (explanations are given as comments; this is Toon's code and I'll also comment some implementation details -- longer comments, not really important for functionality, are given after the code): ============================================== template < typename matrix_type, typename IpivIterator > int getrf (matrix_type& a, IpivIterator begin_ipiv, IpivIterator end_ipiv) // instead of 6 parameters of lapack's (sdcz)getrf, we have only // 3 parameters: matrix to factor and iterator range for pivot vector // (but see also comment [1] below) { using namespace boost::numeric::bindings::traits ; // why? all names below are fully namespace qualified typedef typename matrix_type::value_type value_type ; // see comment [2] below typedef typename boost::numeric::bindings::traits::value_traits< value_type
::value_type bind_type ; // as in `traits.hpp' in (2) above
// int m = boost::numeric::bindings::traits::size1( a ) ; // int n = boost::numeric::bindings::traits::size2( a ) ; // number of matrix rows and columns; // free accessor functions are used, but in new version they are // matrix_size1() and matrix_size2(), so this will not compile :o( // correct calls are: int m = boost::numeric::bindings::traits::matrix_size1( a ) ; int n = boost::numeric::bindings::traits::matrix_size2( a ) ; value_type *a_ptr = boost::numeric::bindings::traits::matrix_storage( a ) ; const int lda = boost::numeric::bindings::traits::leading_dimension( a ) ; // starting address of matrix storage and its leading dimension assert( std::distance( begin_ipiv, end_ipiv ) >= std::min( m, n ) ); // pivot vector must be large enough int info = 0; // return value; in short: is matrix singular? int& ipiv_ptr = *begin_ipiv ; // reference to first element of the pivot vector, // &ipiv_ptr is then its starting address; // but see also comment [3] traits<value_type>::getrf( &m, &n, (bind_type*)a_ptr, &lda, &ipiv_ptr, &info ); // finally, call lapack's function return info; } ============================================== Comments: [1] I think that this interface [i.e. func (matrix&, begin_iter, end_iter)] is mixing of two idioms -- why STL's idiom of iterator ranges [begin, end) here, and not for other vectors in e.g. blas 1&2 bindings? Interface of getrf in atlas bindings (atlas/clapack.hpp) is simply: int getrf (MatrA& a, IVec& ipiv); [2] It is true that both ublas::matrix<> and TNT::Array2D<> define value_type, but (in ideal world ;o) bindings should work for other matrix types which can have val_t or something else. And that's why we have matrix_traits; so, instead of typedef typename matrix_type::value_type value_type ; it should be typedef typename matrix_traits<matrix_type>::value_type value_type; [3] See the interface of getrf in atlas bindings (atlas/clapack.hpp) for IMHO more elegant way to do this (yes, I'm probably biased ;o) [4] The biggest problem with blas & lapack bindings is that although they work with g++ (at least with 3.2, I didn't try later versions), they cannot be compiled with Comeau C++ 4.3 --- they are not standard conforming. At first sight, class `traits<>', defined in `lapack/traits.hpp' (and, similarly, class `blas<>' in `blas/blaspp.hpp') is very elegant solution for the function overloading -- one class for all four versions of blas/lapack functions (float, double, complex, complex double). But function pointers defined in `traits<>' and `blas<>' have "C++" linkage and blas/lapack functions have (extern) "C" linkage; therefore, this may, but need not, work: template<> traits<double>::getrf_type traits<double>::getrf = LAPACK_DGETRF ; Here is a quote from the recent post by Greg Comeau in comp.lang.c++.moderated (`Re: use of extern "C"'): =============================================== One reason why this has probably been acceptable in the code you've written is that your compilers are allowing implicit conversions between extern "C" and extern "C++" entities, actually pointers to some such entities, since often the same internal/underlying conventions (calling, representations, etc) are used for each, but from a strict standpoint, the standard does not support that. =============================================== Steve Clamage gave an example in the same thread: =============================================== The extern "C" linkage declaration says that C linkage rules are in effect. If C and C++ functions have different calling sequences, the code generation will be different depending on the declared linkage. The same goes for pointer-to function, since the linkage is part of the pointer type. For example: int f(int); // C++ linkage extern "C" int g( int(*)(int) ); ... g(f); // error Function g is declared to take a pointer to a C-linkage function, and passing a pointer to a C++-linkage function is not valid. Not all compilers complain about the error, however. =============================================== Our example is inverted -- traits<double>::getrf_type is a pointer to a C++-linkage function and LAPACK_DGETRF is (a pointer) to a C-linkage function -- but the same rule obviously applies. I am afraid that blas and lapack bindings need complete rewriting as in atlas bindings which use less elegant and more conservative (but also standard conforming) approach -- simple function overloading: =============================================== int gesv (CBLAS_ORDER const Order, int const N, int const NRHS, float* A, int const lda, int* ipiv, float* B, int const ldb) { return clapack_sgesv (Order, N, NRHS, A, lda, ipiv, B, ldb); } int gesv (CBLAS_ORDER const Order, int const N, int const NRHS, double* A, int const lda, int* ipiv, double* B, int const ldb) { return clapack_dgesv (Order, N, NRHS, A, lda, ipiv, B, ldb); } ===============================================
Is there any way to generate vectore and matrices which do not own the data buffer which they provide access to - 'view' or 'sub' objects in other lin alg packages? This would be very valuable to me for interfacing existing code, without duplicating storage and doing unnecessary copies. I found it difficult to determine from the current documentation.
Joerg answered this one ...
I have one worry: Ublas, and most of the boost modules, seems to me not for the faint hearted. They are clearly powerful, but seem to expect a lot from a potential user.
Well, some boost libraries are certainly frightening, but I don't think that ublas is among them -- its interface seems rather intuitive.
Are these facilities aimed at library writers rather than users whoes skill focus may lie in other domains?
Some very simple, self-explanatory ublas examples can be found in subdirectory libs/numeric/ublas/doc/samples of the boost distribution.
I expect that a user level doc may eventually make ublas/lapack much more approachable by a non computer science mathematician or engineer.
I agree. ublas documentation is good and detailed reference (when you know what you are looking for ;o), but a tutorial with few well commented examples is probably missing. Regarding bindings, I can only hope that I'll find some time in the near future to write some documentation (at least for traits classes and atlas bindings). In the meantime, there are some examples in libs/numeric/bindings/atlas/ (in the boost-sandbox); I tried to write meaningful comments ;o). Begin with ublas_gesv2.cc, then ublas_gesv.cc and ublas_posv.cc. Also, don't hesitate to ask further questions. (BTW, few days ago Toon updated bindings in the sandbox, so it now contains the latest version -- that is, the note on Wiki page is obsolete.)
Forgive me if the above sounds critical and ungracious for free software,
No, not at all ... I had (and have) the same problem with some other libraries ;o)
but I want to use boost and I am worried that it may be too demanding for its benefites to be justified for use by the group of people I work with.
AFAICS, the main problem is the lack of documentation, in particular tutorials and introductory examples. Also, as Joerg suggested, ``compiler diagnostics clearly are a pain sometimes'', but this is a problem with all templated code. Regards, fres
participants (3)
-
jhr.walter@t-online.de
-
Kresimir Fresl
-
stuart macgregor