
From: Ralf W. Grosse-Kunstleve (rwgk_at_[hidden])
Date: 2002-01-20 08:31:28


> > FWIW: a few theoretical remarks (born out of very pragmatic needs):
> > ...
> > Memory management:
> > Ultimately any array is a one-dimensional array. Here I
> > see six important memory management policies that are
> > useful in different contexts:
> > - automatic allocation, fixed size() (aka boost::array)
> > - automatic allocation (fixed capacity), variable size()
> > - dynamic allocation, fixed size()
> > deep copy semantics
> > reference counted
> > - dynamic allocation, variable size() (e.g. std::vector)
> > deep copy semantics
> > reference counted
>
> Isn't the difference between e.g. 'automatic allocation,
> fixed size' and 'dynamic allocation, fixed size' more a
> question of the way you instantiate it than a policy?
> (Just to sync terminology, to make sure that I know
> what you're saying.)

Do you mean
    auto_fixed_size<int, 3> a;
vs.
    auto_fixed_size<int, 3> *a = new auto_fixed_size<int, 3>;
?
To me it seems that the "new" statement throws you back to
the stone ages w.r.t. memory management.
smart_ptr is an improvement, but the expression for
allocation becomes horrendous, and access to the interface
of the embedded vector requires dereferencing.

> > A slice is just a special access scheme, or a generalized
> > simple access scheme.
>
> Aren't you mixing storage schemes and access schemes? After
> all, we want all matrices (sparse, skyline, dense, ...) to
> have the same interface (and thus be accessible in the
> same way), but depending on the storage scheme underneath,
> one access method provides faster access (or different
> complexity guarantees) than another, which is actually
> similar to the STL.

My suggestions pertain only to dense arrays...
 
> BTW talking about access methods. After the performance
> discussion about multi_array, I'm inclined to say that the
> generic access methods for a multi-dimensional array lib
> are probably not performant enough for numerics without the
> proper specialisations for the 1D and 2D case.
>
> I think the linear-algebra world is big enough to justify
> this specialisation.

... because (1) we do not work with sparse arrays (yes, not
a good reason, but), (2) my instincts tell me that a uniform,
generic treatment of dense and sparse arrays will incur too
many performance penalties. However, I believe that a
uniform treatment of dense arrays is feasible.
 
> > A partial implementation of these ideas can be found in the
> > cvs tree at cctbx.sf.net (e.g. shared_storage.h, ndim.h,
> > loops.h; maps/ndim.h implements a specialized accessor;
> > naming is still a bit disorganized). We are currently
> > refining the design. I would be happy to share more details
> > if there is an interest.
>
> I'm interested. Does this mean that you have implemented another
> matrix-library (no blitz, mtl, ublas) ?

No. What I have implemented at this point is much less
ambitious than the packages you mention, and not even
accessible as a separate package. My ideas and the code
evolved with an application. This application is meant to
become a large software system, so I find it necessary to
generalize as much as possible without losing contact with
my "real" goals (namely the implementation of a system
for crystallographic computations).
Also, my previous message did not address operator
overloading at all. But see below.

Let me summarize some basic ideas to see how much you can relate
to them:

  - A multi-dimensional array is a one-dimensional array with
    specialized access methods.
  - The access method is passed as a template parameter to
    the multi-dimensional array type.
  - Reference counting (via boost::shared_array<char> or similar).
  - Vector-type-casts supported by a memory_handle() method
    which exposes the boost::shared_array<char>.
    This enables (1) reuse of a memory block for different
    purposes (I really need this), (2) using the same
    array as an array of std::complex<T> or an array of T
    (once we have Enhanced PODs this will even be portable).
  - Iteration is provided via nested_loop (cctbx/loops.h).
    => The concept of iteration is separated from the
       concept of indexing.
  - Different types of memory management are unified via
    a "reference" (currently called vecref and vecref_nd).
    Benefits: (1) maximum performance; (2) reduced code bloat
    (the compiler generates code only for vecref, not for four
    different vector types).

Access methods currently implemented:
  dimension<N>: 0-based indexing (C and Fortran storage order)
  dimension_p1<N>: 0-based with modulus operation (periodic lattice)

Ideas for future:
  - Enhance shared_storage to allow for dynamic growing.
    (Think std::vector with reference counting.)
  - Adapt methods for non-0-based indexing methods from multi_array.
  - Have a consistent family of array types (with better names).
  - Expose to Python: for ND we need a run-time equivalent of the
    compile-time dimension<N>. Otherwise there will be too many
    Python types.

Operator overloading:
  - At the moment I envision (have approximations of) four
    different types for the four memory management models
    (see previous message). Each of these should have /no/
    operators defined except for operator[] (1-dim access)
    and operator() (accessor access).
  - There are several sensible groups of operators for
    multi-dimensional arrays (elementwise, matrix,
    reductions). Multiplying the number of sensible
    groups by four and implementing a special type
    for each combination would be a horror.
  - A while ago I suggested the "using namespace" approach
    for the selection of operators in a given scope.
    This has not created much enthusiasm but, with the
    means provided by ISO C++ (**), I still see this as
    the only workable approach. It also enables the
    definition and usage of custom algebras. Combined with
    custom access schemes, this results in a highly
    reusable system.

Ralf

(**) See my next posting.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk