From: Ralf W. Grosse-Kunstleve (rwgk_at_[hidden])
Date: 2002-01-20 08:31:28
> > FWIW: a few theoretical remarks (born out of very pragmatic needs):
> > ...
> > Memory management:
> > Ultimately any array is a one-dimensional array. Here I
> > see six important memory management policies that are
> > useful in different contexts:
> > - automatic allocation, fixed size() (aka boost::array)
> > - automatic allocation (fixed capacity), variable size()
> > - dynamic allocation, fixed size()
> > deep copy semantics
> > reference counted
> > - dynamic allocation, variable size() (e.g. std::vector)
> > deep copy semantics
> > reference counted
> isn't the difference between e.g. 'automatic allocation,
> fixed size' and 'dynamic allocation, fixed size' more a
> question of the way you instantiate it than a policy?
> (just to sync terminology, to make sure that I know
> what you're saying)
Do you mean
auto_fixed_size<int, 3> a;
auto_fixed_size<int, 3> *b = new auto_fixed_size<int, 3>;
To me it seems that the "new" statement throws you back to
the stone ages w.r.t. memory management.
smart_ptr is an improvement, but the expression for
allocation becomes horrendous, and access to the interface
of the embedded vector requires dereferencing.
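A minimal sketch of the two instantiation styles, assuming a hypothetical auto_fixed_size template (akin to boost::array) and using std::shared_ptr in place of the Boost smart pointers of the time:

```cpp
#include <cstddef>
#include <memory>

// Hypothetical fixed-size array with automatic storage (akin to boost::array).
template <typename T, std::size_t N>
struct auto_fixed_size {
    T elems[N];
    std::size_t size() const { return N; }
    T& operator[](std::size_t i) { return elems[i]; }
    const T& operator[](std::size_t i) const { return elems[i]; }
};

// Automatic allocation: one declaration, no cleanup.
inline int sum_on_stack() {
    auto_fixed_size<int, 3> a;
    a[0] = 1; a[1] = 2; a[2] = 3;
    return a[0] + a[1] + a[2];
}

// Dynamic allocation through a smart pointer: the allocation expression is
// verbose, and reaching the array's interface requires a dereference.
inline int sum_on_heap() {
    std::shared_ptr<auto_fixed_size<int, 3> > a(new auto_fixed_size<int, 3>);
    (*a)[0] = 1; (*a)[1] = 2; (*a)[2] = 3;
    return (*a)[0] + (*a)[1] + (*a)[2];
}
```

Both functions compute the same result; the heap version just pays the syntactic (and allocation) cost noted above.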
> > A slice is just a special access scheme, or a generalized
> > simple access scheme.
> aren't you mixing storage schemes and access schemes? After
> all, we want all matrices (sparse, skyline, dense, ...) to
> have the same interface (and thus to be accessible in the
> same way), but depending on the underlying storage scheme,
> one access method is faster than another, or comes with
> different complexity guarantees (which is actually similar
> to the STL).
My suggestions pertain only to dense arrays...
> BTW, talking about access methods: after the performance
> discussion about multi_array, I'm inclined to say that the
> generic access methods of a multi-dimensional array library
> are probably not performant enough for numerics without
> proper specialisations for the 1D and 2D cases.
> I think the linear-algebra world is big enough to justify
> this specialisation.
... because (1) we do not work with sparse arrays (yes, not
a good reason, but), (2) my instincts tell me that a uniform,
generic treatment of dense and sparse arrays will incur too
many performance penalties. However, I believe that a
uniform treatment of dense arrays is feasible.
> > A partial implementation of these ideas can be found in the
> > cvs tree at cctbx.sf.net (e.g. shared_storage.h, ndim.h,
> > loops.h; maps/ndim.h implements a specialized accessor;
> > naming is still a bit disorganized). We are currently
> > refining the design. I would be happy to share more details
> > if there is an interest.
> I'm interested. Does this mean that you have implemented yet
> another matrix library (not Blitz, MTL, or uBLAS)?
No. What I have implemented at this point is much less
ambitious than the packages you mention, and not even
accessible as a separate package. My ideas and the code
evolved with an application. This application is meant to
become a large software system, so I find it necessary to
generalize as much as possible without losing contact with
my "real" goals (which is the implementation of a system
for crystallographic computations).
Also, my previous message did not address operator
overloading at all. But see below.
Let me summarize some basic ideas to see how much you can relate to them:
- A multi-dimensional array is a one-dimensional array with
specialized access methods.
- The access method is passed as a template parameter to
the multi-dimensional array type.
- Reference counting (via boost::shared_array<char> or similar).
- Vector-type-casts supported by a memory_handle() method
which exposes the boost::shared_array<char>.
This enables (1) reuse of a memory block for different
purposes (I really need this), (2) using the same
array as an array of std::complex<T> or an array of T
(once we have Enhanced POD's this will even be portable).
- Iteration is provided via nested_loop (cctbx/loops.h).
=> The concept of iteration is separated from the
concept of indexing.
- Different types of memory management are unified via
a "reference" (currently called vecref and vecref_nd).
Benefits: (1) maximum performance; (2) reduced code bloat
(the compiler generates code only for vecref, not for four
different vector types).
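To make the memory_handle() idea concrete, here is a small sketch of a reference-counted raw memory handle (cf. boost::shared_array<char>), using std::shared_ptr with an array deleter. memory_block and as() are hypothetical names; as the post itself notes, reinterpreting the same block as T or std::complex<T> is not strictly portable without "Enhanced POD" guarantees:

```cpp
#include <complex>
#include <cstddef>
#include <memory>

// Hypothetical reference-counted raw memory handle.
// The same block of bytes can be viewed as different element types,
// enabling reuse of one allocation for different purposes.
class memory_block {
    std::shared_ptr<char> mem_;
    std::size_t bytes_;
public:
    explicit memory_block(std::size_t bytes)
        : mem_(new char[bytes], std::default_delete<char[]>()),
          bytes_(bytes) {}

    std::size_t size_in_bytes() const { return bytes_; }

    // View the block as an array of T. Not strictly portable in
    // general, but works in practice for layout-compatible types
    // such as T and std::complex<T>.
    template <typename T>
    T* as() { return reinterpret_cast<T*>(mem_.get()); }
};
```

For example, a block written as floats can then be read as std::complex<float>, whose layout is two consecutive floats.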
Access methods currently implemented:
dimension<N>: 0-based indexing (C and Fortran storage order)
dimension_p1<N>: 0-based with modulus operation (periodic lattice)
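A sketch of the central idea (a multi-dimensional array is a 1D array plus an access scheme passed as a template parameter). dimension_2, dimension_p1_2, and nd_array are hypothetical stand-ins, loosely modeled on the cctbx accessors named above, fixed to 2D for brevity:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 0-based row-major accessor (cf. dimension<2>).
struct dimension_2 {
    std::size_t nx, ny;
    dimension_2(std::size_t nx_, std::size_t ny_) : nx(nx_), ny(ny_) {}
    std::size_t size_1d() const { return nx * ny; }
    std::size_t operator()(std::size_t i, std::size_t j) const {
        return i * ny + j;
    }
};

// Variant with a modulus operation, for periodic lattices (cf. dimension_p1<2>).
struct dimension_p1_2 {
    std::size_t nx, ny;
    dimension_p1_2(std::size_t nx_, std::size_t ny_) : nx(nx_), ny(ny_) {}
    std::size_t size_1d() const { return nx * ny; }
    std::size_t operator()(std::size_t i, std::size_t j) const {
        return (i % nx) * ny + (j % ny);
    }
};

// The multi-dimensional array: a one-dimensional array with a
// specialized access method selected at compile time.
template <typename T, typename Accessor>
class nd_array {
    Accessor acc_;
    std::vector<T> data_;  // the underlying one-dimensional array
public:
    explicit nd_array(const Accessor& acc) : acc_(acc), data_(acc.size_1d()) {}
    T& operator[](std::size_t i) { return data_[i]; }  // 1-dim access
    T& operator()(std::size_t i, std::size_t j) {      // accessor access
        return data_[acc_(i, j)];
    }
};
```

With the periodic accessor, indexing past the bounds simply wraps around, while the plain accessor maps (i, j) to the usual row-major offset.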
Ideas for future:
- Enhance shared_storage to allow for dynamic growing.
(Think std::vector with reference counting.)
- Adapt methods for non-0-based indexing methods from multi_array.
- Have a consistent family of array types (with better names).
- Expose to Python: for ND we need a run-time equivalent of the
compile-time dimension<N>. Otherwise there will be too many
- At the moment I envision (have approximations of) four
different types for the four memory management models
(see previous message). Each of these should have /no/
operators defined except for operator[] (1-dim access)
and operator() (accessor access).
- There are several sensible groups of operators for
multi-dimensional arrays (elementwise, matrix,
reductions). Multiplying the number of sensible
groups by four and implementing a special type
for each combination would be a horror.
- A while ago I suggested the "using namespace" approach
for the selection of operators in a given scope.
This has not created much enthusiasm but, with the
means provided by ISO C++ (**), I still see it as
the only workable approach. It also enables the
definition and usage of custom algebras. Combined with
custom access schemes, this results in a highly
(**) See my next posting.
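A sketch of the "using namespace" approach for operator selection: the array type itself defines no operators, and each scope pulls in the operator group it wants via a using-declaration. The type and namespace names are hypothetical:

```cpp
#include <cstddef>

// A minimal array type with no operators of its own, as proposed above.
struct vec3 { double e[3]; };

// One group of operators: elementwise algebra.
namespace elementwise_algebra {
    inline vec3 operator*(const vec3& a, const vec3& b) {
        vec3 r;
        for (std::size_t i = 0; i < 3; ++i) r.e[i] = a.e[i] * b.e[i];
        return r;
    }
}

// Another group: an algebra where * is an inner product (reduction).
namespace inner_product_algebra {
    inline double operator*(const vec3& a, const vec3& b) {
        double s = 0;
        for (std::size_t i = 0; i < 3; ++i) s += a.e[i] * b.e[i];
        return s;
    }
}

// Each scope selects the algebra it wants, without multiplying
// the number of array types by the number of operator groups.
inline vec3 scaled(const vec3& a, const vec3& b) {
    using elementwise_algebra::operator*;
    return a * b;
}

inline double dot(const vec3& a, const vec3& b) {
    using inner_product_algebra::operator*;
    return a * b;
}
```

The same `a * b` expression means different things in the two functions, which is exactly the per-scope selection of an operator group described above.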
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk