Boost logo

Boost Users :

Subject: Re: [Boost-users] [multi_array] Why the precondition on assignment?
From: alfC (alfredo.correa_at_[hidden])
Date: 2009-02-27 19:42:12


> > something like
> > A.reserve({{shape1,shape2,shape3}}); // or just reserve
> > (shape1*shape2*shape3)
>
> Would this still make it necessary to write an operator= in a class
> that has a multi_array as a member?

Yes you will still need to write such operator, but at least it *can*
be done efficiently (i.e. without useless copying that is performed if
you resize (with A.resize()) the matrix first).

I proposed and I had been thinking for a long time in a "reserve"
method keyword for two reasons:

* First, because I think that, with the current design, it is the only
way to implement a resize-and-assign efficiently, then you can wrap
this into a new object that has your desired semantics (I propose you
to call it small_multi_array for example).

* Second, for some numerical libraries I sometimes need to allocate
extra storage beyond the end of the multi_array!! And it is not an
esoteric numerical library, it is the famous FFTW3, which for the MPI
version needs to allocate some more space than the one needed for the
multidimensional array because of the algorithmic requirements.
http://www.fftw.org/fftw3.3alpha_doc/Simple-MPI-example.html#Simple-MPI-example.
My current approach is very dirty, I have an object that has a
std::vector v which is only used to "reserve" enough space and a
multi_array_ref with proper dimensionality and shape which "points" to
&v[0]. In some situations (because of needs of FFTW3) the size (or
reserved space) of the vector is larger than the one needed to
reference all the array indexes. In this way I have total control of
the allocated memory of the multi_array (which is not a multi_array
anymore but a multi_array_ref). I have to do that just because there
is no "reserve".

As you see you are not the only one having to do dirty tricks with
multi_array. At this point you may ask, why using multi_array at all?
well, I still find very convenient other features of the library like
straightforward indexing, index bases, strides, subarrays and
arrayviews.

> That's the main thing I'd like to avoid as it introduces a maintenance burden to ensure all the other
> members are copied.

Yeamm, sorry, it doesn't solve that problem. But at least with
'reserve' it *can* be solved.

Let's see, lets try to find a solution at a higher level. Programs
that deal with arrays, for example Matlab don't have this problem on
assignment (op=) because they use copy-on-write for arrays in the
first place. Which means that nothing is reallocated or copied on
calling operator= but then when *at least one* element is modified the
reallocation and copy happens together (like with the current resize
()! ). BTW, any body know what is the underlying Fortran strategy?
Maybe we should stop thinking about "improving" the MultiArray and
taking it as given, and start thinking on adapting it nicely (with a
small layer of code on top of MultiArray) into a shared/copy-on-write
type pointer that resembles as much as possible the multi_array. It is
not that I know what to do exactly, I am trying to think out loud (and
hoping the smart Boost developers to hear).

Going back to your specific problem: If the arrays you are handling
are big, didn't you think of keeping the multi_array in a shared_ptr
(or a copy-on-write sort of thing) that is a member of your class. If
the arrays are small and you can afford reallocation and spurious
copies then wrap the multi_array in a small_multi_array class with the
expected semantics.

> Because of that I think multi_array::operator= must be the exception
> to a strategy on avoiding reallocations. If the existing space can be
> used, then all well and good - have that optimisation.

Given the current design: yes, I agree.

But also remember that MultiArray is not only about multi_array, there
are many other classes in the library where operator= still will work
with the "restricted" semantics; for example, subarrays can still be
assigned but the sizes have to match. For example

multi_array<double, 2> A(extents[5][5]);
multi_array<double, 1> B(extents[5]);
multi_array<double, 1> C(extents[4]);

...currently you can do:
A[3]=B; // A[3] is of type subarray

but you can't do:
A[3]=C; // asserts false in the same way as mismatched multi_arrays

Should we complain about that too because it is an operator= call that
fails in some cases?

Alfredo


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net