Boost logo

Boost :

Subject: Re: [boost] [multi_array] Is there interest for an [index_list] library?
From: Pierre-Andre Noel (noel.pierre.andre_at_[hidden])
Date: 2011-04-11 16:15:10


Sorry for the slow reply...

> Does the functionality work atop the MultiArray concept or atop
> boost::multi_array and boost::multi_array_ref?

Most of the proposed code works "by itself" (without need for MultiArray
and/or boost::multi_array and related). Some constructors have the option of
receiving a model of the MultiArray concept in order to be "compatible" with
it. The iterator I was talking about in the earlier message is definitely
made to work with the MultiArray concept. Finally, for convenience, some
"utilities" directly refer to boost::multi_array when defining objects that
are likely to come up in the application.

(Note that what I propose are new concepts, not changes to already existing
ones.)

> One can linearly walk
> the contents of the latter using iterators foo.data() and foo.data() +
> foo.dimensionality. I think only the MultiArray concept would need
> further capabilities for a linear walk independent of the
> dimensionality (as .data() isn't part of the concept).

You probably meant foo.data() and foo.data() + foo.num_elements(). Yes, it
is true that you can visit all the elements that way. In fact, this is the
first example in the file sandbox/index_list/libs/index_list/examples.cpp .

> A "position-dependent" behavior can be achieved (atop multi_array and
> multi_array_ref) by taking the distance between foo.data() and the
> current iterator and then playing some games with foo.strides() or
> std::distance(). I believe the same can be achieved atop the pure
> MultiArray concept using foo.origin(). No need to extend the concept
> (though the required couple of lines would be frustratingly lengthy).

My "personal library" that is the ancestor of the current IndexList was
initially composed of those "frustratingly lengthy" couples of lines. With
time, I added functionalities. Since implementing simplex_domain was
"non-trivial", I thought that others could be interested: I cleaned up the
code and sent it to the sandbox before posting the previous message.

The "Domain" concept defined in the IndexList library generalizes the
conversion you are talking about. Passing a number in the range
[0,num_elements()-1] to the "expand()" member of a Domain returns a
collection of indices, and passing that collection of indices to its
"reduce()" member will return the original number. Given any collection of
indices, a Domain's "is_valid()" member will return true if this collection
could have been obtained from a call to "expand(i)" using "i" in range
[0,num_elements()-1], and false otherwise.

For box_domain, "reduce()" reproduce the same algorithm Boost.MultiArray
uses to locate an element in memory (using shape() and index_bases()),
"expand()" performs the opposite task as you earlier proposed and
"is_valid()" simply compare each element of the provided collection of
indices to index_bases and index_bases + shape.

> Aside from runtime dimension
> selection, what does 'box' add beyond the current extent_gen-based
> capabilities. (Compile-time) dimension-independent code can be
> written by templating on the MultiArray NumDims parameter
> (
http://agentzlerich.blogspot.com/2010/01/providing-fill-and-foreach-algorithms.html
> is an example).

Excluding runtime dimensionality, the sole advantages I see for data
organized in a "box" way are:
- it facilitate coding concerning the current position using "expand()",
- it lighten the treatment of "boundary conditions" using "is_valid()" (By
boundary conditions, I refer to the special treatment that is sometime
required, in the form of "ifs", around the edges of the MultiArray.) and
- for the cases where, although it is ok to store the data using a "box"
organization, the same data has to be accessed in a different way that is
not trivial to implement (The access could then be made using a different
domain than a box_domain.).

> Runtime-decisions about the dimensionality of a
> MultiArray definitely are interesting but I'm unimaginative and can't
> think of the use case...

For me it is quite common, but I guess scientific computing is the special
case... From my experience, runtime dimensionality is mostly required when
you have to read the data from a file of unknown dimensionality and when
"main(int argc, char *argv[])" receives arguments that affects the
dimensionality.

> I'm afraid I can't begin to speak intelligently about the simplex
> organization...

In two dimensions, a simplex_domain could have the following valid elements:
(0,0), (0,1), (1,0), (0,2), (1,1), (2,0), (0,3), (1,2), (2,1), (3,0). In
other words, the triangle contained within the box limited by (0,0) and
(3,3), with the further constrain that the sum of all the indices does not
exceed 3.

Note that one could simply use a multi_array(boost::extents[4][4]) and
discard the unused elements. For the simple example shown, proceeding that
way could probably be an ok solution, since there would only be 6 unused
elements out of a total of 16 elements in the MultiArray. For larger domains
in two dimensional space, the limiting behaviour is that 50% of the elements
will have to be unused.

However, if you have to represent data that follow such a simplex topology
in high dimensional space, the difference grows much larger between the
required simplex and the smallest box in which this simplex "fits in". This
is covered in the example file.

> I don't think I'm your target audience so take all of this with a
> grain of salt. I very much appreciate your attempts to further
> improve the MultiArray concept as I use it heavily in my own work.
> Also, just cause I find it interesting, ndarray is a fairly cool
> alternative (https://code.google.com/p/ndarray/).

Thanks for the link!


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk