Boost logo

Boost :

Subject: [boost] Can anyone take a quick read of an array type?
From: Daryle Walker (darylew_at_[hidden])
Date: 2013-05-04 06:04:33


I banged out a little something at <https://github.com/CTMacUser/ArrayMD>.

The "boost::container::array_md<T, N>" class template can be used like the
(C++11) std::array. It has the same members (value_type, size_type,
difference_type, reference, const_reference, pointer, const_pointer,
iterator, const_iterator, reverse_iterator, const_reverse_iterator, at,
front, back, operator[], data, begin, cbegin, end, cend, rbegin, crbegin,
rend, crend, empty, size, max_size, fill, and swap). It has external
functions operator==, operator!=, and swap. It fulfills the Container and
Reversible-Container archetypes, with random-access iterators.

It lacks:
* support for 0 as the length. It combines the type and extent directly as
the internal array's type expression, so the compiler will complain if
zero-length arrays are used. (You also can't use a function type,
reference, or class-type that's marked abstract as element types.) But
front, back, and data will always be valid.
* The ordered-comparison operators (<, >, <=, and >=).
* The get access function, plus helper specializations for std::tuple_size
and std::tuple_element.

It adds:
* 3 class-static constants: dimensionality (= 1), static_size (= N), and
static_sizes (a one-element array with value N).
* 2 type aliases: direct_element_type (= value_type) and data_type (=
direct_element_type[N]).
* If value_type is an array type, fill still works, by doing assignments at
the inner non-array level.
* The apply member functions, that call a function over each element,
including the element and its index coordinate with each call.
* operator(), which works just like operator[].
* As many member functions work with constexpr and/or noexcept as possible.

Array_md is an aggregate type. It is trivially-copyable, trivial,
standard-layout, and/or POD if the element type is. It has default-, copy-,
move-construction, destruction, copy-, and/or move-assignment if the element
type does. (Default-initialization follows the element type's lead if its
code-ful or random garbage bits.) Initialize it like a structure with a
single member object of type data_type.

Indicated by the "md" suffix, the array modeled by this class template can
be multi-dimensional. Unlike C-level arrays, which are all nested one-level
arrays, this class template lets you express multi-dimensionality flatly.
And the arguments are compatible with the comma-separated lists used in
variadic stuff.

When you add another extent M (BEFORE the others):
* ++dimensionality
* static_size *= M
* static_sizes -> { old static_sizes, M }
* direct_element_type = old data_type
* data_type = direct_element_type[ M ]
* The apply method sends the remote function each element along with all of
its index coordinates (dimensionality + 1 arguments total).
* operator[] still takes one argument, and returns a direct_element_type&.
You can continue to chain-index to the element type just like nested
built-in arrays.
* operator() and at can take as many arguments as dimensionality. Using
that many indices gives an element, using less gives a reference to an
intermediate array (which you can use the built-in operator[] on). If you
use no indices, then you get a reference to the entire implementation
C-level array returned. (I originally had operator() put in since
operator[] can't handle multiple direct arguments.)

You can crank the recursion backwards, and have an array with no extents.
Such a type supports one element (NOT zero, since one is the multiplicative
identity). That's another reason not to special-case zero-value extents in,
to keep the size non-decreasing with new extents. This degenerate case
doesn't support static_sizes (since it would be an illegal zero-length
array) or operator[] (since there's no array member object).

Added late to the cycle:
When perusing the C++11 standard, I noticed that operator[] can take a
std::initializer_list argument, although the core language never uses it;
it's only for UDTs. I added support for that. I added support for it in
the operator() and at methods, for symmetry. The length of the list must
equal dimensionality.

To-do:
Should I implement get, tuple_size, tuple_element, and the
ordered-comparison operators? They're very linear-oriented. I'm not sure
they're used that much. And I'm not sure what they would even mean for a
non-linear container; I may have to apply some sloppy linearity for them to
be defined. (I don't think I could make them (get, tuple_size, and
tuple_element) multi-coordinate; then they couldn't work with generic code.)

I was trying to avoid any linearity, and depend on apply for
iteration-related tasks. Then I decided to support the range-for statement.
Then it looked weird to have a sloppy & minimal begin/end tacked on, so I
put in a full iteration set. Then at that point, I put in so many things
like std::array that I put everything else in so they could have source
compatibility.

(BTW, there are several utility/helper classes and functions. Some are in
separate headers, the others near the start of "array_md.hpp".)

Versus Boost.MultiArray:
(This is from a quick look at BMA.) BMA statically declares the element
type and the number of dimensions; the actual extent values (and therefore
the element count) and the elements themselves are determined/allocated at
run-time. It's like std::vector vs. std::array, the former is more
powerful, but may be overkill that limits compile-time opportunities, while
the latter can take advantage of compile-time optimizations and can shift
memory usage from the heap to the stack.

Versus Brian J Smith's Array (at <https://github.com/BrianJSmith/Array>), on
the review queue:
* (Used to support extent values of 0, but has been removed).
* Requires at least one extent, while array_md supports 0 extents.
* Implements multi-dimensions as maps::array objects nesting each other,
similar to how C-level arrays are made multi-dimensional. The array_md
model computes the nested C-level array type and stores that directly. This
has big consequences, because of padding. Class types may add padding bytes
before(?), between, and after their non-static member data sub-objects.
Built-in array types never have padding. So maps::array may end up with
lots of padding bytes scattered through it, while array_md has one set of
padding and its elements contiguous. This makes implementing the data
method possible; it can't work with maps::array unless we get lucky and the
compiler doesn't add any padding. (The data method isn't provided by
maps::array anyway; the "data" name is used differently.)
* Due to the above, maps::array matches nested C-level arrays in that
range-for iterates over the immediately-lower level of elements. This is OK
when dimensionality is 1, but higher dimensions can't dig into the element
type through range-for. The array_md class's range-for support is at the
element level, due to being able to define the data method.
* There are operators * (unary) and + (binary) overloads. They seem more
for iterators than containers; then I figured out that he's simulating the
effects of array-to-pointer conversions, which make these operations
possible. That decay action is something we shouldn't be exalting. (And
there isn't a reversed version for offset + container.)
* There's no option to input all the indices at once for "at," instead
maps::array needs to chain them. (There's no throw-less version called
operator(), needed due to operator[] being limited to one argument.)
(There's no initializer-list variants either; but the maps::array code has
no C++11 features besides variadic templates.)
* There is stream-output code.
* There is ordered-relations operator (< > <= >=) code.
* Revealing the values for the dimensions outside of the template header
differ.
* Actually has documentation, as opposed to Doxygen comments.

Daryle W.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk