The GSoC 2018 project titled "Adding tensor support" has been successfully completed. Boost.uBLAS may support tensors in the future. The code, project and documentation can be found here and here.

The tensor template class is parametrized in terms of data type, storage format (first- or last-order), and storage type (e.g. std::vector or std::array):

template<class T, class F=first_order, class A=std::vector<T,std::allocator<T>>>
class tensor;

An instance of the tensor template class has a dynamic rank (number of dimensions) and dynamic dimensions, both held by a shape class. The shape class is an adaptor of std::vector, and the rank is its size:

// {3,4,2} could be runtime variables of an integer type.
auto A = tensor<float>{make_shape(3,4,2)};
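
To illustrate what "an adaptor of std::vector where the rank is the size of it" means, here is a minimal sketch. The name shape_sketch and its member functions are hypothetical and only for illustration; they are not the actual shape class of the project.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical stand-in for the shape class: a thin adaptor over
// std::vector whose size is the (dynamic) rank of the tensor.
class shape_sketch {
public:
    shape_sketch(std::initializer_list<std::size_t> extents)
        : extents_(extents) {}

    // The rank is simply the number of stored extents.
    std::size_t rank() const { return extents_.size(); }

    // Extent of the i-th dimension.
    std::size_t operator[](std::size_t i) const {
        assert(i < extents_.size());
        return extents_[i];
    }

    // Number of tensor elements: the product of all extents
    // (1 for an empty shape in this sketch).
    std::size_t num_elements() const {
        std::size_t n = 1;
        for (std::size_t e : extents_) n *= e;
        return n;
    }

private:
    std::vector<std::size_t> extents_;
};
```

With shape_sketch{3,4,2} the rank is 3 and the element count is 24; both are only known at runtime.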

---------------------
---------------------

I am thinking of redesigning the tensor template class so that the rank is a compile-time parameter:

template<class T, std::size_t N, class F=first_order<N>, class A=std::vector<T,std::allocator<T>>>
class tensor;

An instance of a tensor template class could be generated as follows:
// {3,4,2} could be runtime variables of an integer type.
auto A = tensor<float,3>(make_shape(3,4,2));

This instantiation could definitely be improved. However, having a static rank has the following advantages and disadvantages:

-------------

Advantages:
  1. The runtime of basic tensor operations improves by about 5% to 30% (according to my findings, this depends on the length of the innermost loop).
  2. Different tensor types can be distinguished statically at compile time: tensor<float,3> is a different type than tensor<float,4>. If so, why not define matrix as an alias:

template <class type, class format, class storage>
using matrix = tensor<type,2,format,storage>;

We would only need to specify and implement one data structure, 'tensor', and, if needed, provide optimized functions for matrices. This simplifies maintenance.
There might also be advantages in terms of subtensor and iterator support. However, implementing them will be harder.
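
A minimal sketch of point 2, assuming a simplified tensor with std::vector storage. The names tensor_sketch, matrix_sketch and first_order_tag are made up for this example and are not the proposed interface:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

struct first_order_tag {};  // hypothetical placeholder for the storage format

// Simplified tensor with a compile-time rank N. The storage type A is
// assumed to be constructible from an element count (e.g. std::vector).
template <class T, std::size_t N, class F = first_order_tag,
          class A = std::vector<T>>
class tensor_sketch {
public:
    explicit tensor_sketch(std::array<std::size_t, N> const& extents)
        : extents_(extents), data_(count(extents)) {}

    static constexpr std::size_t rank() { return N; }
    std::size_t size() const { return data_.size(); }

    // Linear element access, bounds-checked in debug builds.
    T& operator[](std::size_t i) {
        assert(i < data_.size());
        return data_[i];
    }

private:
    static std::size_t count(std::array<std::size_t, N> const& e) {
        std::size_t n = 1;
        for (std::size_t x : e) n *= x;
        return n;
    }

    std::array<std::size_t, N> extents_;
    A data_;
};

// A matrix is then just the rank-2 case of the same class template.
template <class T, class F = first_order_tag, class A = std::vector<T>>
using matrix_sketch = tensor_sketch<T, 2, F, A>;

// Different ranks give different types, checked at compile time.
static_assert(!std::is_same<tensor_sketch<float, 3>,
                            tensor_sketch<float, 4>>::value,
              "rank is part of the type");
static_assert(std::is_same<matrix_sketch<float>,
                           tensor_sketch<float, 2>>::value,
              "matrix is an alias for a rank-2 tensor");
```

Optimized matrix functions could then be provided as overloads or specializations for the rank-2 case, while everything else shares one implementation.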

---------
Disadvantages:
  1. The implementations become more complicated especially for tensor multiplications and tensor reshaping.
  2. With static rank the interfaces are harder to use (setting the rank as a template parameter).
  3. The number of contracted dimensions must be known at compile time. Therefore, implementing some tensor algorithms would only be possible with template specialization instead of simple for loops. Writing algorithms becomes more difficult.
Although Eigen and Boost.MultiArray decided on a compile-time rank, it might be a critical point for uBLAS.
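
To make disadvantage 3 concrete: contracting q dimensions of a rank-N tensor with a rank-M tensor gives a result of rank N + M - 2q. With a static rank, the result type depends on q, so q has to be a template parameter. The following is only a sketch on extents; contracted_rank and contracted_extents are hypothetical names, and for simplicity I assume that the last q modes of the first tensor are contracted with the first q modes of the second.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Rank of the result when Q index pairs are contracted between
// a rank-N and a rank-M tensor.
template <std::size_t N, std::size_t M, std::size_t Q>
struct contracted_rank {
    static_assert(Q <= N && Q <= M,
                  "cannot contract more dimensions than a tensor has");
    static constexpr std::size_t value = N + M - 2 * Q;
};

// Extents of the result. Assumption of this sketch: the LAST Q modes of a
// are contracted with the FIRST Q modes of b. Q must be given explicitly
// as a template argument because it determines the return type.
template <std::size_t Q, std::size_t N, std::size_t M>
std::array<std::size_t, contracted_rank<N, M, Q>::value>
contracted_extents(std::array<std::size_t, N> const& a,
                   std::array<std::size_t, M> const& b)
{
    // Contracted extents have to match pairwise.
    for (std::size_t i = 0; i < Q; ++i)
        assert(a[N - Q + i] == b[i]);

    std::array<std::size_t, contracted_rank<N, M, Q>::value> r{};
    std::size_t k = 0;
    for (std::size_t i = 0; i + Q < N; ++i) r[k++] = a[i];  // free modes of a
    for (std::size_t j = Q; j < M; ++j) r[k++] = b[j];      // free modes of b
    return r;
}
```

For example, contracted_extents<1> applied to extents {3,4,2} and {2,5} yields a rank-3 result with extents {3,4,5}. With a dynamic rank, q could be an ordinary function argument and the result extents an ordinary std::vector.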

I am working on this right now, and I am also trying to support all p! linear storage formats as a compile-time parameter, where p is the rank of the tensor. Unit testing actually becomes very hard, as I cannot use fixtures so easily. Supporting both static and dynamic rank would be a maintenance nightmare.
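
As a rough illustration of where the p! comes from: a linear storage format can be described by a permutation of the p modes that says which mode varies fastest, second fastest, and so on. The sketch below uses runtime arrays instead of a compile-time parameter to keep it short. The names strides_for and count_layouts are hypothetical, and I assume here that first-order means the first mode varies fastest.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>

// Strides for a storage format given as a permutation of the modes:
// layout[0] is the fastest-varying mode, layout[P-1] the slowest.
template <std::size_t P>
std::array<std::size_t, P>
strides_for(std::array<std::size_t, P> const& extents,
            std::array<std::size_t, P> const& layout)
{
    std::array<std::size_t, P> w{};
    std::size_t step = 1;
    for (std::size_t r = 0; r < P; ++r) {
        assert(layout[r] < P);
        w[layout[r]] = step;          // stride of the r-th fastest mode
        step *= extents[layout[r]];   // the next mode jumps over all of it
    }
    return w;
}

// Counts the possible storage formats for rank P by enumerating
// all permutations of the modes, which gives P! formats.
template <std::size_t P>
std::size_t count_layouts()
{
    std::array<std::size_t, P> layout;
    std::iota(layout.begin(), layout.end(), std::size_t{0});
    std::size_t n = 0;
    do { ++n; } while (std::next_permutation(layout.begin(), layout.end()));
    return n;
}
```

For extents {3,4,2}, the layout {0,1,2} (first-order under the assumption above) gives strides {1,3,12}, and the layout {2,1,0} (last-order) gives {8,2,1}. For rank 3 there are 6 such formats in total.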

Cheers
C