Subject: Re: [ublas] Deciding on tensor parameters
From: Cem Bassoy (cem.bassoy_at_[hidden])
Date: 2018-09-13 18:06:04
On Thu, 13 Sep 2018 at 18:12, Stefan Seefeld via ublas <
ublas_at_[hidden]> wrote:
> Hi Cem,
>
> thanks for sending this out !
>
> On 2018-09-13 11:34 AM, Cem Bassoy via ublas wrote:
>
> The GSoC 2018 project titled "Adding tensor support" has been
> successfully completed. Boost.uBLAS may support tensors in the future. The
> code, project and documentation can be found here
> <https://github.com/BoostGSoC18/tensor> and here
> <https://github.com/BoostGSoC18/tensor/wiki/Documentation>.
>
> The tensor template class is parametrized in terms of data type, storage
> format (first- or last-order), storage type (e.g. std::vector or
> std::array):
>
>
> (Minor nit-pick: it's a class template. There is no such thing as
> "template classes" in C++ :-). I know the existing ublas docs are full of
> that spelling...)
>
>
Actually, there is such a thing as a template class, which is said to be an
instantiation of a class template; see https://isocpp.org/wiki/faq/templates.
However, I used the term incorrectly here :-)
> template<class T, class F=first_order, class A=std::vector<T,std::allocator<T>>>
> class tensor;
>
>
> An instance of a tensor template class has dynamic rank (number of
> dimensions) and dimensions using a shape class that holds the data. It
> is an adaptor of std::vector, where the rank is its size:
>
> // {3,4,2} could be runtime variables of an integer type.
>
> auto A = tensor<float>{make_shape(3,4,2)};
> ------------------------------------------
>
> I am thinking of redesigning the tensor class template so that the rank is
> a compile-time parameter:
>
> template<class T, std::size_t N, class F=first_order<N>, class A=std::vector<T,std::allocator<T>>>
> class tensor;
>
>
> An instance of a tensor template class could be generated as follows:
>
> // {3,4,2} could be runtime variables of an integer type.
> auto A = tensor<float,3>(make_shape(3,4,2));
>
>
> This instantiation could definitely be improved. However, having a static
> rank has the following advantages and disadvantages:
>
> -------------
>
> *Advantages*:
>
> 1. improves the runtime of basic tensor operations by between 5% and 30%
> (according to my findings, the gain depends on the length of the
> innermost loop).
> 2. ability to statically distinguish between different tensor types at
> compile time. tensor<float,3> is a different type than tensor<float,4>. If
> so, why not define matrix as an alias:
>
>
> template <class type, class format, class storage>
> using matrix = tensor<type,2,format,storage>;
>
> We would only need to specify and implement one data structure ' tensor '
> and if needed provide optimized functions for matrices. This simplifies
> the maintenance.
>
>
> A big advantage (which has been my main motivation for pushing for this
> solution) is that such a scenario would be fully in line with the existing
> Boost.uBLAS API, so your work becomes a natural extension of what we
> already have.
>
I think just the contrary is the case. There would be no extension of the
old dense matrix class template: the new alias template would replace
the old one, because we cannot have the same identifier twice in the same
namespace. In the above case, we would need to port all vector and matrix
functions to the new tensor type. The vector and matrix class templates are
not alias templates but distinct class templates. If I am not mistaken,
adding the tensor as a class template, as it is right now, would be the
uBLAS way.
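
To make the alias point concrete, here is a minimal sketch. All names
(sketch::tensor, sketch::matrix, legacy::matrix) are hypothetical and are
not the actual uBLAS or GSoC code. It only shows that an alias template
names an existing tensor specialization, while a distinct class template
called matrix has to live in a different namespace:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical names throughout; this is not the uBLAS or GSoC tensor code.
namespace sketch {

struct first_order {};

// Static-rank tensor: the rank N is part of the type.
template <class T, std::size_t N, class F = first_order, class A = std::vector<T>>
class tensor {
public:
    static constexpr std::size_t rank() { return N; }
private:
    A data_;
};

// Alias template: matrix<T> is not a new type, it is tensor<T, 2, ...>.
template <class T, class F = first_order, class A = std::vector<T>>
using matrix = tensor<T, 2, F, A>;

} // namespace sketch

namespace legacy {

// A distinct class template named matrix. Declaring it inside namespace
// sketch next to the alias above would be a redeclaration error.
template <class T>
class matrix {};

} // namespace legacy
```

With this layout sketch::matrix<float> and sketch::tensor<float,2> are the
same type, so every function written for the old matrix class template would
have to be ported to tensor.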
>
> Alternatively, if you keep the rank a runtime parameter, you are basically
> proposing an entirely new API, which means that Boost.uBLAS users will have
> to decide whether to use the old or the new API, which I'm afraid will
> result in a fragmentation of the community. Likewise, many existing
> operations only support existing vector and matrix types, so maintainers
> will have more work to do to support both APIs.
>
> That, to me as library maintainer, is a very high cost, so I'm reluctant
> to such a change, even if the proposed API with runtime ranks is otherwise
> sound.
>
Yes, I agree with you on that point.
>
> Also there might be advantages in terms of subtensor and iterator support.
> However implementing them will be harder.
>
> ---------
> *Disadvantages*:
>
> 1. The implementations become more complicated especially for tensor
> multiplications and tensor reshaping.
>
>
> I have worked on a BLAS library with compile-time constant ranks. And
> while capturing parameters such as ranks in the type system itself can
> indeed be a bit of a challenge, I think it's definitely doable, and may
> even lead to clearer code down the road.
>
Yes I agree.
>
>
>
>
>
> 1. With static rank the interfaces are harder to use (setting the rank
> as a template parameter).
>
>
>
>
> That depends on the use case. It simply means that you have to think about
> the rank slightly differently, while writing code.
>
> (It could simply mean that you have to drag along an additional template
> parameter, if you want to write generic code. But as I mentioned above,
> this could arguably lead to clearer code, so I consider this a feature, not
> a bug. :-) )
>
Yes, I also do not consider it a bug :-). For us as library designers and
C++ 'experts', this might be a nice feature.
But 'normal' users, not library designers, especially those who are used to
MATLAB, Python, Octave, Scilab etc., are not used to worrying about
template parameters. There still might be a way to instantiate tensors
elegantly. However, programming and implementing tensor algorithms
definitely becomes harder with template specialization or with if constexpr.
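
As a rough illustration of the extra template parameter, here is a sketch
of a stride computation in both styles. The function names are hypothetical,
and I am assuming a first-order layout in which the first index varies
fastest:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Runtime rank: the rank is simply extents.size().
inline std::vector<std::size_t>
strides_dynamic(const std::vector<std::size_t>& extents) {
    std::vector<std::size_t> s(extents.size(), 1);
    for (std::size_t k = 1; k < extents.size(); ++k)
        s[k] = s[k - 1] * extents[k - 1];
    return s;
}

// Compile-time rank: the same loop, but N has to be carried along as a
// template parameter by every caller that wants to stay generic.
template <std::size_t N>
std::array<std::size_t, N>
strides_static(const std::array<std::size_t, N>& extents) {
    std::array<std::size_t, N> s{};
    for (std::size_t k = 0; k < N; ++k)
        s[k] = (k == 0) ? 1 : s[k - 1] * extents[k - 1];
    return s;
}
```

For the shape {3,4,2} both versions yield the strides {1,3,12}. The loop body
is the same; the difference is what the user has to write at the call site.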
>
>
> 1. The number of contracted dimensions must be known at compile time.
> Therefore, implementing some tensor algorithms would only be possible with
> template specialization instead of simple for loops. Writing algorithms
> becomes more difficult.
>
>
> Right.
>
> Although Eigen and Boost.MultiArray decided for compile time, it might be
> a critical point for uBLAS.
>
> I am working on this right now, and I am also trying to support all p!
> linear storage formats as a compile-time parameter, where p is the rank of
> the tensor. Actually unit-testing becomes very hard as I am not able to use
> fixtures so easily. Supporting static and dynamic rank will be a
> maintenance nightmare.
>
>
> Yeah, the parameter space to cover grows exponentially. But that is true
> no matter whether the rank is determined at compile-time or at runtime. The
> difference is only in whether you use normal functions or meta-functions to
> compute derived ranks, storage formats, et al.
>
Well, yes. I experienced difficulty not only in covering the parameter space
but also in setting up the unit tests with fixtures when using function
templates. But that might not be the main issue here.
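
One possible workaround, sketched with hypothetical helper names
(for_each_rank, record_rank) and not taken from the actual test suite, is to
run the same test body once per static rank by passing the rank as a
std::integral_constant:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Calls f(std::integral_constant<std::size_t, R>{}) once for every listed
// rank R, in the order given.
template <std::size_t... Ranks, class F>
void for_each_rank(F f) {
    // A braced initializer list evaluates its elements left to right.
    int expand[] = {0, (f(std::integral_constant<std::size_t, Ranks>{}), 0)...};
    (void)expand;
}

// Hypothetical test body: it only records the rank it was called with. A
// real test would construct a rank-R tensor here and check it.
struct record_rank {
    std::vector<std::size_t>* seen;

    template <std::size_t R>
    void operator()(std::integral_constant<std::size_t, R>) const {
        seen->push_back(R);
    }
};
```

A call such as for_each_rank<1,2,3>(record_rank{&seen}) then visits the ranks
1, 2 and 3 in turn. It does not replace proper fixtures, though.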
Cheers
C