In many places in my algorithms, I am stacking small vectors.  I need semantics where I can call functions like the following:

template<std::size_t N, std::size_t M>
bounded_vector<double, N+M> my_func(const bounded_vector<double, N>& v1, const bounded_vector<double, M>& v2)
{
  // e.g. called with a bounded_vector<double, 2> and a bounded_vector<double, 3>
  //.... stuff
  return another_func(stack_vec(v1, v2));
}

My stack_vec (specialized for bounded_vector) is what you would expect:

template<class T, std::size_t N, std::size_t M>
ublas::bounded_vector<T, N+M> stack_vec(const ublas::bounded_vector<T, N>& v1, const ublas::bounded_vector<T, M>& v2)
{
  // The size parameters are std::size_t to match bounded_vector's own
  // template parameter, so that N and M can be deduced from the arguments.
  ublas::bounded_vector<T, N+M> result;
  subrange(result, 0, v1.size()) = v1;
  subrange(result, v1.size(), v1.size() + v2.size()) = v2;
  return result;
}

And a more general version, but which requires a dynamically allocated temporary:

template<class Vector_T1, class Vector_T2>
ublas::vector<double> stack_vec2(const ublas::vector_expression<Vector_T1>& v1, const ublas::vector_expression<Vector_T2>& v2)
{
std::size_t s1 = v1().size();
std::size_t s2 = v2().size();

ublas::vector<double> result(s1 + s2);

subrange(result, 0, s1) = v1;
subrange(result, s1, s1 + s2) = v2;
return result;
}


I am worried about the number of temporaries this creates: one inside stack_vec, another for the result computed within another_func (because it is unlikely to be able to use the return value optimization), and then a vector copy at the caller of my_func.  Obviously I can't get rid of them all, but it sure would be nice to get rid of some of them.
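To make the worry concrete, here is a toy sketch that counts copies and moves through the same call chain. It does not use ublas: counted_vec and stack are made-up stand-ins, and the doubling in another_func is just a placeholder transformation. As far as I understand, the named locals are candidates for the return value optimization and otherwise get moved, so only the working copy inside another_func is a guaranteed copy. For a vector with inline storage, though, a move costs about as much as a copy, so this only shows where the temporaries can appear, not what ublas actually does.

```cpp
#include <array>
#include <cstddef>

// Toy stand-in for a small fixed-size vector that counts how often it is
// copied or moved. Hypothetical type, not part of ublas.
struct counted_vec {
    std::array<double, 5> data{};
    static int copies;
    static int moves;

    counted_vec() = default;
    counted_vec(const counted_vec& o) : data(o.data) { ++copies; }
    counted_vec(counted_vec&& o) noexcept : data(o.data) { ++moves; }
    counted_vec& operator=(const counted_vec&) = default;
};
int counted_vec::copies = 0;
int counted_vec::moves = 0;

// Stand-in for stack_vec: builds a named local and returns it by value.
counted_vec stack(double a, double b) {
    counted_vec result;
    result.data[0] = a;
    result.data[1] = b;
    return result;                 // RVO candidate; otherwise moved
}

// Stand-in for another_func: takes its own working copy of the input.
counted_vec another_func(const counted_vec& v) {
    counted_vec out = v;           // the one explicit copy in the chain
    for (double& x : out.data) x *= 2.0;
    return out;                    // RVO candidate; otherwise moved
}

counted_vec my_func(double a, double b) {
    return another_func(stack(a, b));
}
```

After counted_vec r = my_func(1.0, 2.0), counted_vec::copies is 1, while counted_vec::moves depends on how much the compiler elides.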


I have never really figured out the ublas expression templates or their limitations, but I was wondering whether it is possible to return an expression template to eliminate at least some of this.  Any idea of how it would work, or whether it is a good idea?  Is it hard?
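To show the kind of thing I mean, here is a rough sketch written with plain std::array rather than ublas types. stacked_view, stack_lazy and sum_of_squares are names I made up for illustration. A real ublas version would presumably have to derive from vector_expression and provide iterators, which I have not attempted.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical lazy "stacked" view over two fixed-size vectors.
// It stores references only, so it must not outlive v1 or v2.
template<class T, std::size_t N, std::size_t M>
class stacked_view {
public:
    stacked_view(const std::array<T, N>& v1, const std::array<T, M>& v2)
        : v1_(v1), v2_(v2) {}

    static constexpr std::size_t size() { return N + M; }

    // Element i comes from v1 for i < N, otherwise from v2.
    const T& operator[](std::size_t i) const {
        assert(i < N + M);
        return i < N ? v1_[i] : v2_[i - N];
    }

private:
    const std::array<T, N>& v1_;
    const std::array<T, M>& v2_;
};

// Returns the view instead of a freshly built vector of size N + M.
template<class T, std::size_t N, std::size_t M>
stacked_view<T, N, M> stack_lazy(const std::array<T, N>& v1,
                                 const std::array<T, M>& v2) {
    return stacked_view<T, N, M>(v1, v2);
}

// Example consumer: reads the view element by element, so no stacked
// copy of the data is ever made.
template<class Expr>
double sum_of_squares(const Expr& e) {
    double s = 0.0;
    for (std::size_t i = 0; i < e.size(); ++i) s += e[i] * e[i];
    return s;
}
```

With a std::array<double, 2> a and a std::array<double, 3> b, sum_of_squares(stack_lazy(a, b)) reads all five elements without building a stacked vector. The catch is that the view only holds references, so returning it from a function whose operands are locals would dangle.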

(I am looking for something similar for matrix stacking, but that is for another day.)


Thanks,
Jesse