Subject: Re: [ublas] fixed size vector in boost::numeric::ublas?
From: Miguel Lopes Santos Ramos (miguel_at_[hidden])
Date: 2010-10-24 07:59:04


Hi all, and sorry to raise this thread from the dead again.

I don't know if a consensus was already reached about the design of
fixed_vector and fixed_array. I don't read the list often enough, but I
would like to comment now.

I think this matter is difficult, and the present c_vector/c_matrix
classes are proof of that: they don't meet the requirements that we are
now seeking, their names mislead people into thinking that they do, and
they are redundant given bounded_vector. I too think they should be
deprecated.

About the requirements discussed:
- I don't think that being able to use reinterpret_cast from a
fixed_vector or fixed_matrix to C-style is the most desirable feature, even
though we would have that. For the purposes discussed, I think a better
feature would be to have a sort of matrix_adaptor class that would allow us
to treat an existing C-style matrix as a matrix_expression.

- I don't think having a fourth template argument on matrix is the best
solution. That fourth argument wouldn't exist to satisfy the user but to
assist the implementor in sharing code with matrix. It could also get in
the way of a future need to add a fourth template argument for something
more useful which we don't see right now. I think sharing code would be
better achieved by the other design proposed: having a dense_matrix_base
class between matrix_container and matrix.

- The c_vector and c_matrix approach of not using a storage array seems
simpler to me than the other approach discussed, which uses a fixed_array
as the storage argument to vector and matrix. That approach doesn't work
anyway as a way of "automatically" getting fixed_vector and fixed_matrix.
So, I agree with most of Nasos' issues.

- The new fixed_matrix can have a bounded_vector as a temporary vector and
not fixed_vector, so as to address issues with projections.

- I would add another requirement, that creating new matrix and vector
classes should be easier, and I think that would also be satisfied by
having those dense_vector_base and dense_matrix_base classes.
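
To make the first, second and last points more concrete, here is a minimal
sketch of what I have in mind. It is not uBLAS code: dense_matrix_base,
fixed_matrix and c_matrix_adaptor are hypothetical names, and fill() and
sum() merely stand in for whatever code would really be shared. The base
class is written once, and both an owning fixed-size matrix and a
non-owning view of an existing C-style matrix reuse it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical CRTP base holding the code shared by dense matrix types.
// Derived must provide size1(), size2() and operator()(i, j).
template <class Derived, class T>
class dense_matrix_base {
public:
    void fill(const T& value) {
        Derived& d = static_cast<Derived&>(*this);
        for (std::size_t i = 0; i < d.size1(); ++i)
            for (std::size_t j = 0; j < d.size2(); ++j)
                d(i, j) = value;
    }
    T sum() const {
        const Derived& d = static_cast<const Derived&>(*this);
        T s = T();
        for (std::size_t i = 0; i < d.size1(); ++i)
            for (std::size_t j = 0; j < d.size2(); ++j)
                s += d(i, j);
        return s;
    }
};

// Hypothetical fixed-size matrix: owns its storage, sizes are compile-time.
template <class T, std::size_t M, std::size_t N>
class fixed_matrix : public dense_matrix_base<fixed_matrix<T, M, N>, T> {
public:
    static constexpr std::size_t size1() { return M; }
    static constexpr std::size_t size2() { return N; }
    T& operator()(std::size_t i, std::size_t j) {
        assert(i < M && j < N);
        return data_[i][j];
    }
    const T& operator()(std::size_t i, std::size_t j) const {
        assert(i < M && j < N);
        return data_[i][j];
    }
private:
    T data_[M][N] = {};  // value-initialized, no heap allocation
};

// Hypothetical adaptor: a non-owning view of an existing C-style matrix.
template <class T, std::size_t M, std::size_t N>
class c_matrix_adaptor : public dense_matrix_base<c_matrix_adaptor<T, M, N>, T> {
public:
    explicit c_matrix_adaptor(T (&a)[M][N]) : a_(a) {}
    static constexpr std::size_t size1() { return M; }
    static constexpr std::size_t size2() { return N; }
    T& operator()(std::size_t i, std::size_t j) {
        assert(i < M && j < N);
        return a_[i][j];
    }
    const T& operator()(std::size_t i, std::size_t j) const {
        assert(i < M && j < N);
        return a_[i][j];
    }
private:
    T (&a_)[M][N];  // refers to the caller's array, never copies it
};
```

With something like this, a float raw[2][3] can be wrapped as
c_matrix_adaptor<float, 2, 3> view(raw) and written through view(i, j)
without any reinterpret_cast, and a new dense type only has to supply its
sizes and element access.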

Explaining myself: I worked with CG and 3D geometry mostly from 2000
to 2005. Back then I used my own vector<scalar_type,N> and
array<scalar_type,M,N> classes, which were simple: there was no inheritance
and no expression templates. One thing I did have, though, was a
specialization of these for vector<float,4> and array<float,4,4> that
used a 16-byte aligned XMM register type and specialized all operations,
resulting in vectorized SSE1 code which was as good as hand-coded assembly
(a 10-20X performance increase compared to the generic code).

From 2006 until now, I have worked with more general geometry and have
translated lots of MATLAB code to C++, so I have been using uBLAS and
lapack, which are much better and more general, and I don't want to go
back. Compilers have started to do automatic vectorization of code, so I
left that up to the compiler and forgot about it (or never had the time to
look into it).

However, I recently had the time again and looked into the SSE vectorized
code that gcc generates. The code generated for c_vector and c_matrix is
no better than the code generated for matrix and vector, and it is worse
than the code generated for float[M][N]. The reason it is worse is exactly
that the vector is dynamically sized, not statically sized. All cases are
vectorized, but the code is much smaller and simpler if the size is known
at compile time, because there are fewer possibilities.
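
A small sketch of the difference, using plain functions rather than any
uBLAS type (dot_dynamic and dot_fixed are names made up for this example).
The two loops are the same, but only the second one tells the compiler the
trip count, which is what lets it drop the general-case handling for sizes
that are not a multiple of the SIMD width.

```cpp
#include <cassert>
#include <cstddef>

// Size known only at run time: the generated code has to cope with any n,
// including sizes that are not a multiple of the SIMD width.
float dot_dynamic(const float* a, const float* b, std::size_t n) {
    assert(n == 0 || (a != nullptr && b != nullptr));
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        s += a[i] * b[i];
    return s;
}

// Size is a template parameter: the trip count is a compile-time constant,
// so the compiler is free to unroll the loop completely.
template <std::size_t N>
float dot_fixed(const float (&a)[N], const float (&b)[N]) {
    float s = 0.0f;
    for (std::size_t i = 0; i < N; ++i)
        s += a[i] * b[i];
    return s;
}
```

Comparing the assembly of the two (for example with gcc -O3 -S) is the
easiest way to see the effect on a given compiler; what is actually
generated depends on the compiler version and flags.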

Now, I would still want to define another fixed_vector and fixed_matrix
for myself (maybe this could also be of use to others, I don't know),
because the code generated even for float[M][N] is still a lot worse than
the code I had with my 2000 to 2005 library. The difference is that even
for float[M][N] the compiler doesn't know whether the data is properly
aligned for SSE, and it does not keep intermediate results in XMM
registers. Typically, one sees the compiler using only xmm0 and xmm1 and
none of the other 6 registers. With my old library, a lot more registers
would be used.

Now, I could get the same code quality as my old library if I specialized
fixed_vector and fixed_matrix. However, their sizeof would change and their
alignment requirements too. So, a better approach would be to define new
classes called vector_register and matrix_register, 100% equal to
fixed_vector and fixed_matrix in the general case, with the mentioned
specializations for the particular cases. The difference would be that
the programmer wouldn't rely on their sizeof or store them in external
storage, etc. They would only be usable for intermediate results and
function arguments.
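
A rough sketch of that idea follows. It is only an illustration: neither
class exists in uBLAS, the members are made up for the example, and only
operator+= is shown. The general vector_register is laid out like
fixed_vector, while the <float, 4> specialization asks for 16-byte
alignment and uses SSE intrinsics where the compiler defines __SSE__,
falling back to a scalar loop elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

// Hypothetical general-purpose fixed-size vector: plain array layout, so
// code may rely on its sizeof and store it in external storage.
template <class T, std::size_t N>
struct fixed_vector {
    T data[N];
};

// Hypothetical register-oriented vector: same as fixed_vector in general.
template <class T, std::size_t N>
struct vector_register {
    T data[N];
    vector_register& operator+=(const vector_register& o) {
        for (std::size_t i = 0; i < N; ++i) data[i] += o.data[i];
        return *this;
    }
};

// Specialization for four floats: 16-byte aligned so that it maps onto one
// XMM register and aligned loads and stores are valid.
template <>
struct alignas(16) vector_register<float, 4> {
    float data[4];
    vector_register& operator+=(const vector_register& o) {
#if defined(__SSE__)
        assert(reinterpret_cast<std::uintptr_t>(data) % 16 == 0);
        _mm_store_ps(data, _mm_add_ps(_mm_load_ps(data), _mm_load_ps(o.data)));
#else
        for (std::size_t i = 0; i < 4; ++i) data[i] += o.data[i];  // scalar fallback
#endif
        return *this;
    }
};

// The layout guarantees differ, which is why the two types are kept apart.
static_assert(sizeof(fixed_vector<float, 4>) == 4 * sizeof(float), "plain array layout");
static_assert(alignof(fixed_vector<float, 4>) == alignof(float), "no extra alignment");
static_assert(alignof(vector_register<float, 4>) == 16, "aligned for SSE loads");
```

The static_asserts show the point about alignment: specializing
fixed_vector<float, 4> itself in this way would silently change its
alignment for every existing user, whereas a separate vector_register
type makes no promise about layout in the first place.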

I would also be glad to contribute those vector_register and
matrix_register classes if I am able to do it now with expression types...
I would also point out for people working with 3D geometry out there that
if a fixed_vector/fixed_matrix is available, and therefore the compiler
vectorization improves a little, then it is unlikely that optimizations
concerning homogeneous coordinates are worthwhile. They will probably
generate worse code (I've seen it before).

And also my two cents in response to Nasos Iliopoulos,

> From: Nasos Iliopoulos <nasos_i_at_[hidden]>
> Date: Wed, 4 Aug 2010 10:34:09 -0400
> Subject: Re: [ublas] fixed size vector in boost::numeric::ublas?
>
> Me neither, I don't see any benefits (size and performance wise) using
> the bounded types. Unless we are talking about some exotically
> specialised requirements I don't find they should be around. If the
> fixed containers are in I also believe that the new documentation should
> spend the least amount of lines on them as they are very confusing to
> the beginners. What do other people think?

I think bounded types have a lot of uses even if they don't generate
similar code. They may be of use in a scenario where memory fragmentation
is very relevant, yielding some performance gains. They are PODs and are
much more readily usable in scenarios where serialization is involved.

It's true they may confuse the user, but the strategy of spending the
least amount of lines of documentation on them seems bad.
Frankly, I'm tired of discovering features, or working out how to use
something, by reading the header files of uBLAS.
It's not difficult to include a paragraph saying "This vector type is
intended for usage in very specialized scenarios. If in doubt use vector."

> > If the intent is to deprecate the existing c_vector and c_matrix
> > implementation and rename it if necessary, then that sounds great
> > to me.
>
> This is certainly a good option:
> 0. Deprecate c_vector and c_matrix for some time.
> 1. Copy their implementation into fixed_xxxx.
> 2. remove size member.
> 4. Merge Markus implementation in this implementation.
> 5. Reiterate the issues with size checking.

I agree on all points, and would vote for sharing code with
vector/matrix by adding an intermediate class between *_container and
*, which would also help users add more vector/matrix types of their
own.

Greetings to all,

-- 
Miguel Ramos <miguel_at_[hidden]>
PGP A006A14C