Hi Nasos,

a partial answer to the items of your question:

The main performance difference between Eigen and uBlas is that Eigen performs explicit vectorization: the operations are mapped onto special registers on which a single instruction operates on several data elements at once (SIMD).

Modern CPUs have 128-bit SIMD registers, allowing 4 floats or 2 doubles to be stored (and operated upon) simultaneously. This means that the performance gain for double precision (which is probably what you want if you are using a linear algebra library for anything other than graphics programming) is at most about 2. That is the most that should be expected from Eigen over uBlas through vectorization, but this is not the full story.
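To make the arithmetic above concrete, here is a small sketch in plain C++. The names simd_lanes and ideal_simd_speedup are made up for illustration and are not part of Eigen or uBlas; the 128-bit default is simply the register width quoted above.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Number of elements of type T that fit in one SIMD register.
// The 128-bit default is the register width assumed in the text.
template <typename T>
constexpr std::size_t simd_lanes(std::size_t register_bits = 128) {
    return register_bits / (CHAR_BIT * sizeof(T));
}

// Upper bound on the gain from vectorization alone: one instruction
// handles simd_lanes<T>() elements instead of one. This counts lanes
// only and ignores every other factor that affects real timings.
template <typename T>
double ideal_simd_speedup(std::size_t register_bits = 128) {
    assert(register_bits >= CHAR_BIT * sizeof(T));  // at least one element must fit
    return static_cast<double>(simd_lanes<T>(register_bits));
}
```

On a platform with 4-byte floats and 8-byte doubles, simd_lanes<float>() gives 4 and simd_lanes<double>() gives 2, which is where the "at most about 2" figure for double precision comes from.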

Thanks for the detailed explanation, and for the very nice pointer to the GCC pages.

What are they? Could you share them with me, as far as you know? For the moment, what I would like to do is test my prototype codes, which are basically in MATLAB or Octave, in a fast language where I can check the real CPU time on large models. I have had experience with uBlas and dug a lot through the mailing list to find some important undocumented information. I would not like to do the same with Eigen, which is the reason I am cautious on the topic. As far as I can see, the sparse module of Eigen is also not that mature, and you even have to define a preprocessor macro to get around this before you can use the sparse module.

Please also be aware that there are things that can be done to improve the performance of certain uBlas algorithms that are not optimal at the moment.

I am guessing that uBlas is not that drastically slow in comparison to Eigen, especially for large matrix sizes, because the benchmarks on the Eigen page are for relatively small matrix-vector products. However, I will run a test myself shortly.
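The kind of test I have in mind could look roughly like the sketch below. It is plain C++ with a hand-written dense matrix-vector product, not the uBlas or Eigen API; matvec and time_matvec are hypothetical names, and the library call under test would replace the inner product. It uses std::clock, so it reports CPU time rather than wall-clock time.

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>
#include <vector>

// y = A * x for a dense n x n matrix stored row-major in a flat vector.
std::vector<double> matvec(const std::vector<double>& a,
                           const std::vector<double>& x) {
    const std::size_t n = x.size();
    assert(a.size() == n * n);
    std::vector<double> y(n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        double sum = 0.0;
        for (std::size_t j = 0; j < n; ++j) {
            sum += a[i * n + j] * x[j];
        }
        y[i] = sum;
    }
    return y;
}

// CPU seconds spent on `repeats` products of size n (all entries are 1.0).
double time_matvec(std::size_t n, int repeats) {
    assert(n > 0 && repeats > 0);
    const std::vector<double> a(n * n, 1.0);
    const std::vector<double> x(n, 1.0);
    double checksum = 0.0;  // uses the result so the loop is not optimized away
    const std::clock_t start = std::clock();
    for (int r = 0; r < repeats; ++r) {
        checksum += matvec(a, x)[0];
    }
    const std::clock_t stop = std::clock();
    // Each y[0] is a sum of n ones, so the checksum must be n * repeats.
    assert(checksum == static_cast<double>(n) * repeats);
    return static_cast<double>(stop - start) / CLOCKS_PER_SEC;
}
```

Calling time_matvec for a range of n, from small to large, would show whether the gap seen in the small-size benchmarks persists at the sizes I care about.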

Thanks and best regards,

Umut