Subject: Re: [ublas] SSE support
From: Nasos Iliopoulos (nasos_i_at_[hidden])
Date: 2011-03-28 11:32:33
Hello all,
I think that introducing OpenMP should be fairly straightforward: tweak the dispatching loops in the functional.hpp functors (and wherever else loops exist) to use #pragma omp for directives instead of while or plain for loops. I wish I had time to implement it (I may attempt to post a test, though).
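To make the idea concrete, here is a minimal sketch of what such a loop could look like. It is not uBlas code: axpy_omp is a hypothetical helper on std::vector, and the real dispatching loops in functional.hpp are structured differently. It assumes a compiler with OpenMP enabled (e.g. -fopenmp on gcc); without that flag the pragma is ignored and the loop simply runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of uBlas): computes y += a * x element-wise.
// Each iteration is independent, so the loop can be split across threads.
inline void axpy_omp(double a, const std::vector<double>& x, std::vector<double>& y)
{
    assert(x.size() == y.size());

    // A signed loop index keeps the loop valid for older OpenMP versions,
    // which do not accept unsigned index types.
    const long n = static_cast<long>(x.size());

    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

For small vectors the thread start-up cost would likely outweigh any gain, so a real implementation would probably need a size threshold before going parallel.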
As far as SSE is concerned, there may be an easy way with auto-vectorization, but the hard part would be getting the compiler to unwind the uBlas expression template structure. In hand-made tests I have seen auto-vectorization and OpenMP working nicely on uBlas containers and giving better performance than Eigen3 (basically because Eigen3 is not using OpenMP efficiently at the moment). I may post those results in the near future along with some source code. Unfortunately, the whole auto-vectorization feature is very compiler specific, and I find that gcc is easier to get running with it than MSVC or icpc. Another issue with auto-vectorization is type alignment. For more info look at:
http://gcc.gnu.org/projects/tree-ssa/vectorization.html
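As an illustration of the kind of loop the vectorizer handles well, here is a minimal sketch on raw arrays. Nothing in it comes from uBlas: add_arrays, is_aligned and the SKETCH_RESTRICT macro are hypothetical names. On gcc, vectorization is enabled at -O3 (or with -ftree-vectorize), and the compiler can report which loops it vectorized (-ftree-vectorizer-verbose=N on the gcc versions current at the time of writing, -fopt-info-vec on later ones).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// The restrict qualifier is a compiler extension in C++, so wrap it in a
// hypothetical macro that degrades to nothing on unknown compilers.
#if defined(__GNUC__)
#  define SKETCH_RESTRICT __restrict__
#elif defined(_MSC_VER)
#  define SKETCH_RESTRICT __restrict
#else
#  define SKETCH_RESTRICT
#endif

// Returns true if p sits on an alignment-byte boundary (16 for SSE loads).
inline bool is_aligned(const void* p, std::size_t alignment)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical kernel: c[i] = a[i] + b[i].
// Unit stride, no aliasing (promised via restrict) and no branches in the
// body are what make this loop an easy target for the auto-vectorizer.
inline void add_arrays(const float* SKETCH_RESTRICT a,
                       const float* SKETCH_RESTRICT b,
                       float* SKETCH_RESTRICT c,
                       std::size_t n)
{
    assert(a && b && c);
    for (std::size_t i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}
```

If the buffers are not 16-byte aligned, the compiler typically has to fall back to unaligned loads or peel a few iterations off the front of the loop, which is where the alignment issue mentioned above shows up.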
Furthermore, I have seen some attempts in functional.hpp that are probably meant to enable SIMD auto-vectorization by providing compiler-friendly syntax (check the BOOST_UBLAS_USE_SIMD define), but I have never tried to see how this works or whether it actually boosts performance.
Finally, I would agree with David that a more permanent solution would probably be Boost::SIMD.
Best,
Nasos
On Mar 24, 2011, at 6:14 AM, David Bellot wrote:
> not for the moment.
> In fact, there is a GSoC project for porting part of NT2 to Boost. It will be something like Boost::SIMD I think.
> So I think it would be better to integrate this future library into ublas rather than having our own implementation.
> The reason is that those guys at NT2 already have rock-solid vector implementations running on multiple architectures (SSE, Altivec, Cell processor and I think ARM). So the benefit would be immediate for us.
>
> However, if you plan to do something with OpenMP, that would be great. Eigen has it (or some sort of multi-core capabilities).
> We need that too. I would start by looking at the GNU parallel STL implementation to have an idea.
>
> Any suggestions ?
>
> Cheers,
> David
> ____________________
> David Bellot, PhD
> http://david.bellot.free.fr
> http://ai-owl.blogspot.com
>
>
>
> On Thu, Mar 24, 2011 at 10:39, Philipp Kraus <philipp.kraus_at_[hidden]> wrote:
> Hello,
>
> Do the ublas components use SSE (Streaming SIMD Extensions)? If yes, which version? Or must / should I create my own support class?
>
>
> Thanks
>
> Phil
>
>
> _______________________________________________
> ublas mailing list
> ublas_at_[hidden]
> http://lists.boost.org/mailman/listinfo.cgi/ublas
> Sent to: david.bellot_at_[hidden]