Subject: Re: [ublas] Matrix multiplication performance
From: Michael Lehn (michael.lehn_at_[hidden])
Date: 2016-01-27 20:31:31
On 28 Jan 2016, at 01:04, Joaquim Duran Comas <jdurancomas_at_[hidden]> wrote:
> If explicit SIMD should not be used for now, then you should help the compiler generate more optimized code by aligning the buffers properly.
>
> There is the boost.align library, which provides an aligned allocator (http://www.boost.org/doc/libs/1_60_0/doc/html/align.html)
That's good to know. The functions aligned_alloc and aligned_free can replace these functions:
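#include <algorithm>   // std::max
#include <cstddef>     // std::size_t
#include <cstdint>     // std::uintptr_t
#include <cstdlib>     // std::malloc, std::free

// alignment must be a power of two
void *
malloc_(std::size_t alignment, std::size_t size)
{
    // Reserve extra room so the original pointer fits in front of the block
    alignment = std::max(alignment, alignof(void *));
    size += alignment;
    void *ptr = std::malloc(size);
    if (!ptr) {
        return nullptr;
    }
    // Next multiple of alignment strictly above ptr
    void *ptr2 = (void *)(((std::uintptr_t)ptr + alignment) & ~(alignment-1));
    // Stash the pointer returned by malloc just below the aligned block
    void **vp = (void**) ptr2 - 1;
    *vp = ptr;
    return ptr2;
}

void
free_(void *ptr)
{
    // Recover the stashed pointer; ptr must come from malloc_ and be non-null
    std::free(*((void**)ptr-1));
}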
This really should have gone into C++11.