Ublas :

Subject: Re: [ublas] Matrix multiplication performance
From: Michael Lehn (michael.lehn_at_[hidden])
Date: 2016-01-27 20:31:31


On 28 Jan 2016, at 01:04, Joaquim Duran Comas <jdurancomas_at_[hidden]> wrote:

> If explicit simd should not be used, by now, then you should help the compiler to generate more optimized code, by aligning properly the buffers.
>
> There is the boost.align library, which provides an aligned allocator (http://www.boost.org/doc/libs/1_60_0/doc/html/align.html)

That's good to know. The functions aligned_alloc and aligned_free from Boost.Align can replace the following functions:

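#include <algorithm>   // std::max
#include <cstddef>     // std::size_t
#include <cstdint>     // std::uintptr_t
#include <cstdlib>     // std::malloc, std::free

// Returns size bytes aligned to alignment (a power of two), or nullptr on failure.
void *
malloc_(std::size_t alignment, std::size_t size)
{
    alignment = std::max(alignment, alignof(void *));
    size += alignment;

    void *ptr = std::malloc(size);
    if (!ptr) {
        return nullptr;
    }
    // Smallest multiple of alignment strictly greater than ptr; this always
    // leaves room for one pointer in front of ptr2.
    void *ptr2 = (void *)(((std::uintptr_t)ptr + alignment) & ~(alignment-1));
    void **vp = (void**) ptr2 - 1;
    *vp = ptr;    // remember the address that std::free expects
    return ptr2;
}

// Releases memory obtained from malloc_; a null pointer is ignored.
void
free_(void *ptr)
{
    if (ptr) {
        std::free(*((void**)ptr-1));
    }
}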

This really should have gone into C++11.
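
For illustration only, here is a minimal sketch of how the malloc_/free_ pair could back a buffer of doubles. The two functions are repeated in compact form so the snippet compiles on its own. The names is_aligned, FreeAligned, AlignedBuffer and make_buffer are hypothetical helpers invented for this sketch, not part of uBLAS or Boost.Align, and the alignment is assumed to be a power of two.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>

// malloc_/free_ in compact form; alignment is assumed to be a power of two.
void *malloc_(std::size_t alignment, std::size_t size)
{
    alignment = std::max(alignment, alignof(void *));
    void *ptr = std::malloc(size + alignment);
    if (!ptr) {
        return nullptr;
    }
    std::uintptr_t addr = (reinterpret_cast<std::uintptr_t>(ptr) + alignment)
                        & ~(static_cast<std::uintptr_t>(alignment) - 1);
    void **vp = reinterpret_cast<void **>(addr) - 1;
    *vp = ptr;                       // stash the pointer std::free expects
    return reinterpret_cast<void *>(addr);
}

void free_(void *ptr)
{
    if (ptr) {
        std::free(*(static_cast<void **>(ptr) - 1));
    }
}

// Hypothetical helper: true if the address of p is a multiple of alignment.
bool is_aligned(const void *p, std::size_t alignment)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical RAII wrapper: the deleter routes through free_, so the
// buffer is never handed to std::free or delete[] directly.
struct FreeAligned {
    void operator()(double *p) const { free_(p); }
};
using AlignedBuffer = std::unique_ptr<double[], FreeAligned>;

// Allocates n doubles aligned to alignment; the result is empty on failure.
AlignedBuffer make_buffer(std::size_t alignment, std::size_t n)
{
    return AlignedBuffer(
        static_cast<double *>(malloc_(alignment, n * sizeof(double))));
}
```

A caller would write something like make_buffer(64, 128) and could check the result with is_aligned(buf.get(), 64). The point of the wrapper is only that memory from malloc_ must always be released through free_, since the pointer handed out is not the one std::malloc returned.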