Subject: Re: [boost] Review request for Boost.Align
From: Andrey Semashev (andrey.semashev_at_[hidden])
Date: 2014-02-20 09:51:40


On Thu, Feb 20, 2014 at 5:35 PM, Peter Dimov <lists_at_[hidden]> wrote:
> DUPUIS Etienne wrote:
>>
>> I second Andrey, whatever the type T is, it would be nice to have an easy
>> way of specifying a stronger alignment constraint. This is particularly
>> useful to allocate for example buffers of uint8_t on 16-byte or 64-byte
>> boundaries for performance reasons.
>
>
> You're perhaps missing a certain subtlety here and it is that the
> container/algorithm taking the allocator<T> may allocate objects other than
> T. It does this via rebind<>. And the question then becomes, do you want
> this 64 byte boundary to also apply to these additional allocations? The
> answer is often 'no' - you don't want deque<T>'s bookkeeping structures to
> be overaligned - but sometimes it's 'yes', if you passed allocator<float>
> but the function actually uses allocator<char>. And sometimes, as with
> list<T>, the answer is non-binary.

Yes, that is true. However, when I specify alignment in the allocator,
what I explicitly care about is alignment of the elements. For the
most part I don't care what alignment the container uses for its
internal structures, although I realize that this probably affects
memory overhead. This can be perceived as a shortcoming of the current
container interface: you can only specify one allocator, the one
that is "supposed" to be used for the elements, and the container uses
it for other purposes as well behind your back. Luckily, most
containers only allocate structures that embed elements, so the
alignment is justified. One notable exception is unordered containers:
these also have to allocate the bucket list, which need not be
aligned as strictly as the elements.
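
To make the rebind point concrete, here is a minimal sketch of an instrumented allocator that records the size and alignment of the type it is actually asked to allocate. The names logging_allocator, alloc_record, alloc_log and largest_request_for_list_of_float are hypothetical and exist only for this illustration; the exact node layout is implementation-specific, so only the relative size is checked.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>
#include <vector>

// Hypothetical record of one allocate() call: size and alignment of the
// type the allocator was instantiated for at that point.
struct alloc_record { std::size_t size; std::size_t align; };

inline std::vector<alloc_record>& alloc_log() {
    static std::vector<alloc_record> log;
    return log;
}

// Minimal C++11 allocator; std::allocator_traits supplies rebind for it.
template <typename T>
struct logging_allocator {
    typedef T value_type;

    logging_allocator() noexcept {}
    template <typename U>
    logging_allocator(const logging_allocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        alloc_log().push_back({sizeof(T), alignof(T)});
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

template <typename T, typename U>
bool operator==(const logging_allocator<T>&, const logging_allocator<U>&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const logging_allocator<T>&, const logging_allocator<U>&) noexcept { return false; }

// Inserts one float into a std::list and returns the largest sizeof(T)
// that any rebound copy of the allocator was asked to allocate.
inline std::size_t largest_request_for_list_of_float() {
    alloc_log().clear();
    std::list<float, logging_allocator<float> > l;
    l.push_back(1.0f);
    assert(!alloc_log().empty());  // the list must have allocated something
    std::size_t largest = 0;
    for (const alloc_record& r : alloc_log())
        if (r.size > largest) largest = r.size;
    return largest;
}
```

The list is given logging_allocator<float>, yet the logged request is for a type larger than float: the list node, obtained through rebind. Any alignment policy keyed on the element type never sees float here.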

> If the required alignment is equal to alignof(T), it all works - structures
> having T as a member will automatically receive an alignment at least as
> strict, and functions using allocator<char> instead of allocator<T> will
> take the necessary steps to std::align the resulting pointer at alignof(T).

Yes.
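
The std::align step mentioned in the quoted text can be sketched as follows. This is only an illustration of carving alignof(T)-aligned storage out of a raw char buffer; carve_aligned and is_aligned are hypothetical helper names, and the returned pointer refers to uninitialized storage that would still need placement new.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical helper: find room for `count` objects of type T inside a
// raw char buffer. Returns a pointer to suitably aligned, uninitialized
// storage, or nullptr if the buffer is too small.
template <typename T>
T* carve_aligned(std::vector<char>& raw, std::size_t count) {
    assert(count > 0);
    void* p = raw.data();
    std::size_t space = raw.size();
    // std::align moves p forward to the next multiple of alignof(T) and
    // reduces space by the skipped bytes; it returns nullptr (leaving p
    // and space untouched) when the request does not fit.
    void* aligned = std::align(alignof(T), count * sizeof(T), p, space);
    return static_cast<T*>(aligned);
}

// True if p is a multiple of alignment.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

The caller has to over-allocate by up to alignof(T) - 1 bytes for this to succeed in the worst case, which is the cost a char-based allocation pays for not knowing T's alignment up front.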

> So if you allocate T = struct { char[64]; } alignas(64), it would avoid all
> these complications. This depends on the compiler providing a proper support
> for overaligned types. But that's needed for __m128 and __m256, so I'd
> expect it to be there.

I think current implementations (at least those I have worked with)
don't support this. I.e. if you call
std::allocator< __m128 >::allocate(n), you're basically doing
new __m128[n], and this does not align memory to 16 bytes. Well, it
does on x86_64 Linux/Windows/OS X, but simply because all memory
allocations are 16-byte aligned on those platforms, not because
alignof(__m128) == 16. I think with C++03 this was justified by the
fact that __m128 has a non-standard alignment, which is not covered by
the standard. I'm not sure whether this changed in C++11 with the
introduction of alignas.

> Not that aligned_allocator<T, A> would not be useful; it would be, as long
> as it works. There's no guarantee that it will in all cases though.

Exactly. Basically, when you use the aligned_allocator< __m128 > with
a container, you have no guarantee that aligned_allocator< __m128 >
will actually be used to allocate memory. I.e. if you want to allocate
__m128 elements aligned to 64 bytes and specify:

   template< typename T > struct my_alignment_of : alignment_of< T > {};
   template< > struct my_alignment_of< __m128 > : mpl::int_< 64 > {};

   std::list< __m128, aligned_allocator< __m128, my_alignment_of > > d;

this will likely not work because std::list won't use
my_alignment_of< __m128 > but instead some
my_alignment_of< list_node< __m128 > >. To make this work you'd have
to write a fake metafunction that always returns 64, and this is
equivalent to just specifying 64 in the aligned_allocator template
parameters, only more complicated.
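
For comparison, here is a minimal sketch of the simpler alternative: an allocator that takes the alignment as a plain number and carries it through rebind unchanged. fixed_aligned_allocator and vector_data_is_aligned are hypothetical names, not the interface of the library under review, and the over-allocating malloc scheme is just one assumed way to obtain aligned memory portably.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>
#include <new>
#include <type_traits>
#include <vector>

template <typename T, std::size_t Alignment>
struct fixed_aligned_allocator {
    static_assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0,
                  "Alignment must be a power of two");
    typedef T value_type;

    // Explicit rebind is required because of the non-type parameter; it
    // keeps the same numeric alignment for whatever type U is allocated.
    template <typename U>
    struct rebind { typedef fixed_aligned_allocator<U, Alignment> other; };

    fixed_aligned_allocator() noexcept {}
    template <typename U>
    fixed_aligned_allocator(const fixed_aligned_allocator<U, Alignment>&) noexcept {}

    T* allocate(std::size_t n) {
        // Never go below the natural alignment of T.
        const std::size_t align = Alignment > alignof(T) ? Alignment : alignof(T);
        const std::size_t bytes = n * sizeof(T);
        // Over-allocate: padding for alignment plus one slot that
        // remembers the pointer returned by malloc.
        void* raw = std::malloc(bytes + align + sizeof(void*));
        if (!raw) throw std::bad_alloc();
        void* p = static_cast<char*>(raw) + sizeof(void*);
        std::size_t space = bytes + align;
        void* aligned = std::align(align, bytes, p, space);
        assert(aligned != nullptr);  // cannot fail given the padding above
        static_cast<void**>(aligned)[-1] = raw;
        return static_cast<T*>(aligned);
    }

    void deallocate(T* p, std::size_t) noexcept {
        // Recover the original malloc pointer stored just before the block.
        std::free(static_cast<void**>(static_cast<void*>(p))[-1]);
    }
};

template <typename T, typename U, std::size_t A>
bool operator==(const fixed_aligned_allocator<T, A>&, const fixed_aligned_allocator<U, A>&) noexcept { return true; }
template <typename T, typename U, std::size_t A>
bool operator!=(const fixed_aligned_allocator<T, A>&, const fixed_aligned_allocator<U, A>&) noexcept { return false; }

// Rebinding float -> char keeps the 64-byte requirement.
static_assert(std::is_same<
                  std::allocator_traits<fixed_aligned_allocator<float, 64> >::rebind_alloc<char>,
                  fixed_aligned_allocator<char, 64> >::value,
              "rebind preserves the alignment parameter");

// Checks that a vector's element storage starts on a 64-byte boundary.
inline bool vector_data_is_aligned() {
    std::vector<float, fixed_aligned_allocator<float, 64> > v(100, 1.0f);
    return reinterpret_cast<std::uintptr_t>(v.data()) % 64 == 0;
}
```

Note that this aligns the start of each allocation. For a contiguous container that is the first element; for a node-based container such as std::list it is the node, and the element typically sits at some offset inside it, which is the non-binary case raised in the quoted text.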

