From: Andrey Semashev (andrey.semashev_at_[hidden])
Date: 2020-06-16 15:59:18
On 2020-06-16 18:04, Andrey Semashev wrote:
> On 2020-06-16 17:43, Hans Dembinski wrote:
>>> On 16. Jun 2020, at 16:41, Hans Dembinski <hans.dembinski_at_[hidden]>
>>> wrote:
>>>
>>>>
>>>> On 16. Jun 2020, at 16:32, Andrey Semashev via Boost
>>>> <boost_at_[hidden]> wrote:
>>>>
>>>> On 2020-06-16 16:45, Hans Dembinski via Boost wrote:
>>>>>> On 16. Jun 2020, at 14:53, Niall Douglas via Boost
>>>>>> <boost_at_[hidden]> wrote:
>>>>>>
>>>>>> On 15/06/2020 17:19, Phil Endecott via Boost wrote:
>>>>>>
>>>>>>> Are the only affected files the SIMD implementation, (c) Robert N
>>>>>>> Steagall? If so, can this be disabled (by default?) by the user
>>>>>>> to avoid the copyright notice requirement?
>>>>>>
>>>>>> If that's *the* Bob Steagall I'm almost certain he'll relicence
>>>>>> his code under Boost if you ask him.
>>>>> I think a multi-platform library like Boost should avoid using SIMD
>>>>> intrinsics and write the code so that the auto-vectorisers of the
>>>>> compilers understand it - if that is possible. I don't claim it is
>>>>> possible here, but I have a lot of trust in the auto-vectorisers of
>>>>> gcc and clang. Godbolt is very helpful in designing code that can
>>>>> be auto-vectorised well.
>>>>
>>>> In my experience, auto-vectorization is useless in all but trivial
>>>> cases.
>>>
>>> Care to give me an example?
>>
>> I don't know what you consider trivial, but these examples here
>> listed under "vectorizable" do not look trivial to me.
>>
>> https://gcc.gnu.org/projects/tree-ssa/vectorization.html
>
> The loops I see there are trivial. Any loop where you have two arrays
> and you apply an operation OP between elements of the arrays is trivial
> and actually not very often met in practice.
>
> In practice you often have:
>
> - Conditional expressions in loop body.
> - Non-trivial loop exit conditions.
> - Horizontal operations between elements of the same array.
> - Array elements more dense than int. (Auto-vectorizers tend to widen
> elements as per C++ promotion rules.)
> - Input data provided via pointers that may theoretically overlap.
> (Auto-vectorizers tend to be pessimistic about data overlap, unless
> persuaded by non-portable means.)
>
> Any of the above is typically poorly handled by auto-vectorizers, if at
> all. At least, that was my observation when I wrote code that I
> subsequently had to manually vectorize.
Here's a recent example of a loop that I wish would be vectorized:
https://github.com/boostorg/atomic/blob/develop/src/lock_pool.cpp#L179-L186
https://gcc.godbolt.org/z/CX3kDb
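
For illustration, the loop patterns listed in the quoted message can be
sketched in C++ as below. These are hypothetical examples written for this
purpose, not the loop from lock_pool.cpp linked above, and the function
names are made up. Whether a given compiler vectorizes any of them depends
on the compiler version and flags, so the generated assembly should be
inspected (e.g. on Godbolt) rather than assumed.

```cpp
#include <cstddef>
#include <cstdint>

// Non-trivial loop exit condition: the trip count depends on the data,
// because the loop returns as soon as a match is found.
std::size_t find_first(const std::uint8_t* p, std::size_t n, std::uint8_t value)
{
    for (std::size_t i = 0; i < n; ++i)
    {
        if (p[i] == value)
            return i;
    }
    return n; // not found
}

// Conditional expression in the loop body, on elements narrower than int.
// a[i] + b[i] is computed in int due to integral promotion. The pointers
// are not marked as non-overlapping, so the compiler must allow for dst
// aliasing a or b.
void saturating_add(std::uint8_t* dst, const std::uint8_t* a,
                    const std::uint8_t* b, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
    {
        unsigned int sum = static_cast<unsigned int>(a[i]) + b[i];
        dst[i] = static_cast<std::uint8_t>(sum > 255u ? 255u : sum);
    }
}

// Horizontal operation between elements of the same array: each element
// depends on the previous result (in-place prefix sum, wrapping on overflow).
void prefix_sum(std::uint16_t* p, std::size_t n)
{
    for (std::size_t i = 1; i < n; ++i)
        p[i] = static_cast<std::uint16_t>(p[i] + p[i - 1]);
}
```

Each function is small enough to paste into a compiler explorer and
compare the output at -O2 and -O3 against a hand-written SIMD version.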