From: Andrey Semashev (andrey.semashev_at_[hidden])
Date: 2020-06-16 15:04:44
On 2020-06-16 17:43, Hans Dembinski wrote:
>> On 16. Jun 2020, at 16:41, Hans Dembinski <hans.dembinski_at_[hidden]> wrote:
>>> On 16. Jun 2020, at 16:32, Andrey Semashev via Boost <boost_at_[hidden]> wrote:
>>> On 2020-06-16 16:45, Hans Dembinski via Boost wrote:
>>>>> On 16. Jun 2020, at 14:53, Niall Douglas via Boost <boost_at_[hidden]> wrote:
>>>>> On 15/06/2020 17:19, Phil Endecott via Boost wrote:
>>>>>> Are the only affected files the SIMD implementation, (c) Robert N
>>>>>> Steagall? If so, can this be disabled (by default?) by the user
>>>>>> to avoid the copyright notice requirement?
>>>>> If that's *the* Bob Steagall I'm almost certain he'll relicence his code
>>>>> under Boost if you ask him.
>>>> I think a multi-platform library like Boost should avoid using SIMD intrinsics and write the code so that the auto-vectorisers of the compilers understand it - if that is possible. I don't claim it is possible here, but I have a lot of trust in the auto-vectorisers of gcc and clang. Godbolt is very helpful in designing code that can be auto-vectorised well.
>>> In my experience, auto-vectorization is useless in all but trivial cases.
>> Care to give me an example?
> I don't know what you consider trivial, but these examples here
> listed under "vectorizable" do not look trivial to me.
The loops I see there are trivial. Any loop that takes two arrays and
applies an operation OP between their corresponding elements is trivial
and, in fact, not that common in practice.
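For illustration, a minimal sketch of the "two arrays, one operation"
pattern described above. The function name and signature are
hypothetical and are not taken from the examples linked in the thread.

```cpp
#include <cstddef>

// Hypothetical example: apply one operation (here, addition) between
// corresponding elements of two input arrays.
void add_arrays(const float* a, const float* b, float* out, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];  // no branches, no dependency between iterations
}
```

Each iteration is independent and the trip count is known on entry,
which is why loops of this shape are the usual showcase for
auto-vectorization.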
In practice you often have:
- Conditional expressions in loop body.
- Non-trivial loop exit conditions.
- Horizontal operations between elements of the same array.
- Array elements narrower than int. (Auto-vectorizers tend to widen
elements as per C++ promotion rules.)
- Input data provided via pointers that may theoretically overlap.
(Auto-vectorizers tend to be pessimistic about data overlap, unless
persuaded by non-portable means.)
Any of the above is typically poorly handled by auto-vectorizers, if at
all. At least, that was my observation when I wrote code that I
subsequently had to manually vectorize.
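As a hypothetical illustration of the patterns listed above, scalar
loops of the following shape exhibit them. The function names are made
up for this sketch and do not come from any code discussed in the
thread. Whether a particular compiler vectorizes them depends on its
version and flags.

```cpp
#include <cstddef>
#include <cstdint>

// Conditional in the loop body plus a data-dependent early exit:
// the trip count is not known before the loop starts.
std::size_t find_first_above(const std::uint8_t* p, std::size_t n,
                             std::uint8_t threshold)
{
    for (std::size_t i = 0; i < n; ++i)
        if (p[i] > threshold)
            return i;
    return n;  // not found
}

// Horizontal operation: adjacent elements of the same array are combined,
// so 2 * n_out inputs produce n_out outputs.
void pairwise_sum(const std::int16_t* in, std::int16_t* out, std::size_t n_out)
{
    for (std::size_t i = 0; i < n_out; ++i)
        out[i] = static_cast<std::int16_t>(in[2 * i] + in[2 * i + 1]);
}

// Elements narrower than int: the addition is performed in int because of
// integral promotion, then narrowed back to 8 bits. The three pointers are
// not declared non-overlapping, so the compiler must allow for aliasing.
void average_u8(const std::uint8_t* a, const std::uint8_t* b,
                std::uint8_t* out, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<std::uint8_t>((a[i] + b[i] + 1) >> 1);
}
```

Comparing the generated assembly for such functions (for example on
Godbolt, mentioned earlier in the thread) is one way to check what a
given compiler actually does with them.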
PS: I'm sending this to the ML.