Subject: Re: [boost] [boostcon12]Trouble with tuples very compiler dependent
From: Larry Evans (cppljevans_at_[hidden])
Date: 2012-11-19 13:55:22

On 11/19/12 12:11, Eric Niebler wrote:
> On 11/19/2012 8:38 AM, Larry Evans wrote:
>> On 11/19/12 10:19, Larry Evans wrote:
>> [snip]
>>> In contrast to the gcc4_8 compiler, with the clangxx compiler, the
>>> relative qualitative performance, is just the opposite. IOW, the
>>> bcon12_horizontal implementation is faster than the bcon12_vertical
>>> implementation. In fact, the rate of change of the performance
>>> difference accelerates as tree depth goes from 2 to 4. The rate of
>>> change is so stark that it suggests, at least to me, there may be some
>>> bug in clang. Of course that conclusion is based on almost no
>>> knowledge, on my part, of the clang implementation.
>>> The can be modified to filter out other parts
>>> of the benchmark run output, which is here:
>> For example, when the filter criteria restricts
>> TUPLE_UNROLL_MAX to 10 (the same as TUPLE_SIZE),
>> then, with compiler=clangxx, bcon12_vertical
>> performs relatively better than bcon12_horizontal
>> as TREE_DEPTH increases, as shown in the attached.
> All the measured times are below one second. Benchmarks become more
> meaningful when the thing being measured takes more than a few seconds
> to finish. If you don't mind, can you step up the limits? Once we have a
> few data points in the tens of seconds and minutes, we'll have a better
> idea of how the different compilers are performing.
> Thanks for doing these. It's very interesting.
OK. I just changed the runs to cover TREE_DEPTH=2 to 7 in steps of 1.
The run.txt and filt.txt files are attached.

AFAICT, vertical always does better than horizontal no matter
what the compiler; however, the difference is more dramatic for
clangxx. For gcc4_8, horizontal/vertical is about 2 at TREE_DEPTH=7.
OTOH, for clangxx, the ratio is about 16! I've no idea why.
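
For anyone who wants to recompute the horizontal/vertical ratios from
the attached timings, here is a minimal sketch. The names `timings` and
`ratio_by_depth` are hypothetical and not part of the benchmark sources;
it assumes the times for each implementation have already been collected
into a map keyed by TREE_DEPTH.

```cpp
#include <cassert>
#include <map>

// Hypothetical helper type, not part of the benchmark sources:
// maps TREE_DEPTH to a measured time in seconds.
using timings = std::map<int, double>;

// For every TREE_DEPTH present in both inputs, return
// (horizontal time) / (vertical time). A value above 1 means the
// vertical implementation was faster at that depth.
inline timings ratio_by_depth(timings const& horizontal,
                              timings const& vertical)
{
    timings ratios;
    for (auto const& entry : horizontal)
    {
        auto found = vertical.find(entry.first);
        if (found == vertical.end())
            continue; // depth not measured for vertical; skip it
        assert(found->second > 0.0 && "vertical time must be positive");
        ratios[entry.first] = entry.second / found->second;
    }
    return ratios;
}
```

Calling it once per compiler with the TREE_DEPTH rows from filt.txt
should give ratios comparable to the ones quoted above, assuming the
same filter settings.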

