From: Markus Schöpflin (markus.schoepflin_at_[hidden])
Date: 2007-10-19 10:40:40


Boris Gubenko wrote:
> Markus Schoepflin wrote:

[...]

>> Do you think I should change it to use __ATOMIC_EXCH_LONG?
>
> I do. To see the difference, you can compare the code generated for foo()
> and bar() in x.cxx below. Note ldl_l/stl_c in bar().
>
> x.cxx
> -----
> #include <machine/builtins.h>
> void foo (volatile int *mem, int val) { *mem = val; }

                   .globl __7foo__FPVii
                   .ent __7foo__FPVii
0000 __7foo__FPVii:
                   .frame $sp, 0, $26
                   .prologue 0
                   .context full
0000 trapb
0004 stl val, (r16)
0008 ret (r26)
                    .end __7foo__FPVii

Here the compiler generates a trap barrier followed by a store instruction.
According to chapter 5.2.2 of the Alpha architecture handbook, the access is
guaranteed to be performed as a single atomic operation.

> void bar (volatile int *mem, int val) { __ATOMIC_EXCH_LONG(mem, val); }

                    .globl __7bar__FPVii
                    .ent __7bar__FPVii
0010 __7bar__FPVii:
                    .frame $sp, 0, $26
                    .prologue 0
0010 L$2:
                    .context full
0010 mov val, r0
0014 ldl_l r1, (r16)
0018 stl_c r0, (r16)
001C unop
0020 beq r0, L$2
0024 ret (r26)

Here the compiler generates a 'load locked' / 'store conditional' sequence,
wrapped in a loop that repeats until the store has succeeded. I don't see
why this should give me any advantage over the plain store in foo(), when
all I want is an atomic store and I am not interested in the previous value.
Could you please tell me?
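
To make the distinction concrete, here is a minimal sketch in portable C++. It assumes std::atomic from C++11, which is not what x.cxx uses; the names store_only, exchange_value and demo are hypothetical, and the relaxed memory ordering is my assumption about the closest match to the two listings, not something the compiler documentation confirms.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical counterpart of foo(): an atomic store.
// The old contents of `mem` are never read.
void store_only(std::atomic<int>& mem, int val) {
    mem.store(val, std::memory_order_relaxed);
}

// Hypothetical counterpart of bar(): an atomic exchange.
// It writes `val` and returns the value that was there before,
// which is the only thing the plain store cannot provide.
int exchange_value(std::atomic<int>& mem, int val) {
    return mem.exchange(val, std::memory_order_relaxed);
}

// Small self-check exercising both helpers.
int demo() {
    std::atomic<int> m(1);
    store_only(m, 2);               // m becomes 2, nothing returned
    int old = exchange_value(m, 3); // m becomes 3, old value handed back
    assert(old == 2);
    return m.load(std::memory_order_relaxed);
}
```

If the return value of exchange_value() is discarded, the two helpers leave the same value in memory, which is exactly the case I am asking about.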

Also, I now have two more questions, which you can probably answer:

1) Why is a trap barrier created in the first case, but not in the second?

2) According to the Alpha architecture handbook, branch prediction predicts
backward branches to be taken, and it is recommended not to code the
load-locked/store-conditional retry loop as above. (See the documentation
for STx_C, chapter 4.2.5.) Is this no longer true?
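
For reference, the logical shape of the retry loop in question can be sketched in portable C++ as follows. This assumes C++11 std::atomic; exchange_with_retry and retry_demo are hypothetical names. The source code cannot dictate whether the compiler emits the retry as a backward or a forward branch, so this only illustrates the loop structure, not the instruction layout the handbook talks about.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: an exchange written as an explicit retry loop,
// mirroring the ldl_l / stl_c / beq sequence generated for bar().
// `retries`, if non-null, is incremented once per failed attempt.
int exchange_with_retry(std::atomic<int>& mem, int val,
                        unsigned* retries = nullptr) {
    int expected = mem.load(std::memory_order_relaxed);
    // compare_exchange_weak may fail spuriously, much like a failed
    // store conditional; on failure it reloads `expected` from `mem`.
    while (!mem.compare_exchange_weak(expected, val,
                                      std::memory_order_relaxed,
                                      std::memory_order_relaxed)) {
        if (retries) ++*retries;
    }
    return expected; // value that was replaced
}

// Small self-check: the loop ends with the new value stored
// and the old value returned, however many retries it took.
int retry_demo() {
    std::atomic<int> m(10);
    unsigned failed = 0;
    int old = exchange_with_retry(m, 20, &failed);
    assert(old == 10);
    return m.load(std::memory_order_relaxed);
}
```

In the common case the first attempt succeeds and the loop body never runs, which is why the direction of the retry branch matters for prediction.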

Thank you for your help,
Markus

