From: Alexander Terekhov (terekhov_at_[hidden])
Date: 2005-04-09 13:28:14
Alexander Terekhov wrote:
>
> Peter Dimov wrote:
> [...]
> > Are you sure that keeping a reservation alive for extended periods of time
> > does not incur a performance penalty, BTW?
>
> I've never heard that it can.
>
> > Could this be the reason for the preliminary technical note?
>
> I'm also puzzled.
Here's a snippet from the official "Programming Environments Manual"
referenced by that puzzling preliminary notice itself. It explicitly
mentions dangling lwarx instructions, says that the test-before-store
ordering that produces them helps forward progress, and shows yet
another dangling lwarx in the "better performance may be obtained"
illustration.
<quote>
E.1 General Information
The following points provide general information about the lwarx
and stwcx. instructions:
- It is acceptable to execute an lwarx instruction for which no
  stwcx. instruction is executed. Such a dangling lwarx instruction
  occurs in the example shown in Section E.2.5, "Test and Set," if
  the value loaded is not zero.

- To increase the likelihood that forward progress is made, it is
  important that looping on lwarx/stwcx. pairs be minimized. For
  example, in the sequence shown in Section E.2.5, "Test and Set,"
  this is achieved by testing the old value before attempting the
  store -- were the order reversed, more stwcx. instructions might
  be executed, and reservations might more often be lost between
  the lwarx and the stwcx. instructions.

- The manner in which lwarx and stwcx. are communicated to other
  processors and mechanisms, and between levels of the memory
  subsystem within a given processor, is implementation-dependent.

  In some implementations, performance may be improved by minimizing
  looping on an lwarx instruction that fails to return a desired
  value. For example, in the example provided in Section E.2.5,
  "Test and Set," if the program stays in the loop until the word
  loaded is zero, the programmer can change the "bne- $+12" to
  "bne- loop."

  In some implementations, better performance may be obtained by
  using an ordinary load instruction to do the initial checking of
  the value, as follows:
  loop: lwz    r5,0(r3)   #load the word
        cmpwi  r5,0       #loop back if word
        bne-   loop       #not equal to 0
        lwarx  r5,0,r3    #try again, reserving
        cmpwi  r5,0       #(likely to succeed)
        bne    loop       #try to store nonzero
        stwcx. r4,0,r3    #
        bne-   loop       #loop if lost reservation
[...]
E.2.5 Test and Set
This version of the test and set primitive atomically loads a word
from memory, ensures that the word in memory is a nonzero value,
and sets CR0[EQ] according to whether the value loaded is zero.
In this example, it is assumed that the address of the word to be
tested is in GPR3, the new value (nonzero) is in GPR4, and the old
value is returned in GPR5.
  loop: lwarx  r5,0,r3    #load and reserve
        cmpwi  r5, 0      #done if word
        bne    $+12       #not equal to 0
        stwcx. r4,0,r3    #try to store non-zero
        bne-   loop       #loop if lost reservation
</quote>
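For readers who think in C++ rather than PowerPC assembly, the two
quoted sequences can be sketched roughly as below. This is only an
illustration and not part of the manual: it assumes a C++11
std::atomic, which postdates this thread, and the names test_and_set
and spin_until_set are hypothetical. Relaxed ordering is used because
lwarx/stwcx. by themselves imply no memory barrier; a real lock would
need acquire/release semantics on top of this.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical analogue of E.2.5 "Test and Set": store new_value only
// if the word is zero, and return the value that was observed.
inline unsigned test_and_set(std::atomic<unsigned>& word, unsigned new_value)
{
    assert(new_value != 0);  // the manual assumes a nonzero new value
    unsigned old_value = 0;
    // On failure, compare_exchange_strong writes the observed (nonzero)
    // value into old_value and performs no store, which corresponds to
    // the dangling-lwarx path in the assembly.
    word.compare_exchange_strong(old_value, new_value,
                                 std::memory_order_relaxed);
    return old_value;
}

// Hypothetical analogue of the "ordinary load first" variant: spin on
// a plain load until the word looks zero, then attempt the atomic update.
inline void spin_until_set(std::atomic<unsigned>& word, unsigned new_value)
{
    assert(new_value != 0);
    for (;;) {
        // Corresponds to the lwz/cmpwi/bne- loop: no reservation is taken.
        while (word.load(std::memory_order_relaxed) != 0) {
        }
        unsigned expected = 0;
        // compare_exchange_weak may fail spuriously, much like a lost
        // reservation; in either failure case we go back to plain loads.
        if (word.compare_exchange_weak(expected, new_value,
                                       std::memory_order_relaxed)) {
            return;
        }
    }
}
```

On PowerPC a compiler would typically lower these compare-exchange
calls to an lwarx/stwcx. loop, so test_and_set returning a nonzero
value is the same situation the manual describes: the reserving load
ran and no store followed it.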
regards,
alexander.