From: Alexander Terekhov (terekhov_at_[hidden])
Date: 2005-04-09 13:28:14


Alexander Terekhov wrote:
>
> Peter Dimov wrote:
> [...]
> > Are you sure that keeping a reservation alive for extended periods of time
> > does not incur a performance penalty, BTW?
>
> I've never heard that it can.
>
> > Could this be the reason for the preliminary technical note?
>
> I'm also puzzled.

Here's a snippet from the official "Programming Environments Manual"
referenced by that puzzling preliminary notice itself. It explicitly
mentions dangling lwarx instructions, says that testing the loaded
value before attempting the store (which is what leaves the lwarx
dangling) helps forward progress, and shows yet another dangling lwarx
in its "better performance may be obtained" illustration.

<quote>

E.1 General Information

The following points provide general information about the lwarx
and stwcx. instructions:

- It is acceptable to execute an lwarx instruction for which no
  stwcx. instruction is executed. Such a dangling lwarx instruction
  occurs in the example shown in Section E.2.5, "Test and Set," if
  the value loaded is not zero.

- To increase the likelihood that forward progress is made, it is
  important that looping on lwarx/stwcx. pairs be minimized. For
  example, in the sequence shown in Section E.2.5, "Test and Set,"
  this is achieved by testing the old value before attempting the
  store -- were the order reversed, more stwcx. instructions might
  be executed, and reservations might more often be lost between
  the lwarx and the stwcx. instructions.

- The manner in which lwarx and stwcx. are communicated to other
  processors and mechanisms, and between levels of the memory
  subsystem within a given processor, is implementation-dependent.
  In some implementations, performance may be improved by minimizing
  looping on an lwarx instruction that fails to return a desired
  value. For example, in the example provided in Section E.2.5,
  "Test and Set," if the program stays in the loop until the word
  loaded is zero, the programmer can change the "bne- $+12" to "bne-
  loop."

  In some implementations, better performance may be obtained by
  using an ordinary load instruction to do the initial checking of
  the value, as follows:

loop: lwz r5,0(r3) #load the word
      cmpwi r5,0 #loop back if word
      bne- loop #not equal to 0
      lwarx r5,0,r3 #try again, reserving
      cmpwi r5,0 #(likely to succeed)
      bne loop #try to store nonzero
      stwcx. r4,0,r3 #
      bne- loop #loop if lost reservation

[...]

E.2.5 Test and Set

This version of the test and set primitive atomically loads a word
from memory, ensures that the word in memory is a nonzero value,
and sets CR0[EQ] according to whether the value loaded is zero.

In this example, it is assumed that the address of the word to be
tested is in GPR3, the new value (nonzero) is in GPR4, and the old
value is returned in GPR5.

loop: lwarx r5,0,r3 #load and reserve
      cmpwi r5, 0 #done if word
      bne $+12 #not equal to 0
      stwcx. r4,0,r3 #try to store non-zero
      bne- loop #loop if lost reservation

</quote>
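
For readers more at home in C++ than in PowerPC assembly, the two
quoted sequences can be sketched roughly as follows. This is only an
illustration under stated assumptions, not something taken from the
manual: it uses C++11 std::atomic, the names test_and_set and
spin_until_set are hypothetical, and the acquire ordering on success
is an assumption, since the quoted sequences contain no barrier
instructions. On PowerPC, compilers typically lower a compare-exchange
to an lwarx/stwcx. loop, so the failing path of test_and_set is
roughly where a dangling lwarx would show up.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical analogue of the E.2.5 "Test and Set" sequence.
// Stores `desired` (assumed nonzero) only if the word is currently zero,
// and returns the value that was observed, like GPR5 in the manual.
inline std::uint32_t test_and_set(std::atomic<std::uint32_t>& word,
                                  std::uint32_t desired)
{
    std::uint32_t expected = 0;
    // On failure, compare_exchange_strong writes the observed value into
    // `expected` and performs no store: the "dangling" case.
    // Memory orderings here are an assumption, not part of the quoted text.
    word.compare_exchange_strong(expected, desired,
                                 std::memory_order_acquire,
                                 std::memory_order_relaxed);
    return expected; // 0 means this call performed the store
}

// Hypothetical analogue of the "better performance may be obtained" loop:
// poll with an ordinary load first, and attempt the atomic update only
// once the word has been seen to be zero.
inline void spin_until_set(std::atomic<std::uint32_t>& word,
                           std::uint32_t desired)
{
    for (;;) {
        // Ordinary load (the lwz in the manual): no reservation involved.
        while (word.load(std::memory_order_relaxed) != 0) {
            // busy-wait; a real implementation might yield or back off
        }
        // Likely to succeed now; loop again if another thread got in first.
        if (test_and_set(word, desired) == 0) {
            return;
        }
    }
}
```

A caller would treat a zero return from test_and_set as "I set the
word" and any other return value as "someone else already had".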

regards,
alexander.

