Boost logo

Boost :

From: terekhov_at_[hidden]
Date: 2001-03-15 18:23:15


--- In boost_at_y..., williamkempf_at_h... wrote:
[...]

> It would be even
> more beneficial if some other group had evaluated your solutions
> prior to this groups evaluation (bearing in mind that this group is
> full of C++ experts but has few people that are _truly_ "expert" in
> multi-threaded programming).

sorry, but this is not me how is/was asking for review/evaluation
here (critique with respect to your _threads_ library). in response
to your request i've just made a claim that your CV impl. is
incorrect and pointed to materials so that you and anybody else
(who is interested) could make his own decision with respect to
my claim (is valid or not). later _you_ requested more info from
me and now you say to me that "You're not being too helpful here" ???
well..

> You stated at one point that
> there was "a proper implementation for Win32 available"
[...]
> Firstly, your solutions are "available" only in the
> sense that you posted them to the Usenet.

please check my messages again. I never said that these are
"my" solutions (in fact they are a product of joint effort)
nor did I ever claim that these are "correct" solutions.
I said _hopefully_ correct.

as for evidence/reviews.. here is some info:

From: Mike Haller_at_IBMUS on 03/13/2001 01:40 AM
Subject: Win32 condition variables

Alexander,

I have read your postings on the problems with Doug Schmidt's
approach to condition variables on Win32. I found his code
while looking for a strategy to implement condition variables
on win32, and noticed the same problems while reading the code.

You made a request in your post for reviews of correctness
and effeciency of your code, which I will happily oblige.
I am wondering if you would be interested in reviewing my
implementation as well? It seems a bit simpler that what you
are proposing (it only requires one event and some
manipulation with thread handles), but is restricted to
NT & 2000 because of dependence on SignalObjectAndWait(). Let
me know if you are interested and I will send you the code.

Regards,
Mike Haller

[Mike's impl...

//--------------------------------------------------------------------
--------
// This implementation is unduly complicated due to the fact that
Windows does
// not have a synchronization object with characteristics similar to
that of
// a condition variable. Note that the widely avaialable and copied
// implementation of condition variables for Win32 described by
Douglas C.
// Schmidt (http://www.cs.wustl.edu/~schmidt/win32-cv-1.html) has
fatal flaws,
// so if there are bugs in this version, do not attepmt to replace it
with
// that implementation.
//
]

Please respond to Ross Johnson <rpj_at_[hidden]>
Subject: Re: win32 conditions: sem+counter+event = broadcast_deadlock
      +spur.wakeup/unfairness/incorrectness ??

 Hi,

 Your message is much appreciated.

 I'm just back from vacation and am looking through the
 discussion and patch. Expect to see your fix in the next
 snapshot which I hope to get out soonish.

 Cheers.
 Ross

 TEREKHOV_at_[hidden] wrote:
>
> fyi.. (more detailed problem description/demos + possible
fix/patch)
>
> regards,
> alexander.
>
> Alexander Terekhov
> 31.01.2001 17:43

 - --
 +----------------------+---+
 | Ross Johnson | | E-Mail: rpj_at_[hidden]
 | Info Sciences and Eng|___|
 | University of Canberra | FAX: +61 6 2015227
 | PO Box 1 |
 | Belconnen ACT 2616 | WWW:
http://willow.canberra.edu.au/~rpj/
 | AUSTRALIA |
 +--------------------------+

Please respond to "Nanbor Wang" <nanbor_at_[hidden]>
cc: "Louis Thomas" <lthomas_at_[hidden]>,
rpj_at_[hidden],
"Douglas C. Schmidt" <schmidt_at_[hidden]>
Subject: Re: Win32 Conditions

Hi Alexander,

Thank you very much for keeping us informed on your finding and
providing
us detailed information on your research. I am the current maintainer
of CV implementation in ACE. I will look over the information over
the
weekend and verify/deny/comment your finding next week (I am not
ignoring
you :-) However, since we are in the process of preparing a beta
release,
changes won't reflect on our release immediately.

Thanks,

nanbor

Please respond to Louis Thomas <lthomas_at_[hidden]>
To: rpj_at_[hidden], Thomas Pfaff <tpfaff_at_[hidden]>,
Nanbor Wang <nanbor_at_[hidden]>
cc:
Subject: RE: Win32 Conditions

In this message, I am going to present 3 algorithms and discussion
about
different approaches. I believe that one of these algorithms is the
best so
far, so we may not want to create a new PThreads release until people
have
had a chance to critique what I am going to present here.

I have studied Alexander's algorithm, and I believe it is correct.
However,
I believe it has two shortcomings which can be overcome. One
shortcoming is
that threads that time out waiting on the semaphore can induce
spurious
wakeups later. This is easy to fix. The other shortcoming is that
Alexander's locking is too strict. It blocks threads attempting to do
signal/broadcast that don't really need to be blocked. I will get to
this,
but I think it's best to start at the beginning.

There are many ways to implement condition variables in Win32. Some
ways,
such as having an event associated with each waiting thread, are not
even
being considered because they are too globally invasive. We have been
concentrating on implementations with a small fixed number of Win32
sync
objects per condition. One class of implementations has the threads
wait on
one or more event objects. Schmidt & Pyarali list many of these in
their
paper "Strategies for Implementing POSIX Condition Variables on Win32"
[http://www.cs.wustl.edu/~schmidt/win32-cv-1.html], and, as they
indicate,
none of them are correct. However, there is one that they did not
mention
that is interesting because it is correct. Here it is in pseudocode
form:
---------- Algorithm 1 ----------
given:
ex_mutex - mutex
cond - autoreset event
nWaiters - int

wait(timeout) {
  nWaiters++
  bTimedOut=SignalAndWait(ex_mutex, cond, timeout)
  lock ex_mutex
  nWaiters--
  return bTimedOut
}

signal(bAll) {
  do (bAll?nWaiters:1 times) {
    PulseEvent(cond)
  }
}
---------- ---------- ----------
This pseudocode assumes that ex_mutex is always held by the thread
calling
signal, but is easily corrected by adding a CritSec or mutex to
protect
nWaiters. I think that this algorithm meets all the requirements
specified
by PThreads. There is no signal stealing, double broadcast deadlocks,
or
spurious wake ups. (There may be spurious PulseEvents but they do not
cause
wake ups.) It has the nice property that it requires the same number
of sync
objects as the Unix version. However, it has one major drawback that
is not
even mentioned in the PThread spec. In Win32, a thread that is in the
'suspended' state is not in the 'waiting' state. This means that if a
thread
is waiting on an auto-reset event but is suspended when the PulseEvent
occurs, it will miss the notification because it was not really
waiting on
the event. Debuggers suspend and resume threads all the time. This
means
that single stepping a program while a signal occurs can very easily
cause
any number of wake ups to be missed and deadlock to occur. Because of
this
problem, this implementation is basically unusable. PThreads does not
even
say whether or not threads can be suspended, let alone how this should
affect threads waiting on a condition.

The next class of implementations uses a semaphore to wake up waiting
threads. A semaphore gives us control over exactly how many threads
are
woken up, but we have no control over how long it takes them to wake
up.
This makes semaphore implementations susceptible to signal stealing.
To
prevent this, we have a new rule that our implementations must obey
in order
to be correct:
    A thread may not begin a wait on the semaphore until it is
guaranteed
not to steal a signal.
The "SignalObjectAndWait Solution" of Schmidt & Pyarali does not
follow this
rule and is not correct. The current implementations in PThreads-
Win32 and
ACE use this solution and are not correct, as Alexander has
demonstrated.

How can we guarantee that we follow this rule? We need some kind of
gate. A
thread must pass through the gate to wait on our semaphore. We can
close the
gate after we do a signal, and the last thread to wake up can open
the gate
again. We must be able to wait on the gate (if we try to pass and it
is
closed) and we must be able to close the gate in one thread and open
it in
another. The Win32 sync objects that meet these requirements are
manual-reset events and semaphores. We have one more requirement: We
must
know at all times exactly how many threads are in the waiting state.
This
means that passing through the gate and incrementing the number of
waiting
threads must be atomic.

We can do this pretty easily using a manual reset event, if the
object that
protects nWaiters is a mutex.
---------- Algorithm 2 ----------
given:
gate - manual event
queue - semaphore
counters - mutex
ex_mutex - CS or mutex
nWaiters - int
nSignaled - int

wait(timeout) {
  waitmultiple(gate, counters, both) // pass gate
  nWaiters++
  unlock counters
  unlock ex_mutex
  bTimedOut=sem_wait(queue, timeout)
  lock counters
  if (bTimedOut && nWaiters>0) {
    nWaiters--
  } else {
    if (bTimedOut) {
      sem_wait queue // better now than spurious later
      bTimedout=false
    }
    nSignaled--
    if (0==nSignaled) {
      set gate // open gate
    }
  }
  unlock counters
  lock ex_mutex
  return bTimedOut?ETIMEDOUT:0
}

signal(bAll) {
  lock counters
  if (0!=nWaiters) {
    reset gate // close gate
    if (!bAll) {
      sema_post queue
      nSignaled++
      nWaiters--
    } else {
      sema_post (queue, nWaiters)
      nSignaled=nWaiters;
      nWaiters=0;
    }
  }
  unlock counters
}
---------- ---------- ----------
This implementation has no signal stealing, double broadcast
deadlocks, or
spurious wake ups. It is nice and straight forward. On the other
hand, it
requires 'counters' to be a mutex and not a critical section, which
means a
speed hit. Can we do better? The big problem is the last requirement
mentioned: passing through the gate and incrementing the number of
waiting
threads must be atomic. If we use a critical section, we can't do the
WaitForMultipleObjects that satisfied that atomicity requirements.

Now is a good time to look at Alexander's implementation. I present
it here
in pseudocode I extracted from his PThread patch. My apologies if I
missed
something.
---------- Alexander's Algorithm ----------
given:
BlockLock - semaphore
BlockQueue - semaphore
ex_mutex - mutex or CS
UnblockLock - mutex or CS
nWaitersBlocked - int
nWaitersToUnblock - int
nWaitersUnblocked - int

wait(timeout) {

    sema_wait BlockLock
    nWaitersBlocked++
    sema_post BlockLock

    unlock ex_mutex
    bTimedOut=sema_wait(BlockQueue, timeout)

    lock UnblockLock
    nWaitersUnblocked++
    if (0==nWaitersToUnblock) {
        lastsignal=-1
    } else {
        nWaitersToUnblock--
        if (0==nWaitersToUnblock) {
            lastsignal=1
        } else {
            lastsignal=0
        }
    }
    unlock UnblockLock

    if (1==lastsignal) {
        sema_post BlockLock
    }

    lock ex_mutex;
    return bTimedOut?ETIMEOUT:0
}

signal(bAll) {

    sema_wait BlockLock
    lock UnblockLock
    if (nWaitersUnblocked!=0) {
        nWaitersBlocked-=nWaitersUnblocked
        nWaitersBlocked=0
    }
    if (nWaitersBlocked>0) {
        toUnblock=
            nWaitersToUnblock=
                bAll?nWaitersBlocked:1
        unlock UnblockLock
        sema_post(BlockQueue, toUnblock)
    } else {
        sema_post(BlockLock)
        unlock UnblockLock
    }
}
---------- ---------- ----------
As you can see, Alexander has probably followed reasoning similar to
my own,
and is using the BlockLock semaphore to ensure that a thread may not
begin a
wait on the BlockQueue semaphore until it is guaranteed not to steal a
signal. This implementation has no signal stealing or double broadcast
deadlocks. One shortcoming is that threads that time out waiting on
the
semaphore can induce spurious wakeups later. Again, this is easy to
fix and
is not really important. The other shortcoming is that Alexander's
locking
is too strict. It blocks threads attempting to do signal/broadcast
that
don't really need to be blocked. For example, assume there are three
threads
waiting on a condition, and a different thread
does "pthread_cond_signal();
pthread_cond_broadcast();". The broadcast will block until the signal
completes, although it should be able to execute 'concurrently'. Our
rule
states that a thread may not begin a wait on the BlockQueue semaphore
while
the 'signal' is processing. The rules does NOT prevent us from
releasing
more threads, so we should do so. In fact, when trying to do
"pthread_cond_broadcast(); pthread_cond_signal();" this algorithm
will block
on the signal even though the signal is probably a no-op!

Can we relax the blocking and make it more like the Algorithm 2?
There are
two problems: One is maintaining the atomicity of passing the gate an
incrementing nWaiters. (Note that by atomically incrementing nWaiters
I
really mean locking 'counters'.) In order to allow critical sections,
we
have to do this in two calls. The other problem is that closing the
gate
when the gate is a semaphore is a blocking call (sema_wait). We can't
pass
the gate while holding the 'counters' lock, because we will deadlock
when
the gate is closed. We can't close the gate while holding
the 'counters'
lock, because we will deadlock when the gate is being passed.
* If we inc nWaiters before passing the gate, we could close the gate
and
think we have more waiters than we do, and thus deadlock waiting for
the
last waiter.
* If we inc nWaiters after passing the gate, we will think we have
less
waiters than we do, which means some won't be signaled. This is could
be OK,
since we could claim we did a signal before those threads were really
waiters, but we can't guarantee which threads will not wake up. We
will have
signal stealing, which is incorrect.
Note that this is a race condition that only happens when one thread
is
trying to do a wait while another thread that did not lock the
ex_mutex
tries to do a signal at the same time. If we always hold the
ex_mutex, we
never run in to this situation.
It turns out that in the second case, we can detect this situation
and deal
with it! Then we can maintain the no stealing rule and we win. Here
is the
code. Detecting the race condition makes it a bit messier.
---------- Algorithm 3 ----------
given:
gate - semaphore
queue - semaphore
counters - CS or mutex
ex_mutex - CS or mutex
nWaiters - int
nSignaled - int
bSigAll - bool // could be sign bit of nSignaled

wait(timeout) {
  sem_wait gate // pass gate
  lock counters
  if (0==nSignaled) {
    // normal
    nWaiters++
    sem_post gate
  } else {
    // oh no - in the middle of a signalOne or signalAll - gate stays
shut
    if (!bSigAll) {
      // we don't have to wake up. Be a waiter and see what happens
      nWaiters++
    } else {
      // we have to wake up. We could just return - see discussion
below
      nSignaled++;
      sem_post queue
    }
  }
  unlock counters
  unlock ex_mutex
  bTimedOut=sem_wait(queue, timeout)
  lock counters
  if (bTimedOut && nWaiters>0) {
    nWaiters--;
  } else {
    if (bTimedOut) {
      sem_wait queue // better now than spurious later
      bTimedout=false;
    }
    nSignaled--;
    if (0==nSignaled) {
      bSigAll=false;
      sem_post gate // open gate
    }
  }
  unlock counters
  lock ex_mutex
  return bTimedOut?ETIMEDOUT:0;
}

signal(bAll) {
  lock counters
  if (0!=nWaiters) {
    sema_wait(gate, 0) // close gate
    if (!bAll) {
      sema_post queue
      nSignaled++;
      nWaiters--;
    } else {
      bSigAll=true;
      sema_post (queue, nWaiters)
      nSignaled=nWaiters;
      nWaiters=0;
    }
  }
  unlock counters
}
---------- ---------- ----------
At one point we have the option of just returning. I think it is
probably
better not to. The PThreads spec says that a timeout will always
unlock then
lock ex_mutex. If we don't return immediately, but do full
processing, we
match this, and we don't have to do anything extra to handle
cancellation.

This implementation has no signal stealing, double broadcast
deadlocks, or
spurious wake ups. It requires four sync objects compared to Unix's
two, but
no one has been able to find a good way around this. It can utilize
CritSecs
for all the mutexes, which are faster than mutexes when there is not
much
contention (like for the 'counters' lock). It does not block any
signal()
calls that can immediately succeed. It only blocks wait() calls that
can't
proceed. It behave correctly even when threads are suspended and
resumed
while it is executing, assuming threads are not suspended
indefinitely.

I think this makes Algorithm 3 the best available algorithm. What do
you
think? Are there any bugs that I missed?

        Later,
                -Louis! :)


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk