Hmmm . . . actually, as I've been poking around, I think I've found some more explanations elsewhere which, if I'm understanding them correctly, indicate that (a) I've misunderstood what an upgrade_lock does, but also that (b) my code shouldn't be subject to the deadlocks you're describing (though it's clearly susceptible to others :-).

The key (which isn't clearly noted in the documentation) is that an upgrade_lock is like a unique_lock in that only one upgrade_lock can be held at a time.  (I was thinking it was more like a shared_lock, in that multiple instances of it could be held at the same time, but Anthony Williams indicates that's not the case: http://groups.google.mv/group/boost-list/msg/2c93110e9e3dbb45.)  This means that the behavior in my code could go something like:

- transmit() acquires shared_lock.
- createRoom() acquires upgrade_lock.
- deleteRoom() attempts to acquire upgrade_lock and blocks (because of the upgrade_lock held by createRoom()).
- createRoom() attempts to upgrade to a unique_lock and blocks (because of the shared_lock held by transmit()).
- transmit() releases shared_lock.
- createRoom() successfully upgrades to a unique_lock.  (deleteRoom() is still blocked because createRoom() already has an upgrade_lock)
- transmit() attempts to acquire shared_lock and blocks.
- createRoom() releases unique lock.
- deleteRoom() successfully acquires upgrade_lock.
- transmit() successfully acquires shared_lock.
- life goes on
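
To check my own reasoning, here's a rough sketch of the ownership rules as I currently understand them, replaying the sequence above step by step.  It's a toy, single-threaded model in plain C++, not Boost's actual implementation: the class UpgradableMutexModel, its method names, and replay_sequence() are all made up for illustration, and "would block" is modeled as a try-operation returning false.  It also ignores any fairness or writer-priority behavior the real shared_mutex may have.

```cpp
#include <cassert>

// Toy model of the rules described above (hypothetical, not Boost code):
//  - any number of shared owners may coexist
//  - at most one upgrade owner, which may coexist with shared owners
//  - a unique owner excludes everyone else
class UpgradableMutexModel {
public:
    bool try_lock_shared() {
        if (unique_) return false;           // blocked by a unique owner
        ++shared_;
        return true;
    }
    void unlock_shared() {
        assert(shared_ > 0);
        --shared_;
    }

    bool try_lock_upgrade() {
        if (unique_ || upgrade_) return false;  // only one upgrade owner at a time
        upgrade_ = true;
        return true;
    }

    // Upgrade -> unique succeeds only once every shared owner has released.
    bool try_upgrade_to_unique() {
        if (!upgrade_ || shared_ > 0) return false;
        upgrade_ = false;
        unique_ = true;
        return true;
    }
    void unlock_unique() {
        assert(unique_);
        unique_ = false;
    }

private:
    int shared_ = 0;
    bool upgrade_ = false;
    bool unique_ = false;
};

// Replays the listed steps; returns true if every step behaves as described.
bool replay_sequence() {
    UpgradableMutexModel m;
    bool ok = true;
    ok = ok && m.try_lock_shared();         // transmit() acquires shared_lock
    ok = ok && m.try_lock_upgrade();        // createRoom() acquires upgrade_lock
    ok = ok && !m.try_lock_upgrade();       // deleteRoom() would block
    ok = ok && !m.try_upgrade_to_unique();  // createRoom() would block on transmit()
    m.unlock_shared();                      // transmit() releases shared_lock
    ok = ok && m.try_upgrade_to_unique();   // createRoom() upgrades to unique
    ok = ok && !m.try_lock_upgrade();       // deleteRoom() is still blocked
    ok = ok && !m.try_lock_shared();        // transmit() would block
    m.unlock_unique();                      // createRoom() releases unique lock
    ok = ok && m.try_lock_upgrade();        // deleteRoom() acquires upgrade_lock
    ok = ok && m.try_lock_shared();         // transmit() acquires shared_lock
    return ok;
}
```

Under these assumed rules, replay_sequence() returns true, i.e. every step in the list is consistent with "one upgrade_lock at a time" and nothing ends up waiting forever.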

Now, if transmit() were regularly taking longer than 20 ms to complete, so that it was regularly being called reentrantly, I could imagine that it would build up a pretty good backlog of shared locks, which would prevent createRoom() and deleteRoom() from acquiring their unique locks.  But if I'm understanding this correctly, I can't think of a scenario that would result in the behavior I'm seeing, i.e., both the createRoom() and deleteRoom() methods blocking each other while attempting to acquire an upgrade_lock, since no other unique or upgrade locks appear to be held elsewhere at the time.

That said, I'm reasonably new to multithreading, and I'm still having to think through these issues, so I might be missing something.  (Actually, I clearly am; I'm just trying to figure out what it is :-).

Ken Smith
Cell: 425-443-2359
Email: smithkl42@gmail.com
Blog: http://blog.wouldbetheologian.com/


On Fri, Apr 23, 2010 at 11:42 AM, Frank Mori Hess <frank.hess@nist.gov> wrote:

On Friday 23 April 2010, Ken Smith wrote:
> On Fri, Apr 23, 2010 at 10:57 AM, Frank Mori Hess

>> Yes, they are deadlocked because both upgrade locks are waiting for the
>> other to release shared ownership so they can upgrade to unique ownership.
>
> Perhaps I wasn't clear.  They're not waiting on the upgrade_to_unique_lock
> lines, but on the upgrade_lock lines.

Despite my initial misunderstanding, your code still appears to be vulnerable
to the deadlock I described, even if that isn't what you are hitting at the
moment.