From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2020-05-16 18:35:44


Phil Endecott wrote:
> Can we improve how interprocess mutexes and condition variables
> behave on process termination?

Having given this some more thought:

I think it would be useful if Boost.Interprocess added
a robust mutex, as a straightforward wrapper around the
POSIX robust mutex and its equivalents on other platforms,
where they exist. I note that there is a patch on the
Interprocess issue tracker that does this, but it
unconditionally cleans up the mutex when it finds that the
other process has died, which is wrong. I believe that the
lock() method should fail in that case, and that the mutex
should provide a make_consistent() method that the user can
invoke, if appropriate, before retrying. Read and write locks,
with appropriate clean-up behaviour, can then be implemented
on top of that.
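
A minimal sketch of such a wrapper on top of the POSIX calls, to make
the proposal concrete. The names robust_mutex, lock_result and
lock_after_owner_exit are hypothetical and not an existing
Boost.Interprocess API. One assumption differs slightly from "fail and
retry": in the POSIX API a caller that sees EOWNERDEAD already holds
the mutex, so the sketch reports owner death as a return value and
leaves the decision to the caller.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>
#include <thread>

// Hypothetical result type: owner death is reported, never hidden.
enum class lock_result { acquired, owner_died, not_recoverable };

class robust_mutex {
public:
    robust_mutex() {
        pthread_mutexattr_t attr;
        int rc = pthread_mutexattr_init(&attr);
        assert(rc == 0);
        // Needed when the mutex lives in memory shared between processes.
        rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        assert(rc == 0);
        rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
        assert(rc == 0);
        rc = pthread_mutex_init(&m_, &attr);
        assert(rc == 0);
        pthread_mutexattr_destroy(&attr);
        (void)rc;
    }
    ~robust_mutex() { pthread_mutex_destroy(&m_); }
    robust_mutex(const robust_mutex&) = delete;
    robust_mutex& operator=(const robust_mutex&) = delete;

    // On owner_died the caller holds the mutex and must either repair the
    // protected data and call make_consistent(), or give up and unlock().
    lock_result lock() {
        int rc = pthread_mutex_lock(&m_);
        if (rc == 0) return lock_result::acquired;
        if (rc == EOWNERDEAD) return lock_result::owner_died;
        return lock_result::not_recoverable;  // e.g. ENOTRECOVERABLE
    }

    // Call only after lock() returned owner_died and the data was checked.
    bool make_consistent() { return pthread_mutex_consistent(&m_) == 0; }

    void unlock() { pthread_mutex_unlock(&m_); }

private:
    pthread_mutex_t m_;
};

// Demonstration helper: a thread that ends while holding the mutex stands
// in for a process that died. Returns what the next lock() reports.
lock_result lock_after_owner_exit(robust_mutex& m) {
    std::thread owner([&m] { m.lock(); /* ends without unlocking */ });
    owner.join();
    return m.lock();
}
```

If the caller unlocks after owner_died without calling
make_consistent(), POSIX makes the mutex permanently unusable, which
is the behaviour a user would want when the data cannot be repaired.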

Vinicius dos Santos Oliveira <vini.ipsmaker_at_[hidden]> wrote:
> After some more thought, here is another idea: PTHREAD_MUTEX_ROBUST
> is no longer a property of the mutex, but a property of the lock.

I don't see how that can be implemented on top of the
POSIX API, where robustness is a property of the mutex.

Andrey Semashev <andrey.semashev_at_[hidden]> wrote:
>> * PTHREAD_MUTEX_ROBUST might be part of the solution. That seems
>> to require the non-crashed process to do clean up, i.e. we would
>> need to record whether the crashed process were reading or writing
>> and react appropriately.
>
> You can't do that reliably because the crashed process could have
> crashed between locking the mutex and indicating its intentions.

I don't follow. Say I have a bool in the mutex called being_written.
It's initially false, the read lock doesn't touch it, and the write
lock does:

lock() { m.lock(); being_written = true; memory_barrier(); }
unlock() { memory_barrier(); being_written = false; m.unlock(); }

If the process crashes between locking and setting being_written,
then the process doing the cleanup will see being_written = false,
and that's OK because the crasher hadn't actually written anything.
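
The being_written argument can be sketched as follows. This is only an
illustration under stated assumptions: guarded_data, lock_state,
lock_and_inspect, write_begin, write_end, mark_repaired and
simulate_dead_owner are hypothetical names, the int member stands in
for the real shared data, and a thread that exits while holding the
mutex stands in for a crashed process.

```cpp
#include <atomic>
#include <cassert>
#include <cerrno>
#include <pthread.h>
#include <thread>

struct guarded_data {
    pthread_mutex_t m;
    std::atomic<bool> being_written{false};
    int value = 0;  // stands in for the protected shared data
};

bool init(guarded_data& g) {
    pthread_mutexattr_t a;
    if (pthread_mutexattr_init(&a) != 0) return false;
    bool ok = pthread_mutexattr_setpshared(&a, PTHREAD_PROCESS_SHARED) == 0
           && pthread_mutexattr_setrobust(&a, PTHREAD_MUTEX_ROBUST) == 0
           && pthread_mutex_init(&g.m, &a) == 0;
    pthread_mutexattr_destroy(&a);
    return ok;
}

enum class lock_state { clean, owner_died_idle, owner_died_mid_write, failed };

// Used by readers and writers alike. Whenever the result is not `failed`,
// the caller holds the mutex.
lock_state lock_and_inspect(guarded_data& g) {
    int rc = pthread_mutex_lock(&g.m);
    if (rc == 0) return lock_state::clean;
    if (rc != EOWNERDEAD) return lock_state::failed;
    if (g.being_written.load(std::memory_order_acquire))
        return lock_state::owner_died_mid_write;  // data may be half-written
    // The dead owner was a reader, or a writer that had not started yet:
    // nothing was modified, so the mutex can be marked consistent at once.
    pthread_mutex_consistent(&g.m);
    return lock_state::owner_died_idle;
}

// Writer side, called with the mutex held.
void write_begin(guarded_data& g) {
    g.being_written.store(true, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);
}

void write_end(guarded_data& g) {
    std::atomic_thread_fence(std::memory_order_seq_cst);
    g.being_written.store(false, std::memory_order_relaxed);
    pthread_mutex_unlock(&g.m);
}

// Call after repairing the data following owner_died_mid_write.
void mark_repaired(guarded_data& g) {
    g.being_written.store(false, std::memory_order_release);
    int rc = pthread_mutex_consistent(&g.m);
    assert(rc == 0);
    (void)rc;
}

// Stand-in for a crash: the thread ends while holding the mutex, either
// before or after it has announced that it is writing.
void simulate_dead_owner(guarded_data& g, bool mid_write) {
    std::thread t([&g, mid_write] {
        pthread_mutex_lock(&g.m);
        if (mid_write) { write_begin(g); g.value = 42; }
    });
    t.join();
}
```

A crash between locking and setting the flag shows up as
owner_died_idle, which matches the reasoning above: the flag is still
false and the data has not been touched.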

Regarding blocking signals, I agree this is not really something
that should be part of the interprocess synchronisation primitives,
but I do think that a modern wrapper around the ancient C signals
API would be good to have.
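
One thing such a wrapper might offer is a scoped guard that blocks a
set of signals for the calling thread and restores the previous mask
on exit. The sketch below assumes POSIX pthread_sigmask; the names
scoped_signal_block and is_blocked are hypothetical and not an
existing Boost component.

```cpp
#include <cassert>
#include <initializer_list>
#include <pthread.h>
#include <signal.h>

// Blocks the given signals in the calling thread for the guard's lifetime.
class scoped_signal_block {
public:
    explicit scoped_signal_block(std::initializer_list<int> signals) {
        sigset_t to_block;
        sigemptyset(&to_block);
        for (int s : signals) sigaddset(&to_block, s);
        int rc = pthread_sigmask(SIG_BLOCK, &to_block, &previous_);
        assert(rc == 0);
        (void)rc;
    }
    // Restores the mask that was in effect when the guard was created.
    ~scoped_signal_block() { pthread_sigmask(SIG_SETMASK, &previous_, nullptr); }
    scoped_signal_block(const scoped_signal_block&) = delete;
    scoped_signal_block& operator=(const scoped_signal_block&) = delete;

private:
    sigset_t previous_;
};

// Helper: is `sig` currently blocked in the calling thread?
bool is_blocked(int sig) {
    sigset_t current;
    // With a null new set, pthread_sigmask only reports the current mask.
    pthread_sigmask(SIG_SETMASK, nullptr, &current);
    return sigismember(&current, sig) == 1;
}
```

A caller would write `scoped_signal_block guard{SIGTERM, SIGINT};` at
the top of a critical section. Signals that cannot be blocked, such as
SIGKILL, are unaffected, so this does not remove the need for robust
primitives.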

>> I'm less clear about what happens to condition variables, but it
>> does seem that perhaps terminating a process while it is waiting
>> on a condition will cause other processes to deadlock. Perhaps
>> the wait conceptually returns and the mutex is re-locked during
>> termination.
>
> AFAIR, pthread_cond_t uses a non-robust mutex internally, which means
> that condition variables are basically useless when you need robust
> semantics.

Yes.

> If you need a condition variable-like behavior, in a robust way, I think
> your best bet is to use futexes directly.

Yes, that is the conclusion that I've also come to - but it is
probably a very difficult problem. Note that robust mutexes use
futexes rather differently from regular mutexes, and there is
kernel involvement at process termination (see man get_robust_list).
A robust condition variable would have to do something similar.

I find this all rather surprising, since interrupting a process that
is waiting on a condition variable is probably much more common than
interrupting one that holds a locked mutex.

Regards, Phil.

