

Subject: Re: [Boost-users] [interprocess] Sharing data in a peer-to-peer fashion
From: Andy Wiese (andyw_at_[hidden])
Date: 2010-03-19 01:06:17


On Mar 18, 2010, at 9:25 PM, Brett Gmoser wrote:

> Hello,
>
> I'm working on what seems to be a fairly interesting problem, and
> I'm looking for any interprocess experts to lend any advice. After
> reading all of the documentation for Interprocess, it seems that all
> of the examples work with a "one parent, many children" type model.
> The problem that I am having is that I am working with a "many
> children, no parent" model. My processes are spawned by the web
> server (and FastCGI), so there is no easy way to modify that code to
> manage the interprocess data.
>
> At least one instance of my program is meant to stay alive forever,
> so I'm not very concerned about deleting the interprocess data at
> any time. My problem however is the mutexes, and the stale locks
> that result if one of the processes crashes or exits abnormally. It
> seems that once an instance of my application crashes, no other
> instance may ever get a lock on the shared mutex, because the
> original process still holds a lock.
>
> I've thought of a few different ideas to solve this problem:
>
> * Using an interprocess shared_ptr for the mutex and data, with a
> custom deleter to remove all instances once the last application is
> exiting and the use count is zero. However this suffers from the
> same problem - it seems the use count is never decremented when the
> application exits abnormally (for example, kill -9, crash, or CTRL+C).
>
> * Using a "heart beat" type system to keep track of processes. My
> idea was to do something like this: Keep an interprocess associative
> array of process IDs mapped to last heartbeat time, with each
> process updating its own heartbeat. At the same time, every other
> process (since there is no one parent process) must keep track of
> the heartbeats, and remove processes from the array which have not
> responded with a heartbeat in a while. The problem is this - what
> about the mutex for the associative array? And the locks that
> another application might have on them? I wind up back at the point
> that I started at. The stale locks are destroyed if the mutex is
> deleted (via boost::interprocess::named_mutex::remove), and I can
> detect the stale lock pretty reliably with a timed try lock (wait 30
> seconds or so to acquire the lock, if it can never be acquired then
> there is obviously a problem). But what of thread safety, since I
> now have to delete the lock and re-create it, where other processes
> might wind up doing the same thing? And what if those other
> processes are also trying to obtain a lock on the mutex at the same
> time I'm deleting and recreating it?
>
> Those are pretty much the only ideas I've had. Does anybody have
> anything better? Maybe some of you have tackled a similar problem?
>
> Thanks!
>

I think I just asked the same question, less elegantly. I am using
file_lock, which seems to keep the guarantee of release when the
owning process crashes, at least on the unixy systems I care about. I
am considering similar heartbeat solutions to what you propose, since
I would really like to use named_recursive_mutex.
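
The release-on-crash behaviour described above can be checked with a small
sketch. It assumes a POSIX system and uses raw fcntl() advisory locks
rather than boost::interprocess::file_lock itself, on the assumption that
file_lock relies on a comparable OS facility on such platforms. The names
try_lock_file and lock_released_after_crash are hypothetical, chosen only
for this illustration. A child process takes the lock and is then killed
with SIGKILL (the "kill -9" case), after which the parent tries to lock
the same file.

```cpp
#include <cassert>
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Try to take an exclusive advisory lock on the whole file without blocking.
// Returns true if the calling process now holds the lock.
bool try_lock_file(int fd) {
    struct flock fl{};
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;  // zero length means "through end of file"
    return fcntl(fd, F_SETLK, &fl) != -1;
}

// Hypothetical demo: a child locks the file and is killed while holding the
// lock. Returns true only if the parent was refused the lock while the child
// was alive and obtained it once the child was gone.
bool lock_released_after_crash(const char* path) {
    assert(path != nullptr);
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd == -1) return false;

    int ready[2];  // pipe the child uses to report that it holds the lock
    if (pipe(ready) == -1) { close(fd); return false; }

    pid_t child = fork();
    if (child == -1) {
        close(ready[0]); close(ready[1]); close(fd);
        return false;
    }

    if (child == 0) {
        // Child: lock, report the result, then wait to be killed.
        close(ready[0]);
        char status = try_lock_file(fd) ? '1' : '0';
        if (write(ready[1], &status, 1) != 1) _exit(1);
        for (;;) pause();
    }

    // Parent: wait until the child reports in.
    close(ready[1]);
    char status = '0';
    ssize_t n = read(ready[0], &status, 1);
    close(ready[0]);

    bool child_locked = (n == 1 && status == '1');
    bool blocked_while_alive = child_locked && !try_lock_file(fd);

    kill(child, SIGKILL);        // simulate an abnormal exit
    waitpid(child, nullptr, 0);  // make sure the child is really gone

    bool acquired_after_crash = try_lock_file(fd);
    close(fd);
    unlink(path);
    return blocked_while_alive && acquired_after_crash;
}
```

Calling lock_released_after_crash("/tmp/demo.lock") (the path is just a
placeholder) should return true wherever the OS drops a dead process's
file locks. Other platforms and network filesystems may behave
differently, so this is worth running on each target before depending on
it.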

Andy

