From: Kim Barrett (kab_at_[hidden])
Date: 2006-03-09 17:45:21


In reply to my review comments, Ion Gaztanaga wrote 2/17/06:
> > A reference count of create + open - close is maintained for shared
> > memory objects. If this count reaches zero, the shared memory
> > object is unlinked (at least in the posix version). The lack of any
> > documented unlink mechanism might lead one to guess that something
> > like this is going on, and is documented by the last paragraph of
> > this section ("When all processes ... close ..., the shared memory
> > segment is destroyed"). A bit more emphasis might be useful here.
> > On the other hand, I'm actually not convinced this is a good idea.
> > It is certainly fragile, in that a program which crashes (for
> > whatever reason) won't close any shared memory objects it has open,
> > resulting in the reference tracking getting messed up.
>
> You can have problems in POSIX systems, but I couldn't find a better way
> to implement this. If a program crashes there is no way control
> anything. The unique solution would be to provide a function that
> destroys all named objects that I can register with every creation so
> that you can catch signals and call that functions to free all objects.
> In windows the OS frees the resources automatically. For standard C++
> IPC mechanisms I would request OS help for program crashes, just like
> heap memory is freed automatically.
>
> > For example, one couldn't start up a
> > process which parses some data into an in-memory format that it
> > records in shared memory and then exits, with other programs saving
> > parsing time by just getting the information from shared memory.
> > This doesn't work if those other programs don't get around to
> > opening the shared memory before the parser program exits.
>
> I would like to implement the POSIX-like behaviour in windows, but that
> would require some permanent store that or a server/ process/service
> that serves named IPC mechanisms windows. You can use memory mapped
> files for this behaviour. Take in care that POSIX unlink mechanism is
> also complex so that if a process unlinks the shared memory, if another
> process can create a new shared memory with the old name while older
> processes are still attached to the old shared memory. I need help from
> POSIX experts.

First let me make sure I understand what is going on here.

On Windows, the shmem library is presently using create_file_mapping
and open_file_mapping, while on POSIX systems it is using shm_open.

An object created with create_file_mapping exists until there are no
references, assuming I'm understanding what you've said. (I don't
have (easy) access to Windows API documentation, so can't go look up
this information. Please correct me if you see any confusion.)

A POSIX shared memory object exists from the time it is created until
it is unlinked and all references are gone (or the system is
rebooted). (It becomes inaccessible to further shm_open calls if
unlinked, but remains open to processes that had already opened it.)
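
A minimal sketch of that lifetime rule, assuming a POSIX system that
provides shm_open (older glibc versions need -lrt at link time). The
function name posix_unlink_demo and the object name passed to it are
hypothetical and have nothing to do with the shmem library itself. The
sketch creates an object, maps it, unlinks the name, and then checks
that a new shm_open by name fails while the existing mapping is still
readable.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 0 if the object behaved as described above, -1 otherwise.
   `name` is a hypothetical object name such as "/shmem_demo". */
int posix_unlink_demo(const char *name)
{
    const size_t size = 4096;
    int result = -1;

    assert(name != NULL && name[0] == '/');
    shm_unlink(name); /* discard a stale object from an earlier run */

    int fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd == -1)
        return -1;

    char *p = MAP_FAILED;
    if (ftruncate(fd, (off_t)size) == 0)
        p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping, if any, keeps the object referenced */
    if (p == MAP_FAILED) {
        shm_unlink(name);
        return -1;
    }

    strcpy(p, "parsed data");

    if (shm_unlink(name) == 0) {
        /* A fresh open by name should now fail with ENOENT ... */
        int fd2 = shm_open(name, O_RDWR, 0);
        int open_failed = (fd2 == -1 && errno == ENOENT);
        if (fd2 != -1)
            close(fd2);
        /* ... while the existing mapping is still usable. */
        if (open_failed && strcmp(p, "parsed data") == 0)
            result = 0;
    }

    munmap(p, size); /* last reference gone: the memory is released */
    return result;
}
```

Calling posix_unlink_demo("/shmem_demo") from a small test program
returns 0 when the system follows the rule described above.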

What the present library implementation is trying to do with this
reference count mechanism is to emulate the Windows behavior on POSIX
systems. Unfortunately, as has been noted, that emulation really isn't
very reliable in the face of ill-behaved (i.e. crashing) clients. And
I'm pretty sure there isn't a solution to that problem, at least not
with the shm_open family of calls.

First question: Why not use the shm_open interface on Windows? One
possible answer would be that the Windows POSIX support doesn't
include the shm_xxx API. And that might even be the answer, since
some web searches have led me to suspect that Windows only supports
the SysV shared memory API. Which leads to:

Second question: Why use different implementations on different
platforms? Why not use the SysV shared memory API, which is pretty
widely supported? It's admittedly somewhat clumsy to use directly,
but that's an implementation detail that won't be exposed to library
clients.

One issue might be kernel limits on min/max size and number of
objects; some of the references I have mention such limits but don't
provide much detail. On my stock configured SuSE 9.3 machine I see a
maximum of 4096 identifiers and a maximum segment size of 33554432
bytes (32768 * 1024). SysV shared memory also doesn't support
resizing, which is available when using the shm_open API (I don't
know whether that feature is available with the create_file_mapping
API). I don't know if that actually matters for this library.

Note that one of the capabilities of SysV shared memory is obtaining
the current number of attaches. I don't know if it reliably gets
decremented when an attached process exits without explicitly
detaching though; I would hope so, but haven't verified it. If so,
that could provide a correct reference count implementation. Note
also that the size is recorded, which I think would eliminate the
need for the header in the current shared_memory implementation.
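
The open question about the attach count can be probed directly. The
following is a minimal sketch, assuming a system with the SysV calls
shmget, shmat and shmctl; the names segment_stat and
sysv_nattch_probe are hypothetical. A child process attaches to a
private segment and is then killed with SIGKILL, so it never gets a
chance to detach; the parent reads shm_nattch (and the recorded size
shm_segsz) while the child is attached and again after it has died.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: read the attach count and recorded size of a
   segment. Returns 0 on success, -1 on error. */
static int segment_stat(int id, long *attaches, size_t *size)
{
    struct shmid_ds ds;
    if (shmctl(id, IPC_STAT, &ds) == -1)
        return -1;
    *attaches = (long)ds.shm_nattch;
    *size = ds.shm_segsz;
    return 0;
}

/* Reports the attach count seen by the parent while a child is
   attached and after that child has been killed without detaching.
   Returns 0 on success, -1 on error. */
int sysv_nattch_probe(long *while_attached, long *after_kill, size_t *size)
{
    assert(while_attached && after_kill && size);
    *while_attached = *after_kill = -1;
    *size = 0;

    int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (id == -1)
        return -1;

    int ready[2];
    pid_t child = -1;
    if (pipe(ready) == 0) {
        child = fork();
        if (child == -1) { close(ready[0]); close(ready[1]); }
    }
    if (child == -1) {
        shmctl(id, IPC_RMID, NULL);
        return -1;
    }
    if (child == 0) {
        /* Child: attach, report to the parent, wait to be killed. */
        close(ready[0]);
        char ok = (shmat(id, NULL, 0) != (void *)-1);
        if (write(ready[1], &ok, 1) != 1)
            _exit(1);
        for (;;)
            pause();
    }

    /* Parent: never attaches, so every attach counted is the child's. */
    close(ready[1]);
    char ok = 0;
    int rc = -1;
    if (read(ready[0], &ok, 1) == 1 && ok)
        rc = segment_stat(id, while_attached, size);
    close(ready[0]);

    kill(child, SIGKILL);      /* simulate a crash: no shmdt is run */
    waitpid(child, NULL, 0);

    if (rc == 0)
        rc = segment_stat(id, after_kill, size);

    shmctl(id, IPC_RMID, NULL); /* remove the segment */
    return rc;
}
```

POSIX specifies that attached segments are detached, and shm_nattch
decremented, when a process terminates, so the expected readings are
1 while the child is attached and 0 after it is killed. The probe
lets that be checked on a particular system rather than assumed.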

Hm. If the SysV API provides a correct reference count
implementation, and is widely supported, including on Windows, then
using that API would permit either lifetime behavior to be
implemented (possibly even configurable).

Of course, we have yet to have the discussion of which behavior is
actually preferable. But it is certainly the case that the present
situation, where the behavior is not documented and is buggy on
non-Windows platforms, is not ideal.

