
Boost Users :

Subject: Re: [Boost-users] endless loop and/or crash in boost::interprocess
From: Ion Gaztañaga (igaztanaga_at_[hidden])
Date: 2013-05-21 16:11:42


El 20/05/2013 20:28, Charanga escribió:
> How do I debug a potential endless loop and heap corruption issue
> involving boost::interprocess "new managed_shared_memory" freezing
> forever or crashing?
>
>
> I posted this on stackoverflow but I am hopeful that a real expert
> in Boost::Interprocess may have seen this freeze-up and crash
> before.
>
> Outside the debugger it manifests as a freeze-up, inside the debugger
> it manifests as a series of exceptions inside an endless
> try-catch-ignore loop which is coded into the boost implementation
> code.
>
> http://stackoverflow.com/questions/16651878/how-do-i-debug-a-potential-endless-loop-and-heap-corruption-issue-involving-boos

Warren, I'm really sorry if an interprocess bug has provoked an endless
debugging session. I'll try to help as much as I can.

I understand your frustration, but if you really want help I think you
should post less passionate comments on stackoverflow. I know I wrote
some inelegant code in Interprocess, but don't suppose I don't know
about ERROR_ALREADY_EXISTS in CreateFile, it's used in other parts of
the library ;-) Please consider editing your post on SO, at least in
light of the help you might get on this mailing list; we try to do our
best.

The code you mention is trying to map a "mappable" device, where a
device can be a file, a shared memory object, /dev/null or a System V
shared memory segment. In POSIX open-like interfaces, a user doesn't
know whether a plain open-or-create call created the device or just
opened it. So you try to create it with O_CREAT|O_EXCL, and if EEXIST
is triggered you try to open it. You might get a race condition
(someone has removed the file just after you tried to create it), so if
ENOENT is triggered then you need to retry the creation, just like a
spinlock trying to get access to a critical section. That's why the
"device" creation code needs to loop. So the word "file" in the comment
is a bit misleading: it can be a file or any other mappable descriptor
of the OS.
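
As a rough illustration of the create-then-open loop described above, here is a minimal POSIX sketch for a plain file. It is not the Interprocess implementation: the `OpenResult` struct and the `create_or_open` function are hypothetical names made up for this example, and the 0644 mode is an arbitrary choice.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical result type for this sketch (not an Interprocess type).
struct OpenResult {
    int  fd;       // open descriptor, or -1 on failure
    bool created;  // true if this call created the file
    int  error;    // errno of the failure, 0 on success
};

// Create the file exclusively, or open it if it already exists.
// The loop repeats only while the two calls contradict each other.
OpenResult create_or_open(const char* path)
{
    assert(path != nullptr);
    for (;;) {
        // Step 1: exclusive creation fails with EEXIST if the file exists.
        int fd = ::open(path, O_CREAT | O_EXCL | O_RDWR, 0644);
        if (fd >= 0)
            return {fd, true, 0};
        if (errno != EEXIST)
            return {-1, false, errno};

        // Step 2: the file exists, so open it without O_CREAT.
        fd = ::open(path, O_RDWR);
        if (fd >= 0)
            return {fd, false, 0};
        if (errno != ENOENT)
            return {-1, false, errno};

        // ENOENT: the file was removed between the two calls; retry.
    }
}
```

The caller can then use the `created` flag to decide whether the contents still need to be initialized. Note that this loop has no upper bound: if the two calls keep contradicting each other, it spins forever.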

Interprocess needs to know if the device was created or opened because
in case of creation it needs to initialize internal data structures for
managed segments. The device error interface is exception based (I
always wanted to offer an exceptionless interface to Interprocess, but
never got time to implement it), so the loop has to catch exceptions to
know why the file was not created or opened.

In your case I don't know which Boost version your code is using, but
reviewing the code it seems that the OS (or some lower Interprocess
layer) is saying that the device (e.g. a file) can't be created because
it already exists, but when trying to open it, an error says the file
has just disappeared. And the code tries again and again. I don't
know where the heap corruption comes from. We could limit the operation
to a fixed number of attempts and then return an error; that wouldn't
block, but the underlying bug would still be there.
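
To make the "try several times and then return an error" idea concrete, here is a small self-contained sketch of a bounded version of the loop over an exception-based interface. Everything in it is hypothetical and only mimics the shape of the problem: `DeviceError`, `DeviceException`, `Outcome`, `create_or_open_bounded` and `StuckDevice` are names invented for this example, not Interprocess types.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical error codes and exception type for this sketch.
enum class DeviceError { already_exists, not_found, other };

struct DeviceException : std::runtime_error {
    explicit DeviceException(DeviceError e)
        : std::runtime_error("device error"), code(e) {}
    DeviceError code;
};

enum class Outcome { created, opened, gave_up };

// Device is any type with create() and open() members that throw
// DeviceException on failure.
template <class Device>
Outcome create_or_open_bounded(Device& dev, int max_attempts)
{
    assert(max_attempts > 0);
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        try {
            dev.create();
            return Outcome::created;
        } catch (const DeviceException& e) {
            // Only "already exists" means we should fall back to open().
            if (e.code != DeviceError::already_exists) throw;
        }
        try {
            dev.open();
            return Outcome::opened;
        } catch (const DeviceException& e) {
            // Only "not found" means the device vanished and we retry.
            if (e.code != DeviceError::not_found) throw;
        }
    }
    return Outcome::gave_up;  // contradictory answers on every attempt
}

// A fake device stuck in the contradictory state described above:
// creation says "already exists", opening says "not found".
struct StuckDevice {
    int calls = 0;
    void create() { ++calls; throw DeviceException(DeviceError::already_exists); }
    void open()   { ++calls; throw DeviceException(DeviceError::not_found); }
};
```

With `StuckDevice`, the function gives up after `max_attempts` rounds instead of spinning forever. As noted above, this only turns the hang into an error; it does not explain why the two calls disagree.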

In Windows the shared memory device is emulated (to obtain POSIX-like
file semantics) with a real file in a subfolder of the shared documents
folder (a subfolder of the path stored in the registry under
"SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Explorer\\Shell Folders";
e.g. in Windows 7 it is C:\ProgramData\boost_interprocess). I don't
know why this loop never ends, but maybe user permissions or some other
attribute of that file makes the Interprocess code think the file
already exists when trying to create it, and think the file is not
present when it tries to open it.
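
One way to test the permissions theory is to check by hand whether the current user can see and write to that folder. The following is only a diagnostic sketch using C++17 std::filesystem, not something Interprocess provides; `DirCheck`, `check_shared_dir` and the probe file name `gr_probe.tmp` are made up for this example.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <system_error>

namespace fs = std::filesystem;

enum class DirCheck { missing, not_a_directory, not_writable, ok };

// Reports whether dir exists, is a directory, and accepts a new file.
// Caution: the probe would truncate an existing file of the same name.
DirCheck check_shared_dir(const fs::path& dir)
{
    assert(!dir.empty());
    std::error_code ec;  // use the non-throwing overloads
    if (!fs::exists(dir, ec))
        return DirCheck::missing;
    if (!fs::is_directory(dir, ec))
        return DirCheck::not_a_directory;

    const fs::path probe = dir / "gr_probe.tmp";
    {
        std::ofstream out(probe);  // try to create a file in the folder
        if (!out)
            return DirCheck::not_writable;
    }
    fs::remove(probe, ec);  // best-effort cleanup of the probe file
    return DirCheck::ok;
}
```

On the Windows 7 layout mentioned above, one would call it as `check_shared_dir("C:\\ProgramData\\boost_interprocess")`, run as the same user as the failing process. A result other than `ok` would point towards a permissions or layout problem rather than a race.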

Best,

Ion

