
Boost Users :

Subject: Re: [Boost-users] [interprocess] Fault recovery in managed mapped file
From: Kevin Arunski (kevin.arunski_at_[hidden])
Date: 2010-11-18 16:05:34


On Nov 18, 2010, at 3:07 PM, Ion Gaztañaga wrote:

> On 17/11/2010 15:14, Kevin Arunski wrote:
>> I have found that managed mapped file can get stuck in a spinlock if
>> the file is not closed and fully flushed to disk. For example, if the
>> power is pulled from the computer while a segment is open and before
>> the first page in the segment has been committed. In this case it is
>> common for a journaled filesystem to preserve the fact that the file
>> was created, but to lose the contents of the file, so the file now
>> appears to be zeroed out. I have observed this behaviour on Linux
>> systems running ext4, for example.
>
>> Some possible solutions:
>>
>> If, at this point, we have opened a file and not created it, why wait
>> for an UninitializedSegment to change state? If the segment is
>> uninitialized here, then simply throw an error. Make it the caller's
>> responsibility to ensure the segment is created/initialized before it
>> is opened in read-only mode.
>
> The reason is to support simultaneous open and create, as you
> indicate below.
>
>> Perhaps, if you want to allow multiple processes to do open and create
>> simultaneously without any additional synchronization mechanism, you
>> could accomplish that by adding a count of open mappings into the
>> shared segment. If the reference count is 1 at this point, don't
>> attempt this spinlock, because the state of the file is never going to
>> change. In this case throw if the *patomic_word is != InitializedSegment.
>
> A count does not work, because if a process dies, then you have a
> wrong count. If you need to commit the first page to avoid power
> errors, call flush() just after creating the managed segment.

Understood. I have been using flush() to commit managed file segments
to disk, and indeed that does work fine. The problem comes when the
crash occurs between creating the file and calling flush(). I could
move the flush() earlier in the process, though, to reduce the chance
of this situation happening. But even if this allows the open to
proceed, how much can I tell about the file, since changes were made
after the flush? If, for example, I wanted to set a dirty flag within
the segment itself, wouldn't I run the risk of the allocation
structures within the segment being corrupt, leaving me unable to find
the offset of my flag?
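
Concretely, moving the flush() up would look roughly like this (a
minimal sketch; the file name and size are placeholders):

#include <boost/interprocess/managed_mapped_file.hpp>

namespace bip = boost::interprocess;

int main()
{
   // Placeholder name and size for this sketch.
   bip::managed_mapped_file segment(bip::open_or_create, "mydata.bin", 64 * 1024);

   // Flush immediately so the initialized header (the first page) reaches
   // disk as early as possible, narrowing the window in which a power loss
   // leaves a created-but-zeroed file behind.
   segment.flush();

   // ... construct objects in the segment, flushing again at safe points ...
   return 0;
}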

>
> Anyway, trying to use a mapped file after a hard shut down has no
> sensible recovery, you don't know which parts of the file the OS has
> committed, the internal data structure might be absolutely corrupted.
>

Indeed, I do not want to use the corrupted file at all, but I have no
way to tell whether the file is corrupted or OK. If I try to open the
segment read-only and examine it, I get stuck in the loop with no way
to detect the failure. This is the problem I am seeking a solution to.
From looking at the code, it appears that if, for whatever reason, the
first 32 bits of the file are 0 and the file is opened read-only, then
I am stuck.

I was able to solve the issue for my purposes with this change:

diff -r boostb/interprocess/detail/managed_open_or_create_impl.hpp boosta/interprocess/detail/managed_open_or_create_impl.hpp
353c353,358
<          while(value == InitializingSegment || value == UninitializedSegment){
---
>          if (value == UninitializedSegment)
>          {
>             throw interprocess_exception(error_info(corrupted_error));
>          }
>
>          while(value == InitializingSegment){

But, as you can see, if the user intends to use open and create
simultaneously as a synchronization mechanism, it will fail. This is
OK for me because I already have synchronization elsewhere in my code
that prevents that scenario.
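
With that change in place, my calling code can treat the zeroed file as
an ordinary error instead of hanging (a sketch of the handling; the file
name is a placeholder):

#include <boost/interprocess/managed_mapped_file.hpp>
#include <boost/interprocess/exceptions.hpp>
#include <boost/interprocess/errors.hpp>
#include <iostream>

namespace bip = boost::interprocess;

int main()
{
   try {
      bip::managed_mapped_file segment(bip::open_read_only, "mydata.bin");
      // ... examine the segment ...
   }
   catch(const bip::interprocess_exception &e) {
      if(e.get_error_code() == bip::corrupted_error) {
         // The header was still zeroed: discard or rebuild the file rather
         // than spinning forever.
         std::cerr << "segment file is corrupted\n";
      }
      else {
         throw;
      }
   }
   return 0;
}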

Perhaps, rather than spinning indefinitely, there could be a timeout or
other limit on how long the open function will wait for the file to
become initialized? I assume, from the fact that you chose a spinlock,
that you didn't intend for the user to wait indefinitely.
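
Something along these lines is what I had in mind (only a sketch of the
idea, not the library's actual code; read_header_word() and
wait_a_little() stand in for the atomic read and the yield/sleep that
managed_open_or_create_impl.hpp already performs, and max_polls and the
enum values are made up for illustration):

#include <boost/cstdint.hpp>
#include <boost/interprocess/exceptions.hpp>
#include <boost/interprocess/errors.hpp>

// Illustrative values only; the real enum lives inside managed_open_or_create_impl.
enum { UninitializedSegment = 0, InitializingSegment = 1, InitializedSegment = 2 };

boost::uint32_t read_header_word();   // hypothetical: atomic read of the first 32 bits
void wait_a_little();                 // hypothetical: yield / short sleep

void wait_for_initialization()
{
   const unsigned max_polls = 100000;   // hypothetical bound
   for(unsigned i = 0; i != max_polls; ++i){
      const boost::uint32_t value = read_header_word();
      if(value != InitializingSegment && value != UninitializedSegment)
         return;                        // the creator finished, carry on
      wait_a_little();
   }
   // Nobody initialized the segment within the limit: report it instead of
   // spinning forever.
   throw boost::interprocess::interprocess_exception(
      boost::interprocess::error_info(boost::interprocess::corrupted_error));
}
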
KEVIN ARUNSKI
