Subject: Re: [boost] Windows Develop Tests Failing
From: Niall Douglas (s_sourceforge_at_[hidden])
Date: 2015-03-14 21:39:14


On 14 Mar 2015 at 13:11, Beman Dawes wrote:

> On Fri, Mar 13, 2015 at 8:26 PM, Niall Douglas
> <s_sourceforge_at_[hidden]> wrote:
> ... deleting a directory tree
> > which will randomly fail on Windows if you do it too quickly :).
>
> I've run into that several times, but only when TortoiseGit, or more
> precisely its cache, is running or was running very recently. I
> wouldn't be surprised if similar cache programs exhibit the same
> behavior on any operating system that prevents deleting a directory
> tree while another program has any of the directories still open.

The cause of this is actually very interesting, or maybe it is just
to people interested in filing systems like you and me. Anyway, time
to bore the list ...

NT inherited from VMS the notion of pending deletion rather than
immediate deletion, so when you delete a file you don't actually
delete it, you merely tag it as likely to be deleted at some future
point. Here's how DeleteFile/RemoveDirectory works internally:

1. Open the file/directory as a HANDLE with DELETE privs.

2. Tell the kernel to set the PendingDelete boolean using
NtSetInformationFile.

3. Close the handle.
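
For anyone who wants to poke at this, the three steps above can be
approximated from user mode roughly as follows. This is only a sketch,
not what DeleteFile literally does: it goes through the documented
Win32 call SetFileInformationByHandle with FileDispositionInfo rather
than the NT kernel API, the helper name mark_for_deletion is made up
for illustration, and the non-Windows branch is merely a stand-in so
the snippet compiles elsewhere.

```cpp
#include <filesystem>
#include <system_error>
#ifdef _WIN32
#  include <windows.h>
#endif

namespace fs = std::filesystem;

// Hypothetical helper (not a Boost or AFIO API): ask for `p` to be deleted
// using the open / set-disposition / close sequence described above.
// Returns true if the request was accepted.
inline bool mark_for_deletion(const fs::path& p) {
#ifdef _WIN32
    // 1. Open a handle asking only for DELETE access.
    //    FILE_FLAG_BACKUP_SEMANTICS lets the same call open directories;
    //    FILE_FLAG_OPEN_REPARSE_POINT targets a link itself, not its target.
    HANDLE h = ::CreateFileW(
        p.c_str(), DELETE,
        FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
        nullptr, OPEN_EXISTING,
        FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OPEN_REPARSE_POINT, nullptr);
    if (h == INVALID_HANDLE_VALUE) return false;

    // 2. Set the delete disposition on the handle. The struct has a single
    //    BOOLEAN member, so aggregate initialisation avoids naming it.
    FILE_DISPOSITION_INFO info{TRUE};
    const BOOL ok = ::SetFileInformationByHandle(
        h, FileDispositionInfo, &info, sizeof(info));

    // 3. Close the handle. The entry only goes away once every other
    //    handle to it has been closed as well.
    ::CloseHandle(h);
    return ok != FALSE;
#else
    // Stand-in for non-Windows builds: POSIX unlinks the name immediately.
    std::error_code ec;
    return fs::remove(p, ec);
#endif
}
```

Calling mark_for_deletion(some_path) returning true only means the
delete request was accepted, which is exactly why the rest of this
post matters.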

The PendingDelete flag being set has two consequences. Firstly, all
new file handle opens with read or write privs will now fail with
STATUS_DELETE_PENDING (ACCESS_DENIED in Win32), though you can still
open a new handle if you ask for no privs. Existing handles are
unaffected. Secondly, once the last open handle is closed and the
reference count hits zero, the PendingDelete flag causes the
following to happen:

1. Mark as hidden the file name in its directory. It will no longer
appear in directory enumerations, but attempting to create a file
with the same name will return a STATUS_DELETE_PENDING error with no
apparent cause.

2. Secure erasing now occurs, which on CIA/NSA editions of Windows
means multiple scrubs of the file contents, for each of the named
streams attached to the file entry.

3. Actual deallocation of the inode and extents containing data now
occurs. These get scheduled to be flushed to the disc as soon as
possible.

4. Once the extents deallocation hits the journal, only now is the
file name actually deleted from the directory, and a new file with
the same name can be created. On marking the entry as empty, Windows
again flushes the directory to physical storage (i.e. an fsync). Note
that on a busy hard drive it can take milliseconds for the extents
deallocation to reach physical storage.

Note that so long as the file name remains in the directory, even if
hidden, you *cannot* delete that directory because the directory is
not empty, even if it is indistinguishable from being empty.

You can at this stage see how many ways a directory tree delete can
fail. Firstly, any program holding open a handle to any file or
directory in the tree will prevent deletion occurring, so the
containing directories are not empty and therefore cannot be removed.
As you mentioned, TortoiseGit is a devil for that, but so are virus
checkers or anything else which opens file handles.

Secondly, if you try to delete a directory tree too quickly - which
AFIO usually does because it will parallelise deletes on all CPU
cores - you get caught by files taking up to a millisecond in the
"file entry hidden but not actually deleted" stage which stops the
directory being deleted. Not being able to delete a directory means
nothing higher up the tree can be deleted either. It's a big pain.
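
To make the knock-on effect concrete, here is a minimal sketch of a
bottom-up tree delete in portable C++17. It is not AFIO's code; the
helper name remove_tree_bottom_up is hypothetical. It records every
path whose removal failed, so a single stuck entry shows up together
with all of its ancestors.

```cpp
#include <filesystem>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: delete `root` children-first and return every path
// that could not be removed.
inline std::vector<fs::path> remove_tree_bottom_up(const fs::path& root) {
    std::vector<fs::path> failed;
    std::error_code ec;

    // symlink_status() does not follow links, so a link to a directory is
    // removed as a link rather than descended into.
    if (fs::is_directory(fs::symlink_status(root, ec))) {
        // Snapshot the children before deleting anything.
        std::vector<fs::path> children;
        for (fs::directory_iterator it(root, ec), end; !ec && it != end;
             it.increment(ec)) {
            children.push_back(it->path());
        }
        for (const auto& child : children) {
            auto sub = remove_tree_bottom_up(child);
            failed.insert(failed.end(), sub.begin(), sub.end());
        }
    }

    // remove() only succeeds on a directory that is empty. If any child is
    // still present, this fails, and so does every ancestor's remove().
    ec.clear();
    if (!fs::remove(root, ec) && ec) failed.push_back(root);
    return failed;
}
```

If the race described above is hit, I would expect the returned list
to contain directories whose children all reported success, though I
have not measured how often that happens with this particular code.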

There is also a big problem with these semantics and lock files. If
many threads are creating and deleting a lock file quickly, much of
the time you get back access denied errors due to the zombie "being
deleted" stage rather than more obvious errors like "file already
exists". You also get enormously lowered lock file performance such
that Windows looks very slow compared to Linux.

All of the above plague AFIO's unit testing on Windows because code
which works perfectly on POSIX will fail in all sorts of random ways
on Windows, and AFIO does a lot of heavy filing system stress
testing. So, in the v1.3 release (any day now, everything is finished
except for fixing the last of the filing system races as AFIO now has
a "race free" mode) I've added the following workarounds:

1. When tagging an item for deletion, first rename it to a 128 bit
crypto random name in hex. DELETE privs also allow renaming, so this
allows a new file with the original name to be created immediately.
Performance with this workaround alone is about 20x faster, and
suddenly NTFS looks competitive with ext4.

2. I'm shortly about to improve workaround 1 by renaming the
about-to-be-deleted item to live somewhere else on the same volume,
not in its original directory, still with its crypto random name. I have yet
to write logic to figure out some suitable other location on the same
mounted volume, but it should be easy enough. This improved
workaround stops pending file deletion getting in the way of deleting
directory trees, and should make Windows filing system semantics
identical to POSIX [1][2].
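
The two workarounds above can be sketched in portable C++17 along
these lines. This is not AFIO's implementation: the names
random_hex_name and remove_via_rename are hypothetical, the
"graveyard" directory is an assumption standing in for the
not-yet-written logic that picks another location on the same volume,
and std::random_device is only a stand-in for a real crypto random
source (its quality is implementation-defined).

```cpp
#include <cstdio>
#include <filesystem>
#include <random>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// 128 bits rendered as 32 hex digits, built from four 32 bit draws.
inline std::string random_hex_name() {
    std::random_device rd;  // assumption: good enough for a sketch
    std::string name;
    char buf[9];
    for (int i = 0; i < 4; ++i) {
        std::snprintf(buf, sizeof buf, "%08x",
                      static_cast<unsigned>(rd() & 0xffffffffu));
        name += buf;
    }
    return name;
}

// Rename first, then delete. With an empty `graveyard` the item is renamed
// within its own directory (workaround 1); otherwise it is moved into
// `graveyard`, which must be on the same volume (workaround 2).
inline bool remove_via_rename(const fs::path& p,
                              const fs::path& graveyard = {}) {
    const fs::path dir = graveyard.empty() ? p.parent_path() : graveyard;
    const fs::path tmp = dir / random_hex_name();

    std::error_code ec;
    fs::rename(p, tmp, ec);  // the original name is free from here on
    if (ec) return false;

    // Any pending-delete delay now applies to `tmp`, not to the old name.
    return fs::remove(tmp, ec) && !ec;
}
```

Usage is remove_via_rename(path) for workaround 1, or
remove_via_rename(path, graveyard_dir) for workaround 2. Note that
with workaround 1 the renamed entry still sits in the original
directory until it is really gone, which is why only workaround 2
helps with deleting directory trees.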

[1]: Well, apart from renaming. Windows does not permit renaming a
directory containing an open file handle, so some future AFIO version
may depth rename all the contents of a directory to the temp
location, rename the directory, and then rename all the contents back
in again. Yes this is completely daft. The only good news is that
renames when using the NT kernel API are amazingly quick, and atomic,
because metadata flushes of the containing directories don't appear
to occur until the handle is closed. As the NT kernel API requires
renames to open a handle to the item, you simply hold open the item
during the switch out/switch in.

[2]: The only other semantic difference is in symbolic link
traversal. Windows doesn't have the same semantics as POSIX, period.
I can't do much about that. Most of the time no one will notice,
however.

Okay, boring the list is over! Hopefully something in the above might
be useful in helping Boost.Filesystem work around NT idiosyncrasies.
As much as they appear to be a pain, there is a logic to them, and
judicious use of renaming can allow a reasonable emulation of POSIX
semantics.

Niall

-- 
ned Productions Limited Consulting
http://www.nedproductions.biz/ 
http://ie.linkedin.com/in/nialldouglas/


