From: Jeremy Maitin-Shepard (jbms_at_[hidden])
Date: 2003-12-01 14:44:51


Beman Dawes <bdawes_at_[hidden]> writes:

>> bool equivalent(path const &lhs, path const &rhs);
>>
>> Tests if the two arguments refer to the same file. This is useful even
>> outside the context of hard links, since it avoids the problem of
>> having to compare paths syntactically. Throws if either of the paths
>> does not exist or is inaccessible.

> My understanding, based on posts to microsoft.public.win32.programmer.kernel, is
> that there is no way on Windows to implement this function such that it will
> work reliably. The dwVolumeSerialNumber/nFileIndexHigh/nFileIndexLow technique
> you use was specifically mentioned as unreliable. I forget the exact case, but
> it was something like network files, non-NTFS file systems, or perhaps the
> combination of both.

> I'd love to be proved wrong about that. If you have your code running on
> Windows, could you try it on, say, a floppy or CD-ROM across a network
> share? I'm concerned that one or more of the dwVolumeSerialNumber,
> nFileIndexHigh, or nFileIndexLow members aren't filled in properly.

This function is undeniably useful, and so the fact that it does not
work in some obscure cases on a single platform does not seem like
sufficient reason not to provide it. (On the contrary, this seems like
something that should be fixed in Windows NT.) On FAT filesystems or
other Windows filesystems that do not support hard links, this function
can be implemented simply by comparing the normalized path names. If
necessary, extensive filesystem-type checks could be added to the
Windows implementation in order to work around these few cases.
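
For comparison, on POSIX systems a test like this is commonly written by
comparing the device and inode numbers that stat() reports. The following
is only a minimal sketch under that assumption, not the code from the
attached patches; the names same_file and stat_or_throw, and the use of
std::string in place of path, are illustrative.

```cpp
#include <sys/stat.h>

#include <stdexcept>
#include <string>

// Hypothetical helper: stat a path, or throw if it is missing or inaccessible.
static struct stat stat_or_throw(const std::string& p) {
    struct stat st;
    if (::stat(p.c_str(), &st) != 0)
        throw std::runtime_error("cannot stat " + p);
    return st;
}

// True if both paths resolve to the same file (same device and same inode).
bool same_file(const std::string& lhs, const std::string& rhs) {
    const struct stat a = stat_or_throw(lhs);
    const struct stat b = stat_or_throw(rhs);
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}
```

With this sketch, same_file("/", "/.") is true even though the two
spellings differ, which is the point about not comparing paths
syntactically.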

>> unsigned int link_count(path const &ph);
>>
>> Returns the number of hard links to the file specified by the
>> argument. (On platforms that do not support hard linking, this always
>> returns 1.) Throws if the path does not exist or is inaccessible.

> What are some use cases for this function? Do they degrade gracefully on
> platforms which don't support hard linking? Do they degrade gracefully on
> platforms which support hard linking on some file systems and not
> others?

The use case is programs like tar or du -- a typical use is to store a
unique identifier in a hash table for each file with a link count
greater than 1, so that the file is only backed up (in the case of tar)
or counted (in the case of du) once.

According to Microsoft's documentation, my implementation of this
function should be portable to all 32-bit versions of Windows. On
filesystems that do not support hard links, it always returns 1.
(Clearly, it works on POSIX platforms.)

>> void link(path const &existing_path, path const &new_path);
>>
>> Creates a hard link from `new_path' to the file referred to by
>> `existing_path'. Throws if `existing_path' does not exist, if
>> `new_path' does exist, or if the operation fails for any of a variety
>> of reasons.

> There are many platforms and file systems which don't support hard
> linking. Windows doesn't ever support hard links for directories, IIRC, and
> even POSIX doesn't require support, again IIRC. Thus I've got serious concerns
> about supplying something that is very likely to not be portable. I considered
> limiting the hard links to files (and not directories). But it seems a tricky
> feature with only mildly compelling uses.

On all platforms, hard linking a particular file might not be supported
for any number of reasons, such as the filesystem underlying that
particular file not supporting hard links, or the existing path and the
new path being on different filesystems. But these limitations are
to be expected. Omitting functionality like hard linking, or imposing
unnecessary restrictions (such as prohibiting directory hard links on
platforms that support them), would only serve to limit the adoption of
boost.filesystem. It certainly would be necessary to make additions to
the boost.filesystem exception system in order for it to deal with
hard-link-related errors.

>> Note that the code in the attached patches has not been tested on
>> Windows, and due to the use of CreateHardLink, it may be the case that
>> it does not compile or run on versions of Windows older than Windows
>> 2000.

> Or file systems other than NTFS even on modern Windows?

I was specifically referring to the issue that the function
CreateHardLink only exists in Windows 2000 or later. It is possible
to implement link in terms of the function BackupWrite, which exists on
all versions of Windows NT. Someone actually posted such an
implementation on the boost mailing list for possible use in
boost.build (I am not sure exactly why). You can find the message at:
http://lists.boost.org/MailArchives/boost/msg47649.php

I simply did not bother to use such an implementation in this
proof-of-concept code. There is still the issue of BackupWrite not
being available on Windows 9x; I believe the solution is to dynamically
request the function using LoadLibrary and GetProcAddress.
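
The dynamic-lookup idea can be sketched as follows. To keep the example
short it looks up CreateHardLinkA rather than BackupWrite, so it shows the
LoadLibrary/GetProcAddress technique and not the BackupWrite-based
implementation from the message linked above. The function name
try_hard_link is hypothetical, the Windows branch is untested, and the
non-Windows branch simply calls link() so that the sketch compiles
elsewhere.

```cpp
#ifdef _WIN32
#include <windows.h>
#else
#include <unistd.h>
#endif

#include <string>

// Returns true on success, false if the call failed or the API is missing.
bool try_hard_link(const std::string& existing_path,
                   const std::string& new_path) {
#ifdef _WIN32
    typedef BOOL (WINAPI *CreateHardLinkA_t)(LPCSTR, LPCSTR,
                                             LPSECURITY_ATTRIBUTES);
    HMODULE kernel32 = LoadLibraryA("kernel32.dll");
    if (!kernel32)
        return false;
    // Resolve the entry point at run time, so the program still loads on
    // Windows versions whose kernel32.dll does not export it.
    CreateHardLinkA_t fn = reinterpret_cast<CreateHardLinkA_t>(
        GetProcAddress(kernel32, "CreateHardLinkA"));
    // Note the argument order: the new name comes first.
    bool ok = fn && fn(new_path.c_str(), existing_path.c_str(), NULL) != 0;
    FreeLibrary(kernel32);
    return ok;
#else
    return ::link(existing_path.c_str(), new_path.c_str()) == 0;
#endif
}
```

A real implementation would presumably report "not supported" separately
from an ordinary failure instead of folding both into false.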

>> It would be convenient if there were a way to provide a unique
>> identifier for a file, for use in an associative container.
>> Unfortunately, although Windows does provide a 64-bit identifier for a
>> file, the documentation indicates that these identifiers are only valid
>> while there is an open handle to the file. I suppose it would be
>> possible to keep an open handle to the file as part of the identifier,
>> but such a system seems highly resource intensive.

> While it would be nice, I don't think such a feature would be worthwhile unless
> it worked portably across a very wide spectrum of operating systems and file
> systems.

I am not sure it would be wise to provide such unique identifier
support if the Windows implementation required keeping an open file
handle. Nonetheless, it does seem this is a viable implementation, and
on platforms that do not support hard links, the normalized path name
can be used as the unique identifier. Since POSIX platforms do provide
usable unique identifiers, this covers a large number of platforms.

> If we had some way to force operating systems and file systems to provide the
> features we would like, then Boost.Filesystem would quickly acquire a long list
> of additional functionality. But in the real world that isn't possible. The
> practical threshold in my mind is POSIX + Windows. If some potentially useful
> feature can be implemented such that its behavior is portable to those two
> operating systems and their common file systems, I'm willing to add it. But if
> portable behavior between POSIX and Windows isn't possible, I think such
> functionality is highly questionable unless there is a safe fall-back
> implementation.

Okay.

-- 
Jeremy Maitin-Shepard
