|
Boost : |
Subject: [boost] [AFIO] Formal review
From: Jeremy Maitin-Shepard (jeremy_at_[hidden])
Date: 2015-08-31 01:26:12
> 1. Should Boost.AFIO be accepted into Boost? Please state all conditions for
> acceptance explicity.
In its current state, Boost.AFIO should NOT be accepted into Boost. However, I
am confident that Niall will be able to deliver a future version of AFIO that
should be accepted into Boost after another review. There is a great need for
more powerful and efficient file handling functionality in C++ than is provided
by iostreams and Boost.Filesystem/Filesystem TS. I appreciate the enormous
complexity of trying to define a multi-platform filesystem interface, and it is
clear that Niall has done a ton of very valuable research on this topic.
> 2. What is your evaluation of the design?
A key strength of the design is that it provides a common interface for
asynchronous operations regardless of whether they are internally implemented
using Windows IOCP support, async IO on FreeBSD, or regular POSIX operations via
a thread pool.
Another key strength is that it exposes handle-based (i.e. *at), rather than
path-based APIs where possible in order to minimize race conditions, which tend
to be difficult or impossible to avoid completely with file system operations.
I do have a number of concerns about several aspects of the design, however:
Batch API:
- It is not clear what, if any, purpose the batch API serves. The significant
added complexity of using it would have to be justified by a significant
performance advantage, but it is not at all clear that there is even any
performance advantage at all.
- I suggest removing it unless there is a strong argument otherwise.
File handle API:
- The file handle abstraction in AFIO is very heavy weight, involving a
shared_ptr to a structure containing a whole bunch of fields in addition to
the native file handle, including a mutex, numerous flags, the path used to
open it, a pointer to a handle to the parent directory, a chrono time_point
specifying the time the file was opened, the total number of bytes read,
written, and written since last fsync using this file handle.
- Because file handles are at the core of AFIO, it is critical that users
precisely understand what this abstraction involves, and specifically how it
differs from a native file handle.
- I'd be much more comfortable with a much leaner abstraction, like a
unique_handle type that just contains the bare native handle, and uses unique
ownership semantics. The fat handles impose both computational/memory
overhead as well as mental overhead on the programmer. I understand these
extra fields are probably necessary to provide various functionality,
particularly emulation of behavior provided by other operating systems, but it
would be nicer to be able to opt-in to such functionality and only pay for it
when needed.
- At the very least, if these fat file handles are kept, a very strong argument
needs to be given as a rationale. (Perhaps you have found the overhead to be
negligible relative to the cost of the native file handle.)
- It should be possible to construct an AFIO handle from a native file
descriptor.
- On POSIX, AFIO should support supplying platform-specific open flags, like
O_NOATIME.
- I'd much prefer native_handle() to return the actual appropriate native handle
type for the platform, e.g. int on POSIX, rather than a void *. I'm guessing
this is done in order to support alternative dispatchers, that aren't
necessarily based on OS files at all, in the future, but perhaps a better way
of doing this that doesn't require casting the void* to an int, can be found.
Future-based API:
- I agree with Thomas Heller regarding the Future-based API:
- The implicit future.then built into the API should be eliminated. Instead,
API functions should take a file handle directly.
- future<T> should not implicitly store a file handle. The API functions that
open a file can return a future to a file_handle. API functions that don't
open a new file should just return a future to the actual result.
Inconsistency in how operations are defined:
- Some operations are methods of the dispatcher, some are methods of the file
handle, and others are free functions.
- I suggest that all file system operations be free functions, and take a
dispatcher as an argument.
Thread-local current dispatcher:
- Some operations involve an explicitly specified dispatcher, others seem to
obtain it from the file handle, and others default to a thread-local
dispatcher that has been set.
- Given that it is extremely likely for user code to execute from multiple
threads given the nature of AFIO, relying on a thread-local current dispatcher
seems very error prone.
- I suggest that the thread-local current dispatcher be eliminated, and that the
dispatcher always be specified explicitly to any API call that requires one.
Directory handle cache:
- A cache of open handles to directories is kept by AFIO by default. This is
particularly problematic because (as is documented) it makes reading directory
entries thread-unsafe by default. If this is to be kept, a very strong
rationale needs to be given, and the behavior must be changed somehow to
ensure that directory reading is reliable by default.
Enumerate API:
- The future-based API of async_enumerate seems to be unnecessarily complicated,
due to the need to specify maxitems and deal with making repeated calls, and
all of the copying and memory allocation required to return a std::vector of
directory_entry. I suggest instead something like passing in a
std::function<bool (expected<directory_entry> const &next)> callback (possibly
replacing expected by the new name for your monad type). An empty next value
means end of directory has been reached. Returning true means continue
listing the directory, returning false means stop listing the directory.
Miscellaneous comments:
- Operations including rmdir, rmfile, symlink, rmsymlink (as well as async
variants) that normally would not necessarily even operate on file handles,
actually return a handle_ptr. Perhaps if on some platforms a handle has to be
opened, then there could be a way for the user to specify that a handle is
desired. Otherwise, it would make sense to just return void or
std::future<void> equivalent for async variants.
- The documentation notes that on MS Windows, DOS short names are not supported.
I don't have much familiarity with Windows, but if programs generally accept
DOS short names when a filename is required, then AFIO should probably support
these names as well.
- The documentation notes: "Don't do directory enumerations from an x86 binary
on a x64 Windows system. This is due to a bug in Microsoft's WOW64 syscall
translation layer which Microsoft have decided is wontfix. " Given that a
major purpose of AFIO is to provide a portable abstraction that works around
OS limitations, AFIO needs to support directory enumeration even in this case
(e.g. by detecting WOW64 and using a synchronous call via a threadpool in that
case).
- It is noted that to do an asynchronous stat you have to use enumerate with an
appropriate glob specified, but constructing such a glob may be non-trivial if
the path contains characters that are interpreted specially by the glob
syntax, and impossible to do in a race-free manner.
- As a general matter, it is important to provide leanest possible API that
offers the most flexibility to users, in order to avoid the case that users
choose not to use the library despite it offering a lot of functionality that
is useful to them, because of excessive overhead or insufficiently powerful
interfaces. I often forgo use of Boost Filesystem and C++ iostreams for this
reason.
3. What is your evaluation of the implementation?
- I understand that Niall already intends to rewrite much of the implementation,
making much of the current implementation irrelevant.
- As has been noted by other reviewers, there are a bunch of #if 0 blocks, many
undocumented. Some seem to contain work in progress, which might better be
placed in a separate branch. Some contain debugging code, in which case it
might be better to replace #if 0 with #ifdef BOOST_AFIO_DEBUG_XXX where XXX
specifies the particular feature being debugged. That way it is
self-documenting. Some seem to contain alternate implementations. Ideally
these would also be guarded by a descriptive macro, and a unit test case be
defined that tests it as well to make sure it is still meaningful. Otherwise
#if 0 code tends to bitrot, which admittedly is not such a great problem if it
is just for debugging anyway, but I'd say it is less than optimal.
- The implementation might be better split up into multiple independent files.
There are various independent facilities, like secded_ecc, that could be
placed in a separate file.
- Although this tends to be a problem with most Boost libraries, it would be
nice if the compile times could be reduced, particularly when not in
header-only mode. Some libraries like Boost Spirit have an interface
inherently based on template metaprogramming, such that long compile times are
pretty much unavoidable, but the Boost AFIO interface seems to be mostly
non-template based, which suggests that it may be possible to get pretty good
compile times.
> 4. What is your evaluation of the documentation?
- The extensive documentation about race guarantees is extremely useful.
- I appreciate the in-depth examples.
- Much more design rationale is needed.
- The term "filing system" is used in place of the more common "file system",
leading me to wonder if there is a difference in meaning.
- Every instance where there is a platform-specific #ifdef in an example should
be evaluated, so that you can figure out the appropriate platform-independent
abstraction that could be added to AFIO in order to move that #ifdef into the
library and out of user code.
- There is an attempt to document differences in behavior on different
platforms, but this is not done sufficiently. For example, it is stated that
async_write requires that the destination region does not go off the end of
the file, but on POSIX pwrite supports this just fine as a way to extend the
file. I assume that this restriction applies to Windows only.
- The semantics of the Batch API are not precisely documented, nor is any
rationale given for it. For instance, I understand that the operations
themselves cannot happen atomically, since that is not supported by operating
systems, though this is not clearly stated. It is stated: "Once a batch of
input ops has been verified at the point of entry as not errored, you are
guaranteed that the batch is atomically scheduled as a whole, unless a failure
to allocate memory occurs." Does this means that if the *_req object involves
futures, it waits for all of them to be ready before launching any of the
operations in the batch? Is that the purpose of the Batch API, to do a
wait_all on all of the relevant futures, then to launch each operation? This
issue goes away though if the Batch API is eliminated, as seems to make sense.
- It would be useful to document for each operation which underlying system
calls it invokes on each platform.
- Many operations other than open (e.g. rmfile) take file_flags, but it is not
specified which flags are supported.
- References to historical versions of AFIO (prior to the review) should be
removed from the main parts of the documentation, since they only distract
from the useful information. This information could be kept in a different
section at the end for the extra curious.
- Formatting of the "description" in the Boost Quickbook format function
documentation has some problems: the lack of line breaks between the general
description, other notes, and the common note about the use of NT kernel paths
make it difficult to read.
> 5. What is your evaluation of the potential usefulness of the library?
The library is potentially extremely useful. C++ and Boost iostreams and
Boost/TS Filesystem provide only very limited functionality. The C library
libuv is a close competitor but seems to offer less functionality and a less
convenient API. For achieving similar functionality, the status quo in C++ is
to rely on platform-specific C APIs that are much more cumbersome to use.
> 6. Did you try to use the library? With what compiler? Did you have
any problems?
I tried some very simple usage of the library with GCC 5.2.0 on Linux. There
were a few warnings (variable set but not used) with -Wall that would be best
eliminated.
- The error messages are not ideal: for instance, in the case that there was an
error opening a file, this is the result:
Dynamic exception type: boost::system::system_error std::exception::what: File
'/home/jbms/tmp/afio-test/test.txt' not found [Host OS Error: No such file or
directory (2) in
'/home/jbms/tmp/boost.afio/include/boost/afio/v2/detail/impl/afio.ipp':
async_io_handle_posix:805]: No such file or directory. Op was scheduled at:
1. 0x42fb4f: ./test (+0x2fb4f) : No such file or directory
Some of this is only useful when debugging AFIO, some only useful when debugging
the application, and there is a lot of redundancy, for instance the actual error
"No such file or directory" is shown in 3 places. Ideally there would be a way
to get a simple message along the lines of "Error opening file
'/home/jbms/tmp/afio-test/test.txt': No such file or directory" that could be
shown directly to the user.
I tried to #define BOOST_AFIO_EXCEPTION_DISABLESOURCEINFO to see if that would
make the error message better, but instead that seemed to trigger a bug:
Dynamic exception type: std::logic_error
std::exception::what: basic_string::_S_construct null not valid
- A separate issue is that attempting to read a range that is not present in a
file, something that the library should handle elegantly, leads to the program
aborting immediately with an error message, rather than allowing the error to
be handled:
FATAL EXCEPTION: Failed to read all buffers
1. 0x40f0f4: ./test (+0xf0f4) terminate called after throwing an
instance of 'std::runtime_error'
what(): Failed to read all buffers
zsh: abort (core dumped) ./test
> 7. How much effort did you put into your evaluation? A glance? A quick
reading? In-depth study?
I gave the library a fairly in-depth study, spending at least 8 hours in total.
I looked at the documentation fairly thoroughly, looked through parts of the
implementation, and tried running a couple small programs that use the library.
> 8. Are you knowledgeable about the problem domain?
I have a fair amount of experience with asynchronous programming in C++, and I
have pretty extensive knowledge of the file system APIs on Linux.
As a final note, I'd like to thank Niall for all of the work he has done to
get the library this far. A powerful, elegant, cross-platform interface for
file system operations is very much needed in C++, I really do want to see
(a future version of) this library in Boost and use it myself.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk