Boost logo

Boost :

Subject: Re: [boost] [AFIO] Formal review
From: Andrey Semashev (andrey.semashev_at_[hidden])
Date: 2015-08-31 13:39:20


On 31.08.2015 19:18, Niall Douglas wrote:
> On 31 Aug 2015 at 5:26, Jeremy Maitin-Shepard wrote:
>
>> - At the very least, if these fat file handles are kept, a very strong argument
>> needs to be given as a rationale. (Perhaps you have found the overhead to be
>> negligible relative to the cost of the native file handle.)
>
> I really feel the need to strenuously disagree with the notion that
> AFIO has a "fat" handle. People have no comprehension of the cost of
> the open()/CreateFile() system call relative to a shared_ptr or any
> of the other things afio::handle does, and they are fixated on
> judging AFIO's design choice completely inappropriately as a result.
>
> To show what I mean, here is a gist demonstrating how lightweight
> AFIO's handle is relatively speaking:
>
> https://gist.github.com/ned14/affa007440f0568e4aa6
>
> This test simply compares opening and closing a file with the same
> thing plus a shared_ptr creation and destruction.
>
> On Linux: +4.76%
> On Windows: +1.8%.
>
> Before people go "aha!", remember that afio::handle has proper error
> handling and has to configure a number of other items, including
> asking the OS for what its "real" path is and other housekeeping.

Why does the library do that when not asked? Why not collect this data
only when the user requests?

> When I said that the overhead of shared_ptr was about 0.5%, I meant
> it. I benchmarked it before taking the decision.

I think others have pointed that performance is not the only issue with
shared_ptrs.

>> - I'd much prefer native_handle() to return the actual appropriate native handle
>> type for the platform, e.g. int on POSIX, rather than a void *. I'm guessing
>> this is done in order to support alternative dispatchers, that aren't
>> necessarily based on OS files at all, in the future, but perhaps a better way
>> of doing this that doesn't require casting the void* to an int, can be found.
>
> It's more ABI stability.

Stability with respect to what? Do you expect native handles to be
somehow different on the same platform? Or are you trying to maintain
stable ABI across different platforms? Why on earth would you do that?

> Casting void * to int is fine.

No, it's not. It's a hack that you impose on the library users.
According to [expr.reinterpret.cast]/4 it's not even required to
compile, as int may not be large enough to hold a pointer.

>> Directory handle cache:
>>
>> - A cache of open handles to directories is kept by AFIO by default. This is
>> particularly problematic because (as is documented) it makes reading directory
>> entries thread-unsafe by default. If this is to be kept, a very strong
>> rationale needs to be given, and the behavior must be changed somehow to
>> ensure that directory reading is reliable by default.
>
> The documentation explains exactly why there is this cache, it's a
> big win despite the complexity it introduces. The problem of races on
> directory enumeration on shared directory handles is eliminated if
> you make the number of entries enumerated in a single shot bigger
> than the total. In practice, this is not a problem, even on 10M item
> directories, just throw max_items at the problem.

Will this result in the process having 10M open handles permanently?

>> Enumerate API:
>>
>> - The future-based API of async_enumerate seems to be unnecessarily complicated,
>> due to the need to specify maxitems and deal with making repeated calls, and
>> all of the copying and memory allocation required to return a std::vector of
>> directory_entry.
>
> If you want race free snapshots (up to the platform specific
> guarantees), simply make sure maxitems far exceeds any potential
> contents.
>
> std::vector does not copy memory on C++ 11 when returned by value.

Correction: when returned with a move constructor or RVO.

>> - There is an attempt to document differences in behavior on different
>> platforms, but this is not done sufficiently. For example, it is stated that
>> async_write requires that the destination region does not go off the end of
>> the file, but on POSIX pwrite supports this just fine as a way to extend the
>> file. I assume that this restriction applies to Windows only.
>
> It's more that pwrite off the end of the file is inherently racy to
> concurrent read/writers. If you want safe automatic file length
> extension on writes, open a handle with the append flag and use that.
>
> AFIO fatal exits the process if you try to read or write past the
> current file size, it's a logic error and your program is inherently
> broken.

This is unacceptable behavior IMO. I would expect an exception in such case.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk