Subject: Re: [boost] [AFIO] Formal review
From: Niall Douglas (s_sourceforge_at_[hidden])
Date: 2015-09-01 22:36:10


On 1 Sep 2015 at 17:40, Jeremy Maitin-Shepard wrote:

> > > > Before people go "aha!", remember that afio::handle has proper error
> > > > handling and has to configure a number of other items, including
> > > > asking the OS for what its "real" path is and other housekeeping.
> > >
> > > Why does the library do that when not asked? Why not collect this data
> > > only when the user requests?
> >
> > The problem is the race free filesystem support which isn't as
> > simple, unfortunately, as simply using the POSIX *at() functions.
> >
> > You need a minimum of the true canonical path of the fd, and its
> > stat_t struct (specifically the inode, created and modified
> > timestamps). That's at least two additional syscalls just for those,
> > and AFIO may do as many as seven syscalls per handle open depending.
> >
>
> Can you describe in more detail exactly why this information is needed/how
> it is used?

The full answer is complex and lengthy, and is a large part of the
AFIO value add. The housekeeping done during handle open is used
throughout AFIO to skip work in later operations, on the assumption
that handle opens are not common.

Limiting the discussion to just the problem of race free filesystem
on POSIX, imagine the problem of opening the sibling of a file in a
directory whose path is constantly changing. POSIX does not provide
an API for opening the parent directory of an open file descriptor.
To work around this, you must first get the canonical current path of
the open file descriptor, strip off the leafname, open that directory
and then use it as a base directory for reopening a file with the
same leafname as your original file. You loop all of that if you fail
to open the leafname, or if the leafname you opened has a different
inode. Once you have the correct parent directory, you can open the
sibling. This is an example of where caching the stat_t of a handle
during open saves syscalls and branches in more performance-critical
APIs later on.
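
To make that loop concrete, here is a minimal Linux-only sketch of
the idea (it uses /proc/self/fd to fetch the canonical path; the
function name and retry bound are illustrative, this is not AFIO's
actual API):

#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <climits>
#include <cstdio>
#include <string>

// Open "sibling" living in the same directory as the open fd,
// retrying if the directory is renamed underneath us. The inode of
// fd is cached at handle open time in the scheme described above.
int open_sibling(int fd, const char *sibling)
{
  struct stat fd_stat;
  if (fstat(fd, &fd_stat) < 0) return -1;
  for (int attempt = 0; attempt < 10; ++attempt) {
    // 1. Canonical current path of fd (Linux-specific).
    char proc[64], path[PATH_MAX + 1];
    std::snprintf(proc, sizeof proc, "/proc/self/fd/%d", fd);
    ssize_t len = readlink(proc, path, PATH_MAX);
    if (len < 0) return -1;
    path[len] = '\0';
    // 2. Strip the leafname and open the parent directory.
    //    The path is absolute, so a slash always exists.
    std::string p(path);
    size_t slash = p.rfind('/');
    std::string dir = (slash == 0) ? "/" : p.substr(0, slash);
    std::string leaf = p.substr(slash + 1);
    int parent_fd = open(dir.c_str(), O_RDONLY | O_DIRECTORY);
    if (parent_fd < 0) continue;  // directory moved already; retry
    // 3. Reopen the leafname and check it is still our inode.
    int check = openat(parent_fd, leaf.c_str(), O_RDONLY);
    struct stat cs;
    bool same = check >= 0 && fstat(check, &cs) == 0 &&
                cs.st_ino == fd_stat.st_ino &&
                cs.st_dev == fd_stat.st_dev;
    if (check >= 0) close(check);
    if (!same) { close(parent_fd); continue; }  // raced a rename
    // 4. parent_fd is verified: open the sibling relative to it.
    int result = openat(parent_fd, sibling, O_RDONLY);
    close(parent_fd);
    return result;
  }
  return -1;  // gave up after too many races
}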

A similar problem exists for race free file deletions and a long
list of other scenarios. The cause is a number of defects in the
POSIX race free APIs; the Austin Working Group are aware of the
problem. Windows doesn't have problems with race free siblings and
deletions thanks to a much better thought-through race free API, but
it does have other problems with deletions not being actual
deletions, and different workarounds are needed there.
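
A hypothetical sketch of the deletion case, reusing the parent
verification idea from above (again, not AFIO's API). Note that the
window left between the check and the unlink is exactly the kind of
POSIX API defect meant here:

#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Delete "leafname" relative to a parent directory fd verified as in
// the previous sketch. "cached" is the stat_t taken at handle open.
int unlink_verified(int parent_fd, const char *leafname,
                    const struct stat &cached)
{
  struct stat s;
  if (fstatat(parent_fd, leafname, &s, AT_SYMLINK_NOFOLLOW) < 0)
    return -1;
  if (s.st_ino != cached.st_ino || s.st_dev != cached.st_dev)
    return -1;  // raced with a rename; caller re-verifies and retries
  // POSIX provides no way to unlink *by fd*, so a race window remains
  // between the check above and the unlink below.
  return unlinkat(parent_fd, leafname, 0);
}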

> > You can disable the race free semantics, which are slow, using
> > afio::file_flags::no_race_protection. However AFIO fundamentally
> > assumes that any file system path consuming function is slow and will
> > not be frequently called by applications expecting high performance,
> > so work is deliberately loaded into path consuming functions to keep
> > it away from all other functions.
> >
> > If you have a need to do a lot of file opens and closes and don't
> > care about races on the file system, you are better off not using
> > AFIO. STL iostreams is perfectly good for this as the only OSs which
> > allow async file opens and closes are QNX and the Hurd.
> >
>
> There is certainly a large gap between what is possible with STL iostreams
> or Boost filesystem and what is possible using platform-specific APIs for
> file system manipulations. While it is obviously your choice what scope to
> pick, I think it may be possible for it to be usable for everything you
> intend to do with it, e.g. for writing a database backend, but also to be
> much more widely useful.

It's always easy to say "it should do everything optimally". If I
had more resources I could do much better than I intend to. As it
stands, without sponsorship you get just 350-400 hours per year,
which is just two months full time equivalent. It's very limiting.
You have to rationalise.

> > > Stability with respect to what? Do you expect native handles to be
> > > somehow different on the same platform?
> >
> > AFIO's design allows many backends e.g. ZIP archives, HTTP etc. So
> > yes, native handles can vary on the same platform and I needed to
> > choose one for the ABI.
>
> Perhaps you could provide a less generic API for the native platform
> filesystem, and then once you also have support for ZIP archives, etc.
> create a more generic interface on top. While archive files and different
> network protocols like WebDAV and FTP can certainly be presented as file
> systems, particularly for read-only access, and indeed it is often possible
> to access them as file systems through the native platform APIs, they tend
> to differ sufficiently from native file systems that a different API may be
> more suitable.

I think this is another storm in a teacup. Dropping generic
filesystem backends just because native_handle() doesn't return an
int? Seems way overkill.

Generic filesystem backends could let you do loopback mounts and a
long list of value add scenarios. I would consider them a vital
design point.

> > > std::vector does not copy memory on C++ 11 when returned by value.
> > >
> > > Correction: when returned with a move constructor or RVO.
> >
> > Which AFIO makes sure of, and I have verified as working.
> >
>
> What I meant is that returning std::vector<directory_entry> means that you
> have to allocate memory for each path. In contrast, with a callback-based
> API, you could reuse the same directory_entry for all of the entries.

I can see what you mean. Ok, I'll have a think about it during the
engine refactor.
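
For what it's worth, the difference being discussed looks something
like this (hypothetical signatures, not AFIO's):

#include <functional>
#include <initializer_list>
#include <string>
#include <vector>

struct directory_entry {
  std::string leafname;
  // ... cached stat fields would live here ...
};

// Style 1: one allocated entry per directory member, the vector
// itself returned by move/RVO (no copy of the buffer in C++ 11).
std::vector<directory_entry> enumerate_all()
{
  std::vector<directory_entry> out;
  out.push_back({"a.txt"});
  out.push_back({"b.txt"});
  return out;
}

// Style 2: a single directory_entry is reused for every callback, so
// the caller pays per-entry allocations only for copies it makes.
void enumerate_each(const std::function<bool(const directory_entry &)> &cb)
{
  directory_entry e;
  for (const char *name : {"a.txt", "b.txt"}) {
    e.leafname = name;
    if (!cb(e)) break;  // the callback may stop enumeration early
  }
}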

> > I do have a very big concern that it is a denial of service attack on
> > anything using AFIO by simply shrinking a file it is using from
> > another process to cause the AFIO using process to fatal exit, so I
> > will be trying to improve on the current situation in the engine
> > rewrite. It's not as easy as it looks if you want to keep reads and
> > writes fast, believe me, but again the ASIO reactor is in my way
> > because that is code I can't control. Once I've written a replacement
> > reactor I am hoping it can track scatter-gather writes a lot better
> > than ASIO can and therefore handle this situation better without
> > adding much overhead for tracking the individual operations.
> >
>
> Given that you go to great lengths to provide reasonable behavior in the
> presence of other types of concurrent modification, it is very surprising
> that the library makes it impossible to handle concurrent file truncation.

It's more that we explicitly give up on trying.

File length queries are inherently useless. NTFS and ZFS have a
particular knack for outright lying to you for some random length of
time. I eventually decided it was safer to just not try.
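
Illustrative of the problem (not AFIO code): even if the filesystem
reported the truth, nothing stops another process changing the length
between the query and the read, so the query buys you nothing:

#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

ssize_t read_prefix(int fd, char *buf, size_t bufsz)
{
  struct stat s;
  if (fstat(fd, &s) < 0) return -1;
  size_t len = (size_t)s.st_size < bufsz ? (size_t)s.st_size : bufsz;
  // <-- the file may be truncated or extended right here
  return pread(fd, buf, len, 0);  // may return fewer bytes than len
}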

> Writes could also fail due to running out of disk space or quota. Without
> this AFIO is essentially only usable in highly controlled circumstances,
> which would seem to make all of the other race guarantees unnecessary. For
> instance it would not be usable for writing a general file synchronization,
> backup, or file server program.

Also correct. Delayed allocation means the file system may not
actually allocate storage until the first write into an extent, and
that allocation might fail. This would then blow up with a fatal app
exit thanks to AFIO's current implementation. As I said, I am very
aware of this; I just needed lightweight futures done before I could
start the ASIO reactor replacement.
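
One well-known POSIX mitigation at the application level (I am not
claiming AFIO does this for you) is to force the allocation up front
so ENOSPC surfaces at a predictable point rather than on some later
write. A minimal sketch:

#include <fcntl.h>

// posix_fallocate() returns an errno value directly (0 on success)
// and guarantees subsequent writes into the range cannot fail for
// lack of space, defeating delayed allocation for this extent.
bool reserve_extent(int fd, off_t offset, off_t length)
{
  return posix_fallocate(fd, offset, length) == 0;
}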

In the meantime, nobody attempting database type systems ever allows
the filing system to run out of free space, so AFIO is still useful
for those sorts of application. Most major operating systems can
still suffer permanent, unfixable damage from running out of free
space on their system drive, even in this modern age.

Niall

-- 
ned Productions Limited Consulting
http://www.nedproductions.biz/ 
http://ie.linkedin.com/in/nialldouglas/


