Subject: Re: [boost] [filesystem] proposal: treat reparse files as regular files
From: Paul Harris (harris.pc_at_[hidden])
Date: 2015-07-30 00:05:38


On 29 July 2015 at 18:59, Niall Douglas <s_sourceforge_at_[hidden]> wrote:

> On 29 Jul 2015 at 18:09, Gavin Lambert wrote:
>
> > On 29/07/2015 14:06, Niall Douglas wrote:
>
> > > I appreciate you're saying the cost is worth it, but we're thinking
> > > all Boost users here, not just the small minority on Windows Server
> > > 2012 with dedup turned on.
> >
> > I'm not on Server 2012, but this thread caught my attention because I
> > remember encountering a bug that prevented all WinXP clients from
> > accessing deduped files on CIFS shares provided by Server 2012. I think
> > in the end this was a server-side bug related to McAfee and the
> > different protocols used by WinXP vs. Win7, and so clients shouldn't
> > normally be able to see whether files are deduped or not remotely, but I
> > haven't explicitly verified that. If CIFS shares do expose files as
> > dedup reparse points instead of concealing that then it might affect
> > quite a lot of users.
>
> I had understood from the OP that CIFS is exporting the reparse point
> tag to clients, hence the breakage.
>
> The reason, I suspect, that CIFS is being so braindead here is that
> opening a deduped file is more expensive than usual and clients ought
> to know. Which is exactly why I am opposed to treating these things
> as a regular file.
>
>
On the topic of "this file will be slow to read", IMHO this is an
orthogonal issue.
It might be nice to be able to query some sort of "this will be hell slow
to read" status so I could perhaps do something about it.
But the files (slow or not) should still be treated as normal files. This
problem is bigger than just reparse-point files.

Reparse-point files (not symlinks/junctions) are just one type of
maybe-this-will-be-slow files.

Reading off a local underutilised disk is a lot faster than reading off a
local disk suffering high IO.

On Monday, "K:" might be a lot slower than "M:" because the K: drive is
on a distant server over a slow network, and M: is on a fast server on the
local subnet.
On Tuesday, it is perhaps the opposite because I've flown into the site
hosting K:.

Perhaps a network file read is slow one minute (on a 3G network) and fast
just one minute later (after switching to Wi-Fi).

But the current system doesn't tell me anything about that. Nor does
Boost treat the K: files as "special" files just because they *might* be slow.
So I don't see why we should start treating (e.g. dedup) reparse files any
differently.

Speed of a read is an orthogonal issue, and often not something that I can
do anything about.
If it's going to take 5 minutes to read that Word document off the disk,
then that's what it takes. I can't read that file any other way.
If it's a problem for my software, I'll need to read it in a nonblocking
way, with the ability to cancel and show progress etc.
But the simple case is to block on the read, and my users are cool with
that because most software works that way.
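
To make the "cancel and show progress" case concrete, here is a rough
sketch of what I mean. It is only an illustration: read_with_cancel and
its parameters are hypothetical names of my own, not anything in
Boost.Filesystem, and it uses plain standard C++ streams. The read is
done in chunks, so the caller gets a chance to cancel and a progress
callback between chunks. The file type does not matter at all.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <fstream>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper, not part of Boost.Filesystem.
// Reads `path` in chunks, checking `cancel` before each chunk and reporting
// the running byte count through `on_progress` (which may be empty).
// Returns true if the whole file was read, false if cancelled or on error.
bool read_with_cancel(const std::string& path,
                      std::vector<char>& out,
                      const std::atomic<bool>& cancel,
                      const std::function<void(std::size_t)>& on_progress,
                      std::size_t chunk_size = 64 * 1024)
{
    assert(chunk_size > 0);
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;                  // could not open the file

    out.clear();
    std::vector<char> buf(chunk_size);
    while (true) {
        if (cancel.load()) return false;    // caller asked us to stop
        in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        const std::streamsize got = in.gcount();
        if (got > 0) {
            out.insert(out.end(), buf.begin(), buf.begin() + got);
            if (on_progress) on_progress(out.size());
        }
        if (in.eof()) return true;          // reached end of file
        if (!in) return false;              // read error
    }
}
```

Run on a worker thread (std::thread or std::async, say), this keeps the
UI responsive, and setting the atomic flag from the UI thread stops the
read at the next chunk boundary. A single chunk can still block for as
long as the underlying read takes, so this does not make a slow file any
faster; it only makes the wait manageable.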

cheers,
Paul

