
Boost Users :

Subject: Re: [Boost-users] [filesystem] problem: is_regular_file and deduplified files (reparse+sparse)
From: Paul Harris (harris.pc_at_[hidden])
Date: 2015-07-23 03:11:06


FYI, I followed the blog article,
then once the machine was "running" I clicked Connect at the bottom.
That gave me an .rdp file which in theory I could use with rdesktop, but it
uses a DNS name that was only just created, so that didn't work.

When you click the name of the server in the list, it shows the public IP
and the port on the right.
Then you can do this:
$ rdesktop that.ip.addr:port

But only if you have the latest rdesktop AND you have set up kerberos
something-something.

Instead I found a Windows computer and used Remote Desktop from there.

---
Once inside,
in the "Server Manager --> Dashboard" window on the screen, click "Add
Roles"
then go next next until "Server Roles"
expand "File and Storage services" , "File and iSCSI" , and tick "Data
Deduplication"
Then next next etc and Install.
Wait a bit... and it's done.
http://www.techrepublic.com/blog/data-center/configuring-windows-server-8-deduplication/
---
Continuing on that webpage...
Time to enable dedup.  There is a temp disk D: so let's enable it there.
Method 1... I did this and then went to method 2... Start PowerShell, type:
"Enable-DedupVolume D:"
Method 2... in that same Dashboard, hit the 4th button (File and Storage
Services)
Then Volumes --> Disks
click Volume 1 at the top, and then right click D: at the bottom -->
Configure Dedup.
To try and accelerate this puppy, I set the "age to dedup" to 0 days.
http://www.techrepublic.com/blog/data-center/windows-server-2012-deduplication-how-and-where-to-tweak/
---
Time to make something to dedup.  We'll just duplicate the warning.txt file
that exists on D:
In powershell:
PS> D:
PS> $file = Get-Content DATALOSS_WARNING_README.txt
Then, do these 2 commands a bunch of times until "big.txt" gets to, say, 6MB
PS> Add-Content big.txt $file
PS> $file = Get-Content big.txt
Then use Windows Explorer (or another tool) to make a dozen copies of big.txt
To give it something else to dedup, copy c:\windows\explorer.exe to D:
Then go to D: and copy-paste explorer.exe a dozen times.
In PowerShell, type:
PS> Update-DedupStatus -Volume D:
PS> Start-DedupJob -Type Optimization -Volume D:
and then wait for it to finish.
you can track its progress with:
PS> Get-DedupJob
PS> Get-DedupStatus -Volume D:
---
So, once it's deduped, you check.
PS> FSUTIL REPARSEPOINT QUERY big.txt
you should see that it's a reparse point with that 0x80000013 tag value.
Copy-paste big.txt to big2.txt and check it with the same query, and it should
tell you big2.txt is NOT a reparse point.
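
For reference, the tag value that FSUTIL prints can be picked apart with a simple bit test. This is only a sketch: the three constants are the documented winnt.h values, the test mirrors the SDK's IsReparseTagNameSurrogate macro, and the names is_name_surrogate and check_tags are made up for illustration. It is not taken from Boost.Filesystem or AFIO.

```cpp
#include <cassert>
#include <cstdint>

// Reparse tag values as documented for winnt.h.
constexpr std::uint32_t kTagMountPoint = 0xA0000003u;  // IO_REPARSE_TAG_MOUNT_POINT
constexpr std::uint32_t kTagSymlink    = 0xA000000Cu;  // IO_REPARSE_TAG_SYMLINK
constexpr std::uint32_t kTagDedup      = 0x80000013u;  // IO_REPARSE_TAG_DEDUP

// Same bit test as the SDK macro IsReparseTagNameSurrogate: bit 29 is set
// on tags whose reparse point stands in for another named entity.
constexpr bool is_name_surrogate(std::uint32_t tag) {
    return (tag & 0x20000000u) != 0;
}

// Hypothetical helper: checks the three tags discussed in this thread.
inline void check_tags() {
    assert(is_name_surrogate(kTagSymlink));     // symlinks carry the bit
    assert(is_name_surrogate(kTagMountPoint));  // so do junctions
    assert(!is_name_surrogate(kTagDedup));      // the dedup tag does not
}
```

So the dedup tag differs from symlinks and junctions in the name-surrogate bit. Whether a library should use that bit to decide what counts as a symlink is a separate policy question.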
NOW TO TEST !
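
A minimal probe for the test could look like the sketch below. It is an assumption-laden stand-in: it uses C++17 std::filesystem rather than boost::filesystem (the two can classify reparse points differently), and type_name and describe are hypothetical helper names. It only reports what the library returns for a path, followed and unfollowed.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;  // assumption: standing in for boost::filesystem

// Printable name for a file_type value.
inline const char* type_name(fs::file_type t) {
    switch (t) {
        case fs::file_type::none:      return "none";
        case fs::file_type::not_found: return "not_found";
        case fs::file_type::regular:   return "regular";
        case fs::file_type::directory: return "directory";
        case fs::file_type::symlink:   return "symlink";
        case fs::file_type::block:     return "block";
        case fs::file_type::character: return "character";
        case fs::file_type::fifo:      return "fifo";
        case fs::file_type::socket:    return "socket";
        case fs::file_type::unknown:   return "unknown";
        default:                       return "implementation-defined";
    }
}

// One-line report of the followed type (status), the unfollowed type
// (symlink_status) and is_regular_file. Uses the error_code overloads,
// so a failing query is reported as a type rather than thrown.
inline std::string describe(const fs::path& p) {
    std::error_code ec_followed, ec_link;
    const fs::file_status followed = fs::status(p, ec_followed);
    const fs::file_status link = fs::symlink_status(p, ec_link);
    return p.string() + ": status=" + type_name(followed.type()) +
           " symlink_status=" + type_name(link.type()) +
           " is_regular_file=" +
           (fs::is_regular_file(followed) ? "true" : "false");
}
```

Printing describe("D:\\big.txt") next to describe("D:\\big2.txt") from a small main() would show whether the deduped file and its fresh copy are classified differently.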
----
On 23 July 2015 at 13:57, Paul Harris <harris.pc_at_[hidden]> wrote:
> Hi Niall, you can use the Azure to test this sort of thing... I think.
> I'm trying it out now.
>
> http://blogs.technet.com/b/tommypatterson/p/azureservertrial.aspx
>
>
> On 23 July 2015 at 09:54, Niall Douglas <s_sourceforge_at_[hidden]> wrote:
>
>> On 23 Jul 2015 at 8:56, Paul Harris wrote:
>>
>> > With this server comes the new "dedup" feature, that can automatically
>> > deduplify files.  This happens on a schedule, eg 2am saturday.  So suddenly
>> > we are getting messages of failures of software from all over the place,
>> > due to fs::is_regular_file()
>> >
>> > Deduped files have the REPARSE and SPARSE flag set.
>> > On the command line, you can run
>> > FSUTIL REPARSEPOINT QUERY
>> >
>> > and the "Reparse Tag Value" is 0x80000013
>> >
>> > Which is a relatively new flag known as IO_REPARSE_TAG_DEDUP
>> >
>> https://msdn.microsoft.com/en-us/library/windows/desktop/aa365740%28v=vs.85%29.aspx
>> >
>> > These files act as normal files, you can fopen and fread them, so I assume
>> > they should be treated almost like symlink by boost... perhaps not quite a
>> > symlink because I assume the "lstat" link properties are identical to the
>> > file's stat properties.
>> >
>> >
>> > Typically, I iterate over directories and only process files if
>> > fs::is_regular_file(filename) is true.
>> >
>> > I wrote some code to check what the properties were on these files, and its
>> > not any of the possible enums detected by file_status::type().
>> >
>> > ideas?
>>
>> Proposed Boost.AFIO doesn't support IO_REPARSE_TAG_DEDUP because I
>> have no access to any system to test the support upon.
>>
>> However, if AFIO were to support IO_REPARSE_TAG_DEDUP, it would treat
>> it identically to a symlink/junction point.
>>
>> I'd suggest Boost.Filesystem do the same, and treat pseudo-symlinks
>> as symlinks. That probably means adding full symlink support for
>> Filesystem on Windows. Here are some links to example implementation
>> code:
>>
>> Reading a symlink target:
>> https://github.com/BoostGSoC13/boost.afio/blob/master/include/boost/afio/v2/detail/impl/afio_iocp.ipp#L511
>>
>> Writing a symlink:
>> https://github.com/BoostGSoC13/boost.afio/blob/master/include/boost/afio/v2/detail/impl/afio_iocp.ipp#L848
>>
>> Obviously best not allow rewriting a pseudo-symlink like
>> IO_REPARSE_TAG_DEDUP, make it read only.
>>
>> Niall
>>
>> --
>> ned Productions Limited Consulting
>> http://www.nedproductions.biz/
>> http://ie.linkedin.com/in/nialldouglas/
>>
>>
>>
>> _______________________________________________
>> Boost-users mailing list
>> Boost-users_at_[hidden]
>> http://lists.boost.org/mailman/listinfo.cgi/boost-users
>>
>
>


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net