Subject: Re: [boost] Any interest in hashing algorithms SHA and/or FNV1a?
From: Niall Douglas (s_sourceforge_at_[hidden])
Date: 2013-11-13 13:18:01


On 13 Nov 2013 at 12:48, Jeff Flinn wrote:

> > Be aware that AFIO (currently in the peer review queue) is slowly
> > gaining an asynchronous batch hash engine. It's a bit different from
> > your normal hash engine, because it can do things like use SIMD to
> > process four SHA256 streams in parallel, and then use AFIO's
> > closure engine to parallelise that 4-SHA256 processing across
> > multiple cores, i.e. achieve sixteen parallel SHA256 processing
> > streams. This lets one drop the normal 14.9 cycles/byte down to
> > around 1.4 cycles/byte amortised for SHA256 [1], a big win. The batch
> > hash engine uses a compile-time plugin system, so it can be
> > arbitrarily extended with other hash implementations if suitably
> > rewritten to fit.
>
> Sounds interesting. Is the hash engine customizable to composite
> algorithms? Particularly MD5/SHA1?

The API currently works by instantiating the hash_engine template
with the hash you want, e.g. hash_engine<SHA256>. The resulting class
instance has a member function for creating new hashes, which returns
a vector of fresh handles, and a batch function for enqueuing extra
memory ranges to a hash handle, which returns a vector of futures
representing the completed processing of those memory ranges. If you
pass in a null pointer block, it enqueues termination of the hash.
Each hash handle provides a future which becomes ready when the hash
is terminated.
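
To make that shape concrete, here is a minimal sketch of such an
interface. It is not AFIO's actual code: the member names create() and
batch(), the block type, and the FNV1a plugin are all hypothetical
stand-ins, and this toy version hashes synchronously, so every future
is already ready when batch() returns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <future>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical hash plugin: 64-bit FNV-1a standing in for SHA256 etc.
struct FNV1a {
    using result_type = std::uint64_t;
    result_type state = 14695981039346656037ULL;  // FNV offset basis
    void update(const void* p, std::size_t n) {
        const unsigned char* b = static_cast<const unsigned char*>(p);
        for (std::size_t i = 0; i < n; ++i) {
            state ^= b[i];
            state *= 1099511628211ULL;  // FNV prime
        }
    }
    result_type finish() const { return state; }
};

// Hypothetical engine mirroring the description above, NOT AFIO's API.
template <class Hash>
class hash_engine {
public:
    struct handle_state {
        Hash hash;
        std::promise<typename Hash::result_type> done;
        // Becomes ready once the hash has been terminated.
        std::shared_future<typename Hash::result_type> result{
            done.get_future().share()};
    };
    using handle = std::shared_ptr<handle_state>;
    using block = std::pair<const void*, std::size_t>;

    // Create n fresh hash handles.
    std::vector<handle> create(std::size_t n) {
        std::vector<handle> out;
        for (std::size_t i = 0; i < n; ++i)
            out.push_back(std::make_shared<handle_state>());
        return out;
    }

    // Feed blocks to one handle, returning one future per block.
    // A block with a null pointer terminates the hash; terminate a
    // handle only once, and feed it nothing afterwards.
    std::vector<std::future<void>> batch(const handle& h,
                                         const std::vector<block>& blocks) {
        assert(h && "handle must not be null");
        std::vector<std::future<void>> out;
        for (const block& b : blocks) {
            if (b.first == nullptr)
                h->done.set_value(h->hash.finish());
            else
                h->hash.update(b.first, b.second);
            std::promise<void> p;  // a real engine would fulfil this later
            p.set_value();
            out.push_back(p.get_future());
        }
        return out;
    }
};
```

A caller would create a handle, batch a few ranges followed by a null
block, and then wait on the handle's result future for the digest.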

There is absolutely nothing stopping anyone from creating a
hash_engine<MD5> and a hash_engine<SHA1> and feeding incoming memory
ranges to both engines. Enqueuing ranges is instantaneous, as hashing
occurs asynchronously. I would assume the MD5 engine would likely
complete before the SHA1 engine, but that's easily coped with thanks
to the futures.
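
As a rough illustration of the two-engine idea, the sketch below feeds
the same memory ranges to two independent hashes and waits on both
futures. It uses std::async rather than AFIO, and the two toy hashes
(FNV1a64 and DJB2) are non-cryptographic stand-ins for MD5 and SHA1;
hash_async, hash_both, mem_range and composite_digest are hypothetical
names invented for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <future>
#include <utility>
#include <vector>

using mem_range = std::pair<const void*, std::size_t>;

// Toy stand-in for MD5 (NOT cryptographic).
struct FNV1a64 {
    std::uint64_t s = 14695981039346656037ULL;
    void update(const unsigned char* p, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) { s ^= p[i]; s *= 1099511628211ULL; }
    }
};

// Toy stand-in for SHA1 (NOT cryptographic).
struct DJB2 {
    std::uint64_t s = 5381;
    void update(const unsigned char* p, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) s = s * 33 + p[i];
    }
};

// Hash every range in order on another thread; the future yields the
// digest. The memory behind the ranges must outlive the future.
template <class Hash>
std::future<std::uint64_t> hash_async(std::vector<mem_range> ranges) {
    return std::async(std::launch::async, [ranges]() {
        Hash h;
        for (const mem_range& r : ranges)
            h.update(static_cast<const unsigned char*>(r.first), r.second);
        return h.s;
    });
}

struct composite_digest {
    std::uint64_t first;
    std::uint64_t second;
};

// Feed the same ranges to both hashes. Whichever finishes first simply
// has its future become ready first; get() copes with either order.
template <class A, class B>
composite_digest hash_both(const std::vector<mem_range>& ranges) {
    std::future<std::uint64_t> fa = hash_async<A>(ranges);
    std::future<std::uint64_t> fb = hash_async<B>(ranges);
    return composite_digest{fa.get(), fb.get()};
}
```

The caller never needs to know which hash completed first, which is
the property the futures give you in the real engine as well.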

> I'm blocked from using AFIO (on Windows) for my file processing as I
> need to go through the BackupRead/BackupWrite APIs, which are
> explicitly incompatible with overlapped I/O. I haven't looked at
> whether AFIO could offer benefits under that restriction.

AFIO *always* opens all handles with backup semantics, because AFIO
understands symlinks on Windows and treats them (nearly) identically
to POSIX. If you go direct to the NT kernel API, the kernel provides
a lovely function, NtQueryInformationFile, which when used with the
FileAllInformation class (filling a FILE_ALL_INFORMATION structure)
returns all possible metadata about a file, and another function,
NtSetInformationFile, which can set all possible metadata about a
file. All you need to do beyond that is to query and store extended
attributes and the security object, both of which are again trivial
to do when using the NT kernel API directly.

My point is that it is relatively easy to write your own BackupRead
and BackupWrite functions, if you skip the Win32 layer and go
straight to the kernel. BTW I would be more than happy to accept any
patches adding such a feature to AFIO because BackupRead and
BackupWrite are seriously broken according to Microsoft themselves :)

Niall

-- 
Currently unemployed and looking for work.
Work Portfolio: http://careers.stackoverflow.com/nialldouglas/