Subject: Re: [boost] Boost.MapReduce: what next?
From: Craig Henderson (cdm.henderson_at_[hidden])
Date: 2009-08-31 14:53:36
Cory Nelson said:
> People have been splitting tasks
> into multiple operations and combining the results on single machines
> for ages -- MapReduce doesn't really offer any innovation there.
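Well, it provides a very easy framework for implementing parallel algorithms. Multithreading is hard and often done very badly; MR simplifies the task tremendously, because the user writes only a map function and a reduce function and the framework takes care of the threading.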
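To illustrate the point, here is a minimal single-machine sketch of the map/reduce pattern applied to word counting. It is not the Boost.MapReduce interface: the names `Counts`, `map_chunk`, `reduce_into` and `word_count` are hypothetical, and it assumes a C++11 compiler with `std::async` standing in for whatever scheduling a real library would do.

```cpp
#include <cstddef>
#include <future>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical names throughout; this is NOT the Boost.MapReduce interface.
using Counts = std::map<std::string, std::size_t>;

// Map phase: count the words in one chunk, independently of all other chunks.
Counts map_chunk(const std::string& chunk) {
    Counts counts;
    std::istringstream in(chunk);
    std::string word;
    while (in >> word) ++counts[word];
    return counts;
}

// Reduce phase: merge one partial result into the running total.
void reduce_into(Counts& total, const Counts& partial) {
    for (const auto& kv : partial) total[kv.first] += kv.second;
}

// Driver: launch one asynchronous map task per chunk, then merge sequentially.
// The caller never creates a thread or takes a lock.
Counts word_count(const std::vector<std::string>& chunks) {
    std::vector<std::future<Counts>> tasks;
    for (const auto& chunk : chunks)
        tasks.push_back(std::async(std::launch::async,
                                   [&chunk] { return map_chunk(chunk); }));
    Counts total;
    for (auto& task : tasks) reduce_into(total, task.get());
    return total;
}
```

Calling `word_count({"a b a", "b c", "a"})` runs three map tasks and merges their results into a single table. The map tasks share no mutable state, which is why no locking is needed in user code.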
> The
> innovation, and the buzz about it, is that it offers a reliable,
> general-purpose, and large-scale distributed implementation of this
> very basic idea. If you can accomplish that in this library, I think
> there will be _a lot_ more interest.
>
> I think a lot of the MapReduce buzz also has to do with the services
> tied to it that further ease common scalability bottlenecks, the big
> ones being Google File System and BigTable. It's really just part of
> the bigger ecosystem.
Agreed - the difficulty is in defining where a library ends and the infrastructure begins. This library cannot (and should not, IMO) explode into a distributed file system (an extension to Boost.Filesystem) and a communications library (based on Boost.MPI or Boost.Asio). It is the MapReduce algorithm, meant to sit on top of other infrastructure to provide an overall solution.
-- Craig
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk