Subject: Re: [boost] Proposal: MapReduce library (single machine)
From: Craig Henderson (cdm.henderson_at_[hidden])
Date: 2009-06-15 15:45:13


> Interesting for sure. However, how is the execution back-end handled?
> What would I have to provide to create a custom job dispatcher? For
> example, I'd like to use the Vista ThreadPool as well as a custom one
> for this; is this possible? There is a policy for scheduling, so I guess
> the answer is yes, but I'd like to see what the requirements are.

Two functions are provided for a scheduler to run map and reduce tasks:

  void run_next_map_task(detail::job_interface &job, results &result,
                         boost::mutex &m);
  void run_next_reduce_task(detail::job_interface &job, unsigned &partition,
                            results &result, boost::mutex &m);

The scheduler can call these from any thread, so it is responsible only for
creating and managing the threads, providing the mapreduce runtime with
timings, and consolidating the results from each thread.
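
To make those requirements concrete, here is a rough sketch of what a custom
scheduler might look like. It is not the library's code: job_interface and
results are hypothetical stand-ins, std::mutex and std::thread replace their
Boost equivalents so the example is self-contained, and the body of each task
function is a placeholder. It also assumes that a single call to
run_next_map_task keeps taking map tasks until none remain, which the
description above does not state. Timings are left out.

```cpp
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-ins for the library's types (not the real definitions).
struct results {
    unsigned map_tasks_run = 0;
    unsigned reduce_tasks_run = 0;
};

struct job_interface {
    unsigned pending_map_tasks;  // map tasks not yet claimed by any thread
    unsigned partitions;         // number of reduce partitions
};

// Placeholder with the parameter list quoted above (std::mutex, not boost::mutex).
// Assumption: one call drains map tasks until the job has none left.
void run_next_map_task(job_interface &job, results &result, std::mutex &m) {
    for (;;) {
        {
            std::lock_guard<std::mutex> lock(m);  // shared job state is guarded
            if (job.pending_map_tasks == 0) return;
            --job.pending_map_tasks;
        }
        ++result.map_tasks_run;  // per-thread result, so no lock is needed
    }
}

// Placeholder reduce task: records that one partition was processed.
void run_next_reduce_task(job_interface &, unsigned &partition,
                          results &result, std::mutex &) {
    (void)partition;
    ++result.reduce_tasks_run;
}

// The scheduler itself: thread creation, joining, and result consolidation.
results run_with_custom_scheduler(job_interface &job, unsigned num_threads) {
    std::mutex m;
    std::vector<std::thread> workers;

    // Map phase: one results object per thread, one shared mutex.
    std::vector<results> per_thread(num_threads);
    for (unsigned i = 0; i < num_threads; ++i)
        workers.emplace_back(run_next_map_task, std::ref(job),
                             std::ref(per_thread[i]), std::ref(m));
    for (auto &t : workers) t.join();
    workers.clear();

    // Reduce phase: one thread per partition, each with its own partition id.
    std::vector<unsigned> partition_ids(job.partitions);
    std::vector<results> per_partition(job.partitions);
    for (unsigned p = 0; p < job.partitions; ++p) {
        partition_ids[p] = p;
        workers.emplace_back(run_next_reduce_task, std::ref(job),
                             std::ref(partition_ids[p]),
                             std::ref(per_partition[p]), std::ref(m));
    }
    for (auto &t : workers) t.join();

    // Consolidate the per-thread results into a single total.
    results total;
    for (const results &r : per_thread) total.map_tasks_run += r.map_tasks_run;
    for (const results &r : per_partition) total.reduce_tasks_run += r.reduce_tasks_run;
    return total;
}
```

Swapping in a different back-end, such as an OS thread pool, would in this
sketch only change how the two loops that create and join threads are written;
the calls into the task functions stay the same.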

The entire cpu_parallel scheduler is just 40 lines (counting lines that
contain only a brace). It could probably be refactored to reduce the
implementation overhead further, but 40 lines isn't much code.

-- Craig

