
Subject: Re: [boost] Proposal: MapReduce library (single machine)
From: Craig Henderson (cdm.henderson_at_[hidden])
Date: 2009-06-16 02:27:54


> > I'm running some tests and will update the site with performance
> > comparisons shortly
> >
> Great

I've posted metrics from three runs of WordCount on a ~10 GB dataset at
http://www.craighenderson.co.uk/mapreduce/

As you would expect, scalability is not linear, because there is contention
when reading the files from 8 or 16 threads simultaneously. This is where
multi-machine MapReduce clearly comes into its own, assuming the data is
distributed across a decent replicated filesystem.
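
To make the shape of the test concrete, here is a minimal single-machine
WordCount sketch. It is not the proposed library's API: the names
count_words and parallel_word_count are hypothetical, plain std::thread is
assumed, and in-memory strings stand in for the input files, so the disk
contention described above is not modelled.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

using Counts = std::map<std::string, std::size_t>;

// Map step (hypothetical helper): count whitespace-separated words
// in one chunk of text.
Counts count_words(const std::string& chunk) {
    Counts counts;
    std::istringstream in(chunk);
    std::string word;
    while (in >> word) {
        ++counts[word];
    }
    return counts;
}

// Run the map step on num_threads workers, then reduce on the caller's thread.
// Each worker writes only to its own slot in `partial`, so no locking is needed.
Counts parallel_word_count(const std::vector<std::string>& chunks,
                           std::size_t num_threads) {
    if (num_threads == 0) {
        num_threads = 1;
    }
    std::vector<Counts> partial(num_threads);
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([&chunks, &partial, t, num_threads] {
            // Worker t handles chunks t, t + num_threads, t + 2 * num_threads, ...
            for (std::size_t i = t; i < chunks.size(); i += num_threads) {
                for (const auto& kv : count_words(chunks[i])) {
                    partial[t][kv.first] += kv.second;
                }
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }

    // Reduce step: merge the per-thread counts into one result.
    Counts total;
    for (const auto& p : partial) {
        for (const auto& kv : p) {
            total[kv.first] += kv.second;
        }
    }
    return total;
}
```

In this sketch the map work is independent per thread; with real files, all
of those threads would be reading from the same disk, which is where the
contention comes from.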

-- Craig

