Subject: Re: [boost] [Boost-users] What's so cool about Boost.MPI?
From: Sid Sacek (ssacek_at_[hidden])
Date: 2010-11-11 23:26:05


> I also disagree with the statement that communication is faster than
> computation. Even if you have 10 Gb/second networks into a compute node,
> that corresponds only to about 150 M double precision floating point numbers.
> Let's connect that to a node with a *single* quad core Nehalem CPU that
> operates at actually measured sustained speeds of 74 Gflop, and you see that
> the network is 500 times slower. Using 4 such CPUs on a quad-core node brings
> the performance ratio to 2000! Even 10 times faster networks will only take
> this down to a factor of 200.
> Thus, in contrast to your statements, networks are *not* an order or two of magnitude
> faster than computers but two or three orders of magnitude slower than the compute
> nodes. This will only get worse by an additional two orders of magnitude once we
> add GPUs or other future accelerator chips to the nodes.
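
The arithmetic in the quoted paragraph can be checked with a short sketch. The link speed and the 74 Gflop figure come from the quoted text; the function names are made up for illustration, and the model assumes every received bit is payload and that one floating point operation is spent per double received, which is the simplification the quoted comparison implicitly makes.

```python
# Figures taken from the quoted paragraph.
NETWORK_BITS_PER_SEC = 10e9   # 10 Gb/s link into the node
BITS_PER_DOUBLE = 64          # one IEEE 754 double precision number
CPU_FLOPS = 74e9              # quoted sustained rate, one quad-core Nehalem


def doubles_per_second(bits_per_sec):
    """Doubles the link can deliver per second, ignoring protocol overhead."""
    return bits_per_sec / BITS_PER_DOUBLE


def compute_to_network_ratio(flops, bits_per_sec):
    """Floating point operations available per double delivered by the link."""
    return flops / doubles_per_second(bits_per_sec)


one_cpu = compute_to_network_ratio(CPU_FLOPS, NETWORK_BITS_PER_SEC)
four_cpus = compute_to_network_ratio(4 * CPU_FLOPS, NETWORK_BITS_PER_SEC)
faster_net = compute_to_network_ratio(4 * CPU_FLOPS, 10 * NETWORK_BITS_PER_SEC)

print(f"link delivers {doubles_per_second(NETWORK_BITS_PER_SEC) / 1e6:.2f} M doubles/s")
print(f"1 CPU:  ratio {one_cpu:.1f}")
print(f"4 CPUs: ratio {four_cpus:.1f}")
print(f"4 CPUs, 10x faster network: ratio {faster_net:.1f}")
```

Under these assumptions the ratios come out slightly below the rounded 500, 2000 and 200 in the quoted text, which started from "about 150 M" doubles per second rather than the unrounded figure.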

Wow! You picked (cherry-picked) a very particular data type, and then performed a simple division between the FPU speed and the incoming data rate.

There are so many things that occur in the CPU before you can process network data: NIC interrupts to the drivers, driver interrupt processing, drivers signaling the running processes, task swaps, page faults, paging, cache flushes, cache updates, data being copied between buffers two to five times before it is processed, endian conversions, programs switching on key data bytes to call the proper procedures to process the data, the processed data then being used to trigger new actions, etc.

A much better way to calculate performance is to determine how many assembly instructions you anticipate it will take to process a single byte of data. Data comes in infinite forms. Before the FPU gets a crack at the data, it has to pass through the CPU.
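
The instructions-per-byte estimate described above can be sketched roughly as follows. Only the 10 Gb/s link speed comes from the quoted text; the instruction rate and the per-byte costs are hypothetical placeholders, not measurements, and the function names are invented for illustration.

```python
def cpu_limited_bytes_per_sec(instr_per_sec, instr_per_byte):
    """Bytes per second the CPU can process at a given per-byte instruction cost."""
    return instr_per_sec / instr_per_byte


def bottleneck(network_bytes_per_sec, instr_per_sec, instr_per_byte):
    """Return 'cpu' if processing is slower than the link, otherwise 'network'."""
    cpu_rate = cpu_limited_bytes_per_sec(instr_per_sec, instr_per_byte)
    return "cpu" if cpu_rate < network_bytes_per_sec else "network"


NETWORK_BYTES_PER_SEC = 10e9 / 8   # 10 Gb/s link from the quoted text
INSTR_PER_SEC = 10e9               # hypothetical sustained instruction rate

# Hypothetical per-byte costs: a cheap streaming kernel versus a path that
# includes buffer copies, endian conversion and dispatch on the data.
for instr_per_byte in (4, 50):
    rate = cpu_limited_bytes_per_sec(INSTR_PER_SEC, instr_per_byte)
    limit = bottleneck(NETWORK_BYTES_PER_SEC, INSTR_PER_SEC, instr_per_byte)
    print(f"{instr_per_byte:>3} instr/byte: CPU handles {rate / 1e9:.2f} GB/s, "
          f"bottleneck is the {limit}")
```

With these placeholder numbers the link is the limit at 4 instructions per byte and the CPU is the limit at 50, so the conclusion depends entirely on the per-byte cost you plug in.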

Think about it... the data coming in from the network isn't being fed straight into your FPU hardware and the results being tossed away.

My experience in "network data" processing is very different from yours.

-Sid

