Subject: Re: [boost] [SORT] Parallel Algorithms
From: Steven Ross (spreadsort_at_[hidden])
Date: 2015-04-06 06:53:57
On Mon, Apr 6, 2015 at 1:55 AM, Aditya Atluri <abcd_at_[hidden]> wrote:
> It's not just CPUs. Most of the physics engines used in games for phones are CPU based. The current multithreaded CPU architecture brings gaming to a whole new level. I see some physics done with sort, map, gather, scatter, and all.
> PS: I proposed this as a part of Boost.Compute for this GSoC, giving GPU support for iOS devices using the Metal Shading Language.
If you can find somebody whose frame rate on mobile is significantly
impacted by CPU sorting speed, I'd love to chat with them.
>> On Apr 6, 2015, at 1:37 AM, Ben Pope <benpope81_at_[hidden]> wrote:
>>> On Monday, April 06, 2015 11:13 AM, Steven Ross wrote:
>>> Who wants to do a parallel sort on Android? The OS often only gives you
>>> one core (depending on priorities), and it would burn the battery at
>>> ridiculous speed.
>> I was under the impression that it's better on battery life to use all the cores at maximum and then sleep as quickly as possible. Clock gating of the components is getting better, but for example (and I'm completely making this up now), if two CPUs share a cache, then the cache is alive anyway so you may as well use the other CPU. I've heard the term "race to sleep" to describe this.
>> I think it's hard to guess at; mobile CPUs now have 8 cores, and they're not even homogeneous: some are in-order execution and some out-of-order.
The term race-to-sleep applies to faster but higher-power processors
sometimes being more power efficient because the system can go to
sleep sooner. That said, as modern mobile systems normally operate
with all but one core asleep, they are optimized for the single-core
use case, and very little power is wasted on the unused CPUs. Add in
that parallel sorting has only about 75% of the CPU efficiency of a
single-threaded sort, and it just doesn't make sense from a power
perspective, especially if some other task can be computed in parallel
with the sort. There is also a single-threaded option (spreadsort)
that can sort 1.5-2X faster on just one core.
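
As a rough back-of-the-envelope check on the numbers above, here is a
minimal sketch. It assumes that "75% CPU efficiency" means parallel
efficiency (speedup divided by cores used), that the 1.5-2X spreadsort
figure is measured against the same single-threaded baseline, and that
energy is roughly proportional to busy core-seconds. None of that is
measured here, and the function names are purely illustrative.

```cpp
#include <cassert>

// Core-seconds burned by a parallel sort, relative to a single-threaded
// baseline sort of the same data. ASSUMPTION: parallel_efficiency is
// speedup / cores, so total work scales as 1 / efficiency.
double parallel_relative_work(double parallel_efficiency) {
    assert(parallel_efficiency > 0.0 && parallel_efficiency <= 1.0);
    return 1.0 / parallel_efficiency;
}

// Core-seconds burned by a faster single-threaded sort, relative to the
// same baseline. ASSUMPTION: a sort that is `speedup` times faster on
// one core does 1 / speedup of the baseline work.
double faster_serial_relative_work(double speedup) {
    assert(speedup > 0.0);
    return 1.0 / speedup;
}

// How many times more core-seconds the parallel sort consumes than the
// faster single-threaded sort, under the assumptions above.
double work_ratio(double parallel_efficiency, double serial_speedup) {
    return parallel_relative_work(parallel_efficiency) /
           faster_serial_relative_work(serial_speedup);
}
```

Under these assumptions, work_ratio(0.75, 1.5) is about 2 and
work_ratio(0.75, 2.0) is about 2.7, i.e. the parallel sort would cost
somewhere around 2-2.7X the core-seconds of the single-threaded
alternative. Real power draw depends on clock scaling, shared caches,
and how quickly the system can sleep, so treat this as an estimate only.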