
From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2008-07-03 12:36:24


Steven Ross wrote:
> On Thu, Jul 3, 2008 at 5:58 AM, Phil Endecott <spam_from_boost_dev_at_[hidden]> wrote:
>>
>> I've done a simple benchmark with about 2 million latitude values scaled to
>> ints in the range +/-90 x 1e6. The input is approximately alphabetically
>> ordered by country and then by placename within the country. Your algorithm
>> seems to be about 13% faster than std::sort. This is with g++-4.3 -O3.
>
>
> Thanks for testing it. That's a little slower than I'd expect (but
> believable). It's possible that the default tuning values don't work well
> on your system. Are you running on Windows?

No, this is Linux. It's a VIA C3, so it probably has less cache than
many other systems.

> If you can
> afford to leave it running for an hour or so, you might also try running
> tune.pl -skip_verify 2000000 to see if it changes the constants on your
> system, and rerun with the constants tune.pl generates.

I'll have a look at that. What do we think about "magic tuning
parameters" in algorithms? My feeling is that anything dependent on
e.g. cache size (or number of processors, etc.) really needs to be
determined at run time to be useful to most people.
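
For what it's worth, on Linux with glibc those numbers can be queried at
run time; a minimal sketch (note that the _SC_LEVEL*_CACHE_SIZE names are
glibc extensions, not POSIX, so this isn't fully portable):

    #include <unistd.h>
    #include <iostream>

    // Query data-cache sizes and CPU count at run time (glibc extensions;
    // sysconf returns -1 or 0 where the value isn't known).
    int main()
    {
        long l1   = sysconf(_SC_LEVEL1_DCACHE_SIZE);
        long l2   = sysconf(_SC_LEVEL2_CACHE_SIZE);
        long cpus = sysconf(_SC_NPROCESSORS_ONLN);
        std::cout << "L1 data cache: " << l1   << " bytes\n"
                  << "L2 cache:      " << l2   << " bytes\n"
                  << "online CPUs:   " << cpus << "\n";
    }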

> If I understand you correctly you're sorting 2M elements over a 28-bit
> range.

Something like that, yes: the values span roughly 180e6 distinct
integers, which is just under 2^28.

> Most of my performance tests are with 40M elements over a 32-bit
> range, and my experience is that the speedup gradually increases between 10K
> and 10M elements, in roughly this fashion:
> 10K no difference
> 100K 10% faster
> 1M 20% faster (not too far from your result)
> 10M 30% faster
> (and 50+% speedup on some systems or with some types of data and large
> datasets)

Hmm. Actually my application more typically sorts on the order of
1e4 to 1e5 elements, within a smaller range.

> I assume you are comparing times that the application displays, not total
> time which includes file I/O.

This is the elapsed (wall-clock) time between calling SpreadSort and it
returning. No I/O.
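
Concretely it's just a gettimeofday() pair around the sort call, along
these lines (the std::sort call below is where the SpreadSort call goes
in the real test; the data generation here is only a stand-in):

    #include <sys/time.h>
    #include <algorithm>
    #include <cstdio>
    #include <cstdlib>
    #include <vector>

    // Wall-clock seconds via gettimeofday() (POSIX, microsecond resolution).
    static double now_seconds()
    {
        timeval tv;
        gettimeofday(&tv, 0);
        return tv.tv_sec + tv.tv_usec / 1e6;
    }

    int main()
    {
        // Stand-in data: ~2M ints in roughly the +/-90e6 range.
        std::vector<int> data(2000000);
        for (std::size_t i = 0; i < data.size(); ++i)
            data[i] = std::rand() % 180000001 - 90000000;

        double t0 = now_seconds();
        std::sort(data.begin(), data.end());   // SpreadSort call goes here
        double t1 = now_seconds();
        std::printf("sort took %.3f s\n", t1 - t0);
    }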

> If it isn't proprietary data, I'd also be
> willing to look at your testcase to see if there is something I can improve
> in either my algorithm or your testing of it.

OK, I've put a 14 Mbyte file called placenames.dat.bz2 in the directory
http://chezphil.org/tmp/ (I've not written the URL in full to stop
search engines from wasting their time on it). When un-bzipped you'll
find a tab-delimited 3-column text file. I got basically the same
results using the latitude and longitude columns.
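
For reference, the ints I sorted were made by parsing one column out of
that file and scaling decimal degrees to units of 1e-6 degrees, roughly
like this (which column holds the latitude is an assumption here, so
adjust the field index to match the actual file):

    #include <cstdio>
    #include <cstdlib>
    #include <fstream>
    #include <sstream>
    #include <string>
    #include <vector>

    // Read a tab-delimited file and scale one decimal-degrees column to
    // ints in +/-90e6.  Assumes name, latitude, longitude column order.
    int main()
    {
        std::ifstream in("placenames.dat");
        std::vector<int> latitudes;
        std::string line;
        while (std::getline(in, line)) {
            std::istringstream fields(line);
            std::string name, lat, lon;
            std::getline(fields, name, '\t');
            std::getline(fields, lat,  '\t');
            std::getline(fields, lon,  '\t');
            latitudes.push_back(static_cast<int>(std::atof(lat.c_str()) * 1e6));
        }
        std::printf("read %lu values\n", (unsigned long)latitudes.size());
    }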

> Spreadsort is designed for large n. If you have small n, std::sort is
> better, and between 1K and 1M, the speedup isn't much. If you have 100M
> elements, then it starts to make a significant difference. That's the
> reason for the initial check to see if the number of elements is less than
> some constant value, in which case it uses std::sort.
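
Right; so in outline I imagine the entry point does something like the
following (min_sort_size and spread_sort_impl are placeholder names of
mine, not the library's real identifiers):

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    // Sketch of the small-n guard described above: below some tuned
    // threshold fall back to std::sort, otherwise do the spread-sort work.
    const std::size_t min_sort_size = 3000;   // made-up threshold

    // Stub standing in for the real spread-sort pass, so this compiles
    // on its own.
    void spread_sort_impl(std::vector<int>& v)
    {
        std::sort(v.begin(), v.end());
    }

    void sort_ints(std::vector<int>& v)
    {
        if (v.size() < min_sort_size)
            std::sort(v.begin(), v.end());
        else
            spread_sort_impl(v);
    }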

I'm unsure how many real-world problems there are that sort tens or
hundreds of millions of elements in main memory. 1e8 elements take
roughly a gigabyte if each has just one pointer associated with it. Sorting
this size of dataset from disk is a more common, but entirely
different, problem. (But of course there are whole problem domains out
there that I don't know anything about...)

Phil.

