Boost logo

Boost :

Subject: Re: [boost] Integer sorting
From: Fenil Mehta (fenilgmehta_at_[hidden])
Date: 2019-02-19 18:59:56


While working on the corrections suggested on my sorting project, I tried
to test the swapping optimization on timsort and compared it with other
sorting algorithms.
I used "benchmark_objects.cpp" file in the boost library for the testing
purpose.

I have attached the results in this email.

Summary -
When arrays comparison is done by calculating the sum of all the numbers of
the array, following was the result:
1) timsort with swapping optimization gives the best result, total time =
17.3, average time = 1.15
2) flat_stable_sort gave the second best result, total time = 33 seconds,
average time = 2.2 seconds

When arrays comparison is done between the first element of the array, as a
key, following was the result:
1) timsort with swapping optimization gives the best result, total time =
3.52 seconds, average time = 0.23 seconds
2) pdqsort gave the second best result, total time = 4.6 seconds, average
time = 0.31 seconds

I have not yet pushed the swapping function on my GitHub. Please let me
know if any queries.

Regards,
Fenil Mehta

On Tue, Feb 12, 2019 at 9:21 AM Steven Ross via Boost <boost_at_[hidden]>
wrote:

> On Sat, Feb 9, 2019 at 11:45 PM degski via Boost <boost_at_[hidden]>
> wrote:
> >
> > On Sat, 9 Feb 2019 at 14:39, Francisco José Tapia via Boost <
> > boost_at_[hidden]> wrote:
> >
> > > The results con ska_sort are in the attached file numbers_02.txt
> > >
> >
> > Thanks for adding this [and it was worthwhile I would say], this is a
> great
> > table and shows ska_sort is weak on sorted data [maybe that's fixable],
> but
> > otherwise better than [Boost] Spreadsort and Fastest-Integer-Sort [sic]
> and
> > overall fastest on random data.
> >
> > I say you the same things as Fenil. This is an external test of the
> > > algorithm. The internal test is other question very different ...
> > >
> >
> > I don't understand why the "internal test" [unless you mean testing for
> > bugs] has any relevance.
>
> It looks like ska_sort is a radix sort designed mostly for the
> best-case, and that if you feed it the alrbreaker or binaryalrbreaker
> distributions it will perform poorly (though not as poorly as radix
> implementations with no fallback or that fallback to insertionsort).
> There might be some counting and swapping optimizations here to copy
> into spreadsort (I see block operations), though those might not work
> if the items being sorted are references or pointers.
>
> Mostly-sorted data is a common case to take seriously, and it took
> only modest tweaking of spreadsort to do well there.
>
> Based on Francisco's numbers, I should retune spreadsort to take
> advantage of the higher speed of pdqsort vs std::sort, at least if I
> want to optimize for 64-bit integer sorting on modern processors.
>
> https://www.boost.org/doc/libs/1_67_0/boost/sort/spreadsort/detail/constants.hpp
> A higher log_min_split_count (especially) and log_finishing_count is
> probably appropriate.
>
> >
> > degski
> > PS: Please don't top-post, it's just so messy.
> > --
> > *“If something cannot go on forever, it will stop" - Herbert Stein*
> >
> > _______________________________________________
> > Unsubscribe & other changes:
> http://lists.boost.org/mailman/listinfo.cgi/boost
>
> _______________________________________________
> Unsubscribe & other changes:
> http://lists.boost.org/mailman/listinfo.cgi/boost
>




Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk