Boost logo

Boost :

Subject: Re: [boost] Integer sorting
From: Steven Ross (spreadsort_at_[hidden])
Date: 2019-01-21 14:47:34


Fenil,

Based on the description, this looks like spreadsort without the worst case
analysis, and with a new swapping optimization (I know there is room for
improvement in the swapping). I expect this algorithm will perform
significantly worse than std::sort (or pdqsort) in the worst-case
situations, without applying similar worst-case analysis. spreadsort is
comparable to std::sort in the worst case distribution for spreadsort
because of careful analysis and threshold optimization. If you tweaked the
spreadsort constants
<https://github.com/boostorg/sort/blob/master/include/boost/sort/spreadsort/detail/constants.hpp>
I
think you should be able to get comparable performance to your algorithm,
minus the impact of your swapping optimization.
How does your swapping optimization perform on mostly-sorted data?
spreadsort does well in this common case relative to std::sort, which is
somewhat tricky with in-place radix sorting.
You should try seeing how your algorithm performs on the distributions in
https://github.com/boostorg/sort/blob/master/example/alrbreaker.cpp and
https://github.com/boostorg/sort/blob/master/example/binaryalrbreaker.cpp,
tweaking those to hit your worst case.
What's the memory overhead?
What's the threshold size at which it starts becoming faster than std::sort
and pdqsort? comparison-based sorting is surprisingly fast on small lists
of integers that fit on L1 cache.
How well does it handle common prefixes in string sorting? string_sort is
optimized to handle that case extremely quickly. It's a serious
performance issue if you don't optimize for it.

I wish more people realized the generality of these algorithms; if you
really want a fast sort, you can use spreadsort (or an equivalent
algorithm), you just need to define a way to index into the key. This was
published a while ago in Engineering Radix Sort, but most people use
comparison sorts because they're easier to use (and for most applications,
sorting compute time is minor).

On Mon, Jan 21, 2019, 3:29 AM Francisco José Tapia via Boost <
boost_at_[hidden]> wrote:

> Hi Fenil,
>
>
> I am Francisco Tapia, one of the maintainers of the Sort Library.
>
> I had been watching your code, and find it interesting. But give me a week
> to say you something. These days I am very busy, and I expect to have time
> the next weekend, and examine and test your code.
>
> Yours
>
>
> Francisco Tapia
>
> El lun., 21 ene. 2019 a las 8:41, Richard Hodges via Boost (<
> boost_at_[hidden]>) escribió:
>
> > On Sat, 29 Dec 2018 at 03:36, Fenil Mehta via Boost <
> boost_at_[hidden]
> > >
> > wrote:
> >
> > > I have written a sorting algorithm which is way faster than std::sort
> for
> > > all array size.
> >
> >
> > "Extraordinary claims require extraordinary evidence" - Carl Sagan
> >
> > https://en.wikipedia.org/wiki/Sagan_standard
> >
> >
> >
> >
> > > The link is as follows:
> > > https://github.com/fenilgmehta/Fastest-Integer-Sort
> > >
> > > Some guidance would be appreciated on how to improve the project
> > structure
> > > and the steps to get it included in boost.
> > >
> > > Regards,
> > > Fenil Mehta
> > >
> > > _______________________________________________
> > > Unsubscribe & other changes:
> > > http://lists.boost.org/mailman/listinfo.cgi/boost
> > >
> >
> >
> > --
> > Richard Hodges
> > hodges.r_at_[hidden]
> > office: +442032898513
> > home: +376841522
> > mobile: +376380212 (this will be *expensive* outside Andorra!)
> > skype: madmongo
> > facebook: hodges.r
> >
> > _______________________________________________
> > Unsubscribe & other changes:
> > http://lists.boost.org/mailman/listinfo.cgi/boost
> >
>
> _______________________________________________
> Unsubscribe & other changes:
> http://lists.boost.org/mailman/listinfo.cgi/boost
>


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk