Boost logo

Boost-Build :

From: Jurko Gospodnetiæ (jurko.gospodnetic_at_[hidden])
Date: 2008-04-11 17:25:05


   Hi.

>> Time to change my vote and say that the patch should be applied to
>> the trunk. :-) And possibly a test added based on the latest example to
>> guard against any regressions. Volodya, should I do this or should we
>> wait for more opinions?
>
> I actually wanted to figure out why the current algorithm performs that
> bad, but I'll have little time next week, and that's more of a theoretical
> question. I think getting this in is best at this point. Thanks!

   Patch itself has been committed.

   Changeset link: http://svn.boost.org/trac/boost/changeset/44195

   Don't have time to add the test right now, but will try to do it
today or later this weekend.

   BTW, I've thinking over the original merge sort algorithm a bit
(while waiting in my car for a friend to finish her shopping :-), so
take my thoughts with a grain of salt... :-)) and something does seem
strange about it... Merge sort gets its logarithmic behavior by
splitting the original sequence into two separate smaller ones, sorting
them separately and then merging them back. However, the original
list_sort() did not quite do it that way but instead split the list into
two sequences (using a heuristic) and then merged them both together and
repeated those two steps until the list got sorted. In the problematic
1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2 case it would do the
following:

split (requiring a pass through the whole list):
1,2,1,2,1,2,1,2,1,2,1,2,1,2
1,2,1,2,1,2,1,2,1,2,1,2

merged them together (requiring a pass through the whole list):
1,1,2,1,2,1,2,1,2,1,2,1,2,2,1,2,1,2,1,2,1,2,1,2,1,2

split (requiring a pass through the whole list):
1,1,2,1,2,1,2,1,2,1,2,1,2
1,2,1,2,1,2,2,1,2,1,2,1,2

merged them together (requiring a pass through the whole list):
1,1,1,2,1,2,1,2,2,1,2,1,2,1,2,2,1,2,1,2,1,2,1,2,1,2

...

   As you can see, the algorithm is not doing much progress each step
and is pretty much going to need as n (or n/2) such 2-phase steps, each
of the steps requiring 2 linear (n) passes... Merge sort really should
not merge the sequences in each step but only after such sequences get
sorted themselves.

   qsort() has similar worst-case scenarios (e.g. reverse sorted
sequences if I remember correctly) which require O(n^2) steps, however
my bet would be that the standard library algorithm had been well
implemented and makes such cases highly unlikely and handled differently
when detected.

   Hope this helps.

   Best regards,
     Jurko Gospodnetiæ


Boost-Build list run by bdawes at acm.org, david.abrahams at rcn.com, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk