Subject: Re: [boost] Fwd: [SORT] Parallel Algorithms
From: Steven Ross (spreadsort_at_[hidden])
Date: 2015-06-10 22:42:53
On Mon, Jun 8, 2015 at 4:44 PM Francisco JosÃ© Tapia <fjtapia_at_[hidden]>
> I changed the name of the examples. Now are :
> - example_single_thread
> - example_default_threads
> - example_100_threads
> I will prepare a sort benchmark for to run in 5 minutes. I will do a new
> program, because if begin to include options in the programs, can be long
> and can be confuse for the users.
<= 5 minutes. If you can create a consistent benchmark that runs in a
minute, even better, but I've usually found really short benchmarks (<1
minute) tend to be noisy.
> Timsort in Windows, sort the data, but the time and memory used are much
> greater than the used in Linux. I think the best option is to remove of the
> Windows benchmarks.
Sounds good to me.
> I will examine the licenses, and I think only need to specify the license
> of each part of the code because I only use, don't modify.
If you're willing to give me a link for all the licenses for software you
haven't written but are including along with the code, I'd appreciate it,
so I can double-check.
> The benchmarks programs are different in Windows and Linux, because the
> algorithms compared are different. In Windows don't have the GCC
> algorithms, and in Linux don't have the PPL algorithms.
Ok; please try to share everything that's practical.
> About the int_array objects. The constructor and the operator = function
> with an 64 bits integer as parameter, are functions used before, but don't
> used now. The operator << is the same.
Please delete them then. They confused me, and I don't want to maintain or
answer questions about any unused code.
> The reason for to implement a
> complex comparison function, is because in a sort algorithm the two main
> functions are move and compare. I want to implement the worst case, big
> objects hard to move and hard to compare.
I'm ok with that, but please mention it in your documentation if you don't
> About the optimization of the code I use the -O3 optimization parameter
> for all the code. The TBB parallel stable sort have 4 different versions (
> Open-MP, CilkPlus, TBB high level and TBB low level), the fastest is TBB
> low level and due this is the used in the benchmarks.
> In the attached file (parallel_objects.txt) you can see the results in my
> machine with all kind of data (sorted, reverse sorted, random, random
> %50000000, random % 10000, and equal elements).
Those numbers look good! If the numbers on my test system (dual-boot
Windows and Linux) look anything like that, I say it's good enough to
include. In particular, I appreciate that you added the already-sorted
optimization; that's a surprisingly common case. I've seen duplicate sorts
(they sorted the data 2 times, every time) deployed to hundreds of
thousands of computers; nobody noticed until I looked because the sort
routine had an optimization for already-sorted data, and the duplicate sort
only took 1% of the total sort time.
> About to use the code for to parallelize spreadsort, feel free for to use.
> And if you need help , need any question or have any problem, please, say
> When you decide to do , say me and I will prepare a description of the two
> approach used in order to decide the best for your code, and how to
> implement. Some things are complex to understand reading only the code.
Not today. Let's get your algorithms debugged, documented, and into Boost
before we try anything that complicated.
Please let me know when you're done with your updates and are ready for me
to review the documentation and code.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk