Boost logo

Boost :

Subject: Re: [boost] [SORT] Parallel Algorithms
From: Francisco José Tapia (fjtapia_at_[hidden])
Date: 2015-03-24 16:56:22


Hi ,

 In the file of this link, in my dropbox space, you have the code, the test
programs and the benchmarks with GCC.

 https://dl.dropboxusercontent.com/u/8437476/works/sort/Sort_V0.1.7z

 *FILE DESCRIPTION*

 In the root folder of the file you have a INSTRUCTIONS.txt file detailing
the content of all the folders of the file, information for to compile the
programs, where is each file, and which options and libraries you must link.

 *Test*

Contains the test programs of each class and algorithms.

 *Src*

Contains the source code of all. In the folder *src/boost/sort* there is a
file *algorithm.hpp*, which is the only include file needed for invoke the
algorithms.

 The parallel algorithms have a parameter of the NThread type which
indicate the number of threads to use. By default use the same number than
the HW threads of the machine. it's a very simple structure defined in
boost/sort/util/atomic.hpp.

 *Benchmarks*

In this folder are the benchmark code with GCC compiler under Linux x64. In
the future I will add benchmarks with other compiler and Operating Systems.

 The folders *cpp-TimSort-master*, and *parallel_stable_sort* have the C++
code of the TimSort algorithm, found in
https://travis-ci.org/gfx/cpp-TimSort, and the code of the sample sort with
TBB.

 The code had been modified, due to the four versions use the same name
space and many common names, and was not possible invoke the four versions
in the same program. The modification put each version in a different name
space.

 In the folder *GCC*, you can find 3 folders

   -

   *algorithm* : Have the code of the benchmark programs
   -

   *bin *: have the compiled and static linked version of the benchmark
   programs. In this folder you have a shell run.sh which run sequentially the
   programs and collect the results obtained in .txt files with the same name
   than the program
   -

   *Results *: in this folder you can find the result generated by the
   benchmark programs on different machines. Each machine have a folder where
   you can find the text files with the same name than the programs with the
   results, and a description.txt file with the description of the HW where
   run the benchmarks. The format used in this folder is the suggested for to
   present the results of all the machines used in the test. Here you have the
   result of my two computers, and can have a first idea about the results
   expected in other machines.

The benchmarks, like in the old version, have a version with 64 bits
numbers, and a version with objects of several sizes.

 About the memory used for of the algorithms. If you consider M the memory
used by the data, the total memory used by the algorithms are :

   -

   GCC sort → M
   -

   boost sort → M
   -

   GCC parallel_sort → M
   -

   boost parallel_sort → M
   -

   TBB parallel_sort → M

   -

   GCC stable_sort → 2 M
   -

   boost stable_sort → 1.5 M
   -

   TimSort -> 2M
   -

   GCC parallel_stable_sort → 2.5 M
   -

   TBB parallel_sort → 2 M
   -

   boost sample_sort → 2 M
   -

   boost parallel_stable_sort → 1.5 M

The algorithms use the indirect version when the size of the objects is
greater than 128 bytes.

 I am speaking with several friends, for to run the benchmarks in more
powerful machines.

 I think, the first is to check the speed. If the speed is good, we have
something useful , if not, we must decide what must do with this. If
something is useful , or we must leave all.

 If we decide continue, the next steps to do are :

Reorganize the benchmark, and include benchmarks with string objects. Your
idea about the use of files, I thing is the correct way to follow

Finish all the documentation of the code for to generate a Doxygen
documentation

Generate the web pages of documentation for the Boost project

Modify the test programs for to do according the bjam files

 If you found any problem , error or doubt, please , contact me for to
resolve

 Francisco


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk