|
Boost : |
Subject: Re: [boost] [SORT] Parallel Algorithms
From: Steven Ross (spreadsort_at_[hidden])
Date: 2015-04-05 23:13:17
On Sun, Mar 29, 2015 at 6:25 AM Francisco José Tapia <fjtapia_at_[hidden]>
wrote:
> Hi Steven
>
> Thanks by your message.
>
> Until now , I had been focused mainly in the algorithms, and I didn't
> dedicate many time to others things as the test format and the integration
> with boost::sort.Now, the first things to do are :
>
> - Try to detect when the data are ordered,
>
Just verify that each successive element is not less than the previous. If
inputs are unique (or completely identical if they compare the same), then
you can compare results with those of std::sort.
> - Revise in deep your library and code , in order to adapt the test and
> benchmarks to the procedures used in your library
>
> About the object used in the benchmarks, we can do something simple.
> Reburcio is a very small class in the file
> Benchmarks/GCC/algorithm/reburcio.hpp. Please rewrite and send me, and
> only
> need to recompile the benchmarks.
>
I suggest renaming the variables to be clearer, but here's the idea:
template <uint32_t NN>
struct intarray
{
uint32_t M[NN];
bool operator < ( const intarray &R) const{
for (int i = 0; i < NN; ++i) {
if (M[i] != R.M[i]) {
return M[i] < R.M[i];
}
}
return false;
}
};
Use <BOOST_ROOT>/libs/sort/example/randomgen.cpp (b2 libs/sort/randomgen
--tune will build the binary in the libs/sort dir) to generate randomized
input files, and memcpy them into arrays of intarray to fill them with the
data to sort. This will emulate the large non-vector objects the indirect
sorting approach you've designed is optimized for, and will fill the entire
object with randomized contents, along with having a full (default) copy
constructor.
You can use strings in the place of intarray also for testing strings; I
randomize them by just using the >> operator on strings to read them in; it
will automatically break on some characters, causing the strings to have
randomized length.
>
>
> *About the parts to include*
>
> With the parallel unstable sort, all the algorithms I examined have the
> same approach, and due this, similar performances. The range of decision is
> small.
>
> The goal is provide algorithms independent of any library or Operating
> System, fast , robust and easy to use. The idea of to do the same than a
> company is the origin of many Free SW, think about Oracle and MySQL,
> Internet Explorer and Firefox. Even the C++ standard, is to do the same
> than many companies are doing since many years ago, each with its own way.
> The final users are grateful because simplify our work.
>
> TBB is available in Windows, Linux and Os X. With others operating system
> ( by example Android and all the real time OS), you must recompile all the
> source code, and I have not clear about the dynamic linking of TBB in these
> operating systems. Many small machines don't have task scheduler, but have
> threads. To force to recompile TBB for to use a parallel sort is like force
> to rent a bus, for to transport only one person.
>
Who wants to do a parallel sort on Android? The OS often only gives you
one core (depending on priorities), and it would burn the battery at
ridiculous speed. I just don't see the use case, unless you can provide
some advantage in some fairly common use case, and it would have to fully
match tbb's features to within a few percent, including efficiency on
already-sorted input.
> Our code have similar performance, small code, independent of any Operating
> System, of any other code or library. If you are C++11 compliant you have
> parallel sort. I think, we must include sort, parallel sort, stable sort
> and parallel stable sort. Perhaps, too, sample sort, but it's less
> important
>
> Your stable sort and parallel stable sort look promising, and sample sort
is a possibility. I recommend concentrating on them for now.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk