[boost] Re: Proposing statsort for Boost.Sort

15 Mar 2026

      Hello everyone,
I'm Francisco Tapia, one of the developers of Boost Sort.
I've been testing statsort,
I used the official Boost Sort code. In the benchmarks folder, there's a
test program to measure the execution times of all single-threaded
algorithms, and another for parallel algorithms.

In the program for single-threaded algorithms, I added statsort. All
algorithms use the same dataset and measure times in the same way. There is
a shell script to run it with GCC and another to run it with CLANG. I’ve
attached the results of both on Ubuntu 25.10

==================================================================
==              B E N C H M A R K   N U M B E R S               ==
==                                                              ==
==                 G C C      C O M P I L E R                   ==
==================================================================

************************************************************
**                                                        **
**               B O O S T      S O R T                   **
**              S I N G L E    T H R E A D                **
**          I N T E G E R    B E N C H M A R K            **
**                                                        **
**        SORT OF 100 000 000 NUMBERS OF 64 BITS          **
**                                                        **
************************************************************

[ 1 ] std::sort [ 2 ] pdqsort          [ 3 ] std::stable_sort
[ 4 ] spinsort  [ 5 ] flat_stable_sort [ 6 ] spreadsort   [7] statsort

                    |      |      |      |      |      |      |      |
                    | [ 1 ]| [ 2 ]| [ 3 ]| [ 4 ]| [ 5 ]| [ 6 ]| [ 7 ]|
--------------------+------+------+------+------+------+------+------+
random              | 8.13 | 3.64 |10.56 |10.82 |11.93 | 5.26 | 5.85 |
                    |      |      |      |      |      |      |      |
sorted              | 1.68 | 0.14 | 2.96 | 0.07 | 0.07 | 0.07 | 2.70 |
sorted + 0.1% end   | 5.75 | 2.65 | 2.98 | 0.57 | 0.36 | 3.97 | 3.04 |
sorted +   1% end   |10.58 | 3.11 | 3.08 | 0.76 | 0.52 | 4.27 | 3.08 |
sorted +  10% end   | 7.37 | 3.83 | 3.98 | 1.62 | 1.53 | 5.13 | 3.74 |
                    |      |      |      |      |      |      |      |
sorted + 0.1% mid   | 3.48 | 2.68 | 3.28 | 2.24 | 2.71 | 2.74 | 3.07 |
sorted +   1% mid   | 3.30 | 2.88 | 3.32 | 2.41 | 3.28 | 2.90 | 3.33 |
sorted +  10% mid   | 4.04 | 3.75 | 4.20 | 3.63 | 4.99 | 3.68 | 3.85 |
                    |      |      |      |      |      |      |      |
reverse sorted      | 1.22 | 0.27 | 3.22 | 0.14 | 0.14 | 1.70 | 2.98 |
rv sorted + 0.1% end| 6.72 | 2.72 | 3.30 | 0.71 | 0.43 | 3.97 | 3.21 |
rv sorted +   1% end| 4.71 | 3.06 | 3.38 | 0.86 | 0.58 | 4.31 | 3.28 |
rv sorted +  10% end| 4.52 | 3.86 | 4.27 | 1.74 | 1.58 | 5.30 | 3.81 |
                    |      |      |      |      |      |      |      |
rv sorted + 0.1% mid| 6.00 | 2.67 | 3.49 | 2.73 | 3.25 | 2.48 | 3.20 |
rv sorted +   1% mid| 3.81 | 2.91 | 3.59 | 2.92 | 3.83 | 2.73 | 3.33 |
rv sorted +  10% mid| 4.51 | 3.69 | 4.41 | 4.13 | 5.29 | 3.72 | 3.93 |
--------------------+------+------+------+------+------+------+------+

==================================================================
==              B E N C H M A R K   N U M B E R S               ==
==                                                              ==
==               C L A N G      C O M P I L E R                 ==
==================================================================

************************************************************
**                                                        **
**               B O O S T      S O R T                   **
**              S I N G L E    T H R E A D                **
**          I N T E G E R    B E N C H M A R K            **
**                                                        **
**        SORT OF 100 000 000 NUMBERS OF 64 BITS          **
**                                                        **
************************************************************

[ 1 ] std::sort [ 2 ] pdqsort          [ 3 ] std::stable_sort
[ 4 ] spinsort  [ 5 ] flat_stable_sort [ 6 ] spreadsort   [7] statsort

                    |      |      |      |      |      |      |      |
                    | [ 1 ]| [ 2 ]| [ 3 ]| [ 4 ]| [ 5 ]| [ 6 ]| [ 7 ]|
--------------------+------+------+------+------+------+------+------+
random              | 7.98 | 3.88 | 8.92 | 7.45 | 8.92 | 4.74 | 5.44 |
                    |      |      |      |      |      |      |      |
sorted              | 1.67 | 0.15 | 5.35 | 0.07 | 0.07 | 0.08 | 3.06 |
sorted + 0.1% end   | 5.45 | 2.72 | 5.37 | 0.73 | 0.41 | 3.69 | 3.24 |
sorted +   1% end   | 9.53 | 3.23 | 5.47 | 1.25 | 0.62 | 3.92 | 3.27 |
sorted +  10% end   | 6.98 | 4.05 | 6.27 | 1.53 | 1.38 | 4.72 | 3.82 |
                    |      |      |      |      |      |      |      |
sorted + 0.1% mid   | 3.13 | 2.77 | 7.42 | 5.82 | 3.02 | 2.24 | 3.24 |
sorted +   1% mid   | 3.10 | 3.01 | 7.88 | 6.30 | 4.24 | 2.41 | 3.33 |
sorted +  10% mid   | 3.90 | 3.84 | 9.16 | 7.33 | 6.26 | 3.11 | 3.88 |
                    |      |      |      |      |      |      |      |
reverse sorted      | 1.18 | 0.28 | 5.59 | 0.14 | 0.14 | 1.33 | 3.30 |
rv sorted + 0.1% end| 6.29 | 2.80 | 5.86 | 0.92 | 0.48 | 3.33 | 3.48 |
rv sorted +   1% end| 4.41 | 3.18 | 5.90 | 1.41 | 0.70 | 3.65 | 3.50 |
rv sorted +  10% end| 4.37 | 4.07 | 6.68 | 1.72 | 1.45 | 4.74 | 3.99 |
                    |      |      |      |      |      |      |      |
rv sorted + 0.1% mid| 5.61 | 2.80 | 7.65 | 6.30 | 3.78 | 1.78 | 3.58 |
rv sorted +   1% mid| 3.67 | 3.12 | 8.10 | 6.87 | 5.06 | 2.14 | 3.63 |
rv sorted +  10% mid| 4.71 | 4.11 | 9.58 | 8.23 | 7.06 | 3.15 | 4.14 |
--------------------+------+------+------+------+------+------+------+

In my opinion, looking at the results, statsort is a good algorithm, but it
doesn’t add anything significant to the library, as can be seen in the
results tables.

I must say I’m glad to see people doing new things and trying to improve on
what we already know. Coming up with a new algorithm is a task that’s
equally hard, demanding, and exciting, which is why I encourage Peter to
keep coming up with new ideas and new algorithms. Good things rarely work
out on the first try

But this is like Formula 1: before the season starts, all the cars look
magnificent, and with the new upgrades, they’re going to be the best.
Reality sets in as soon as the races begin, and we start measuring lap
times.

Best regards,

Francisco Tapia

El dom, 15 mar 2026 a las 5:12, Nigel Stewart via Boost (<
boost@lists.boost.org>) escribió:
...
...
...
I'm Peter Taraba, and I've been developing a new sorting algorithm
called statsort that I'd love to explore adding to Boost.Sort.
Hello Peter,
I've made use of boost::sort in the past, and contributed some minor
things to that over time.
In recent projects I've been particularly interested in sorting very
large (1 to 10 billion) std::vector of uint32_t or uint64_t using
multiple threads and minimal memory overhead.
I think this data is broadly uniform-ish, but often with multiple
dense clusters of similar values.  (3D point clouds)
Might be interesting to see how well the bucketing approach applies to
that?
The key developer for boost::sort is Francisco Tapia, his email
address is available via the fjtapia github profile page.
The main maintainer of boost::sort tends to be responsive to
boost::sort github issues and pull requests.
As mentioned previously I'd be interested in broader benchmarking
results including what boost::sort provides.
Also of interest is the relative memory overhead - quicksort has the
advantage of essentially being in-place, after all.
One thing I'd mention is that with std::sort and boost::sort the
formulation is in terms of a bool comparator function, std::less by
default.
But in this scheme what's needed is a "to scalar float/double
reflecting the required order" function?
Question is how "lumpy" a scalar it takes to go slower than the other
approaches?
Does it make sense to subsample the scalar and adjust the function to
make it more uniform?
Perhaps boost::sort ought to be extended to generating various
datasets that we can use for benchmarking more systematically.
Such as sampling from multiple weighted guassians, for example.
- Nigel Stewart
_______________________________________________
Boost mailing list -- boost@lists.boost.org
To unsubscribe send an email to boost-leave@lists.boost.org
https://lists.boost.org/mailman3/lists/boost.lists.boost.org/
Archived at:
https://lists.boost.org/archives/list/boost@lists.boost.org/message/2VK7VP63...

[boost] Re: Proposing statsort for Boost.Sort

Francisco José Tapia