[boost] Re: Proposing statsort for Boost.Sort

15 Mar 2026

      ...
...
I'm Peter Taraba, and I've been developing a new sorting algorithm called statsort that I'd love to explore adding to Boost.Sort.
Hello Peter,

I've made use of boost::sort in the past, and contributed some minor
things to that over time.

In recent projects I've been particularly interested in sorting very
large (1 to 10 billion) std::vector of uint32_t or uint64_t using
multiple threads and minimal memory overhead.
I think this data is broadly uniform-ish, but often with multiple
dense clusters of similar values.  (3D point clouds)
Might be interesting to see how well the bucketing approach applies to that?

The key developer for boost::sort is Francisco Tapia, his email
address is available via the fjtapia github profile page.
The main maintainer of boost::sort tends to be responsive to
boost::sort github issues and pull requests.

As mentioned previously I'd be interested in broader benchmarking
results including what boost::sort provides.
Also of interest is the relative memory overhead - quicksort has the
advantage of essentially being in-place, after all.

One thing I'd mention is that with std::sort and boost::sort the
formulation is in terms of a bool comparator function, std::less by
default.
But in this scheme what's needed is a "to scalar float/double
reflecting the required order" function?
Question is how "lumpy" a scalar it takes to go slower than the other
approaches?
Does it make sense to subsample the scalar and adjust the function to
make it more uniform?

Perhaps boost::sort ought to be extended to generating various
datasets that we can use for benchmarking more systematically.
Such as sampling from multiple weighted guassians, for example.

- Nigel Stewart