I'm Peter Taraba, and I've been developing a new sorting algorithm called statsort that I'd love to explore adding to Boost.Sort.
Hello Peter, I've made use of boost::sort in the past, and contributed some minor things to that over time. In recent projects I've been particularly interested in sorting very large (1 to 10 billion) std::vector of uint32_t or uint64_t using multiple threads and minimal memory overhead. I think this data is broadly uniform-ish, but often with multiple dense clusters of similar values. (3D point clouds) Might be interesting to see how well the bucketing approach applies to that? The key developer for boost::sort is Francisco Tapia, his email address is available via the fjtapia github profile page. The main maintainer of boost::sort tends to be responsive to boost::sort github issues and pull requests. As mentioned previously I'd be interested in broader benchmarking results including what boost::sort provides. Also of interest is the relative memory overhead - quicksort has the advantage of essentially being in-place, after all. One thing I'd mention is that with std::sort and boost::sort the formulation is in terms of a bool comparator function, std::less by default. But in this scheme what's needed is a "to scalar float/double reflecting the required order" function? Question is how "lumpy" a scalar it takes to go slower than the other approaches? Does it make sense to subsample the scalar and adjust the function to make it more uniform? Perhaps boost::sort ought to be extended to generating various datasets that we can use for benchmarking more systematically. Such as sampling from multiple weighted guassians, for example. - Nigel Stewart