Interest check: Boost.Algorithm.Statsort — O(n log log n) sorting
On Mar 9, 2026, at 3:08 PM, Peter Taraba via Boost <boost@lists.boost.org> wrote:
Hi Boost community,
I'd like to gauge interest in a new algorithm library: **boost::algorithm::statsort**, a sorting algorithm with O(n log log n) average complexity on smooth distributions (uniform, Gaussian, exponential), compared to O(n log n) for std::sort.
I am interested in learning more about this. Thanks for the link to the paper. However, I suspect that “Boost.Sort” (https://github.com/boostorg/sort) might be a better place. But if that doesn’t work out, we can discuss including it in Boost.Algorithm. — Marshall (Boost.Algorithm maintainer)
**The algorithm** The idea is simple: instead of comparing elements, we use their values to place them directly into one of sqrt(n) buckets via linear interpolation, then recurse. Using sqrt(n) (variable) rather than a fixed bucket count reduces the recursion depth from O(log n) to O(log log n). This is described in detail in the accompanying publication: Peter Taraba, "Why would you sort when you know where things approximately belong?", March 2025.
https://www.authorea.com/users/495213/articles/1240078-why-would-you-sort-wh...
**Measured performance** (GCC -O3, x86-64 Linux): n=1,000,000 uniform: 84ms → 28ms (3.0x faster) n=1,000,000 Gaussian: 82ms → 34ms (2.4x faster) n=1,000,000 exponential: 84ms → 24ms (3.5x faster) n=10,000,000 uniform: 990ms → 466ms (2.1x faster)
**API** (Boost-style, header-only, C++17): // Container overload std::vector<double> v = { ... }; boost::algorithm::statsort(v); // Iterator overload boost::algorithm::statsort(v.begin(), v.end()); Supports any std::is_arithmetic<T> type (int, float, double, etc.)
Is there interest in this for Boost.Algorithm? Happy to post the preliminary submission to the vault and refine based on feedback.
Best, Peter https://orcid.org/0000-0002-8199-3723 https://randommathguy.blogspot.com/
_______________________________________________ Boost mailing list -- boost@lists.boost.org To unsubscribe send an email to boost-leave@lists.boost.org https://lists.boost.org/mailman3/lists/boost.lists.boost.org/ Archived at: https://lists.boost.org/archives/list/boost@lists.boost.org/message/7QK457FV...
Hey Peter, I am interested, but I need more information about `statsort`: Could you give more examples for the algorithm? I am confused about real use cases in the Boost ecosystem. What about its trade-offs? On another note: I like paper by the way, its got the graph, examples, and benchmarks. Which makes it quantifiable. Best, Amlal
On 3/9/26 23:08, Peter Taraba via Boost wrote:
Supports any std::is_arithmetic<T> type (int, float, double, etc.)
I don't think I've ever sorted a container of plain numbers in a real program. However, I have sorted by complex objects by their numeric attributes fairly often, i.e.: struct my_complex_type { std::string name; int z; }; std::vector<my_complex_type> my_vector; std::sort( my_vector.begin(), my_vector.end(), [](auto const &a, auto const &b) { return a.z < b.z; }); Can statsort work with this kind of object? -- Rainer Deyke - rainerd@eldwood.com
On 3/11/26 19:59, Peter Taraba via Boost wrote:
I've added projection overloads to statsort that address exactly the pattern you described: boost::algorithm::statsort(my_vector, [](const my_complex_type& x) { return x.z; }); The projection must return an arithmetic type, which is then used as the sort key. Both the container and iterator interfaces support it: // Container overload boost::algorithm::statsort(my_vector, [](const my_complex_type& x) { return x.z; }); // Iterator overload boost::algorithm::statsort(my_vector.begin(), my_vector.end(), [](const my_complex_type& x) { return x.z; }); The updated code is on GitHub: https://github.com/drpt78/statsort Does this cover your use case? Happy to hear if there are other patterns I should consider.
Yes, this is exactly what I was looking for. -- Rainer Deyke - rainerd@eldwood.com
participants (4)
-
Amlal El Mahrouss -
Marshall Clow -
Peter Taraba -
Rainer Deyke