Hi all,
 
Following the feedback on this thread, I've pushed two substantive updates to the statsort repository:
https://github.com/drpt78/statsort

1. Boost-convention directory layout
The repository is now structured as a proper Boost library:
  include/boost/algorithm/statsort.hpp   (canonical install path)
  test/statsort_test.cpp
  test/Jamfile.v2                        (b2 / Boost CI support)
  example/statsort_example.cpp
  benchmark/statsort_bench.cpp
  CMakeLists.txt
The Jamfile covers all test cases, so the library can now be validated through Boost's b2-based CI pipeline as well as CMake.

2. Imbalance fallback — now actually implemented
The README previously claimed that statsort falls back to std::sort on pathologically skewed inputs. That claim was not reflected in the code. It is now.
After the bucket-counting pass, both the plain and projection overloads check whether any single bucket received more than 50% of the elements. If so, std::sort is called on that subrange immediately and recursion stops. The threshold is exposed as a named constant (STATSORT_IMBALANCE_RATIO) in the detail namespace. This guarantees O(n log n) worst-case behaviour rather than O(n²) degradation on adversarial inputs such as spike distributions or tightly clustered data.
Four new test cases exercise this path directly: spike distribution, two-cluster input, geometric sequence, and the projection-overload variant of the spike case. All 27 tests pass.

3. Reproducible benchmarks
The benchmark source is now included under benchmark/statsort_bench.cpp. It covers uniform, Gaussian, exponential, nearly-sorted, and spike distributions, and uses median-of-7 timing to reduce noise. The spike row makes the fallback behaviour explicitly visible in the numbers. Build with -DSTATSORT_BUILD_BENCHMARKS=ON.
Happy to hear any further feedback.
 
Hope everyone is going to enjoy Albert's Pi day tomorrow and have a nice weekend.
 
Best,
Peter & Claude AI
https://orcid.org/0000-0002-8199-3723
 
 
Sent: Wednesday, March 11, 2026 at 2:15 PM
From: "Rainer Deyke via Boost" <boost@lists.boost.org>
To: boost@lists.boost.org
Cc: "Rainer Deyke" <rdeyke@gmail.com>
Subject: [boost] Re: Interest check: Boost.Algorithm.Statsort — O(n log log n) sorting
On 3/11/26 19:59, Peter Taraba via Boost wrote:
> I've added projection overloads to statsort that address exactly the pattern you
> described:
> boost::algorithm::statsort(my_vector,
> [](const my_complex_type& x) { return x.z; });
> The projection must return an arithmetic type, which is then used as the sort
> key. Both the container and iterator interfaces support it:
> // Container overload
> boost::algorithm::statsort(my_vector,
> [](const my_complex_type& x) { return x.z; });
> // Iterator overload
> boost::algorithm::statsort(my_vector.begin(), my_vector.end(),
> [](const my_complex_type& x) { return x.z; });
> The updated code is on GitHub: https://github.com/drpt78/statsort
> Does this cover your use case? Happy to hear if there are other patterns I
> should consider.

Yes, this is exactly what I was looking for.


--
Rainer Deyke - rainerd@eldwood.com

_______________________________________________
Boost mailing list -- boost@lists.boost.org
To unsubscribe send an email to boost-leave@lists.boost.org
https://lists.boost.org/mailman3/lists/boost.lists.boost.org/
Archived at: https://lists.boost.org/archives/list/boost@lists.boost.org/message/4SDR6QO6NMEXOXYNWYQ64K4T2UVADS5L/