Boost logo

Boost :

From: Malte Skarupke (malteskarupke_at_[hidden])
Date: 2020-01-23 18:34:56


> If I understood correctly, your new algorithm increases the requirement
> for Boost.Sort to C++17 from C++11. I don't see a good reason for doing
> that. You can replace if constexpr and the generic lambdas in your
> implementation with other code constructs that work on C++11.

It only requires C++17 if you want to use ska_sort. The other algorithms remain at C++11. I agree that this is a little unfortunate, but when I briefly tried to get rid of "if constexpr" in the code, it turned out to be a giant pain. It would also probably make the code compile much more slowly. If you are trying to sort nested types, like vector<vector<pair<string, int>>>, compile time can explode without some of the short circuiting that I'm doing using if constexpr. (I can explain why it explodes if you're interested, but for now I'll just say that it works different from comparison based sorting and you have to instantiate new code at every level of depth) I know I can do the same short circuiting without if constexpr if I put in enough work, but I think the code quality would suffer a lot, making the code much harder to read and maintain. (not that it's simple as is, but it would go from "hard to read" to "impossible to read")

> Regarding the interface, shouldn't it follow the example of spreadsort?

The main benefit of this over spreadsort is that it's more generic. The spreadsort interface is hard to generalize to other types. For example it allows you to provide your own shift operator. But how do you provide a shift operator when trying to sort a pair<string, int>? When you get to the end of the string, you'd have to start providing bits from the int, but that's going to be annoyingly complicated to write. (especially since I'm not sure if spreadsort works on byte boundaries)

It gets even more complicated when trying to sort a variant<string, int>. Because now the bits depend on which value of the variant is active, but you don't want to have to check that over and over again.

The interface was very important for making this more general. When it comes to this kind of decisions, I have to make as many of them invisible for the user as possible, because only the internals of the sorting algorithm can know which bits to sort on next.

About providing a single header: That is definitely what I would do as a standalone library, but I think in boost it's OK to split this into multiple headers. Users can include boost/sort/ska_sort/ska_sort.hpp which includes all necessary headers. So usually they don't care that it's multiple headers internally. And if you want slightly faster compile time, you can just include ska_sort_base.hpp without the headers that provide tuple or variant implementations. (this is currently not part of the documentation because I'm not sure if I want to provide that forever)

So why make this part of boost instead of a standalone header? Basically I like having the assurance that it'll be kept working for a long time. I'm willing to do that and to put future work into this, but having it in boost ensures that that continues to happen even after I can no longer provide support.

-- 
  Malte Skarupke
  malteskarupke_at_[hidden]

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk