
From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2008-06-28 14:09:14

Steven Ross wrote:
> Would there be interest in an integer_sort call that is faster than std::sort?
> Spreadsort is an in-place hybrid radix-based and comparison-based algorithm with
> better worst-case and average-case (roughly 20-50%) performance than introsort
> on random integer data.
> The worst-case advantage is algorithmic, with Spreadsort taking the better of
> O(n(square_root(bit_sizeof(data) - log(n)))) and O(nlog(n)) time.
> For 32-bit integers, that's O(n(square_root(32 - log(n)))),
> so for a log(n) of 16 (65536 items), it's 4n (plus a small constant times n),
> vs 16n for introsort.
> For larger n, the advantage vs. pure comparison-based algorithms increases.
> In effect, Spreadsort smoothly transitions from pure MSB Radix sort to std::sort
> with its data cut up by a couple iterations of MSB Radix sort
> (reducing the size of the log(n) factor),
> depending upon the distribution and size of the data.
> I could write a templated version of Spreadsort for the Boost library if there
> is interest. It would work for anything that supports the << and - operators.

Yes, this would interest me. I have a fixed-point class that it could
probably be used with. I have a feeling that you need a little bit
more than just << and - though, don't you? Hmm, what would a (pure)
radix sort need? Presumably, since yours is a hybrid method, you need
everything that a comparison sort needs and everything that a radix
sort needs. Maybe we should start by implementing a (sketch of a)
radix sort template.
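
For what it's worth, a sketch of the kind of radix sort template I
mean might look like this (names are mine, and I've used >> and &
rather than << and -; it assumes the value type behaves like an
unsigned integer, and real code would fall back to a comparison sort
for small buckets, as Spreadsort does):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sketch of an MSB radix sort template.  Hypothetical interface: the
// value type must support right-shift and masking, i.e. behave like
// an unsigned integer.  Signed or floating types would need a key
// transform first.
template <typename T>
void msb_radix_sort(std::vector<T>& v, int shift = (sizeof(T) * 8) - 8)
{
    if (v.size() < 2)
        return;
    // Distribute into 256 buckets on the current byte.
    std::vector<std::vector<T> > buckets(256);
    for (std::size_t i = 0; i < v.size(); ++i)
        buckets[(v[i] >> shift) & 0xFF].push_back(v[i]);
    // Recurse on each bucket for the next byte, then concatenate.
    std::size_t out = 0;
    for (int b = 0; b < 256; ++b) {
        if (shift > 0)
            msb_radix_sort(buckets[b], shift - 8);
        else
            std::sort(buckets[b].begin(), buckets[b].end());
        for (std::size_t i = 0; i < buckets[b].size(); ++i)
            v[out++] = buckets[b][i];
    }
}
```

It's not in-place like yours, of course; the point is just to see what
operations the element type has to provide.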

Johan mentioned sorting a vector of [pointers to] structs by some
member of that struct; I also have some wrappers around std::sort (e.g.
below) that do this and would be interested to see how you would
implement it.

#include <algorithm>  // for std::sort

template <typename obj_t, typename data_member_ptr_t>
struct compare_data_member {
   data_member_ptr_t data_member_ptr;
   compare_data_member(data_member_ptr_t data_member_ptr_):
     data_member_ptr(data_member_ptr_) {}
   bool operator()(const obj_t& a, const obj_t& b) const {
     return a.*data_member_ptr < b.*data_member_ptr;
   }
};

template <typename iter_t, typename data_member_ptr_t>
void sort_by(iter_t begin, iter_t end, data_member_ptr_t data_member_ptr) {
   std::sort(begin, end,
             compare_data_member<typename iter_t::value_type,
                                 data_member_ptr_t>(data_member_ptr));
}

> It could also handle standard floats sorted as integers.

What exactly do you mean by this? As I understand it, if you treat an
IEEE float as an integer for sorting it will almost work, but the
negative range will be backwards; you need to do some sign-magnitude
vs. two's-complement fixup to allow for it. But I bet standard C++
doesn't guarantee anything along those lines.
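
The fixup I have in mind looks something like this (a sketch assuming
32-bit IEEE floats and a 32-bit unsigned int, neither of which the
standard guarantees; the function name is mine):

```cpp
#include <cstring>

// Map an IEEE-754 single-precision float to a 32-bit unsigned key
// whose unsigned ordering matches the float's numeric ordering.
// Negative floats (sign bit set) have all bits flipped, which both
// clears the sign bit and reverses their magnitude order; non-negative
// floats just get the sign bit set, moving them above all negatives.
unsigned int float_to_key(float f)
{
    unsigned int u;
    std::memcpy(&u, &f, sizeof u);   // type-pun without aliasing trouble
    if (u & 0x80000000u)
        return ~u;                   // negative: flip everything
    else
        return u | 0x80000000u;      // non-negative: set the sign bit
}
```

After that transform an unsigned radix (or comparison) sort on the
keys gives the numeric ordering, NaNs aside.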

> Functors supporting << and - could make it more general.
> Multi-dimensional sorting is also sped up by Spreadsort.
> I could also write a similar string_sort, that should also be significantly
> faster than std::sort. This would look much like postal sort, just using
> introsort after a certain number of iterations.
> Is this of interest?

Yes, that too. I'm particularly interested in
algorithmically-efficient sorting of strings that frequently have
common prefixes, e.g. filenames, URLs etc.; I think that would be a
good case for this algorithm, wouldn't it?


I've recently been re-coding something where a std::map<> was using more
memory than I wanted. I decided to instead put the data in a
std::vector, sort it, and then search using the std::lower/upper_bound
algorithms. This works well, but I did notice std::sort using a lot of
time in the profile. So an improved sort would be appreciated.
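
For reference, the pattern I mean is roughly this (a sketch with int
keys and values; the names are mine):

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Replace std::map<int,int> with a sorted std::vector of pairs plus
// binary search.  The comparator looks only at the key, so the same
// function works for sorting and for std::lower_bound.
bool key_less(const std::pair<int,int>& a, const std::pair<int,int>& b)
{
    return a.first < b.first;
}

// Returns a pointer to the value for 'key', or 0 if absent.
// Precondition: v has been sorted with key_less.
int* find_value(std::vector<std::pair<int,int> >& v, int key)
{
    std::vector<std::pair<int,int> >::iterator it =
        std::lower_bound(v.begin(), v.end(),
                         std::make_pair(key, 0), key_less);
    if (it != v.end() && it->first == key)
        return &it->second;
    return 0;
}
```

Build the vector, std::sort it once with key_less, and from then on
lookups are O(log n) with no per-node allocation overhead.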

This is off-topic but people might be interested: there are a couple of
remaining problems with this sorted vector implementation.

Firstly, the access pattern when you search is horrible; it starts by
reading one element in the middle, then one at the 1/4 or 3/4 point,
and so on; probably each of those is a page or cache line that will
never be accessed otherwise. This can be fixed by bit-reversed
addressing (well, at least in the case where the vector size is a
power of two, I think) and I'm considering a container-adaptor that
bit-reverses the index. It makes normal iteration slower, of course.

The other problem is that if my data is a key-value pair, I waste a
lot of cache space on the value parts of the elements whose keys are
all I access (i.e. the nodes near the root of the tree). I could fix
this by replacing the vector<pair<key,value>> with a
pair<vector<key>,vector<value>>, but can I somehow get the memory
layout of the latter with the syntax of the former?
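
The bit-reversal I mean is just this (helper name is mine; a
power-of-two size is assumed):

```cpp
#include <cstddef>

// Reverse the low 'bits' bits of i.  For a vector of size 2^bits,
// storing logical element i at physical index rev_index(i, bits)
// puts the binary-search probe sequence at the front of the storage:
// the midpoint (100...0) maps to 1, the quarter points (010...0 and
// 110...0) map to 2 and 3, and so on, so the "root" levels of the
// implicit tree share a few cache lines.
std::size_t rev_index(std::size_t i, unsigned bits)
{
    std::size_t r = 0;
    for (unsigned b = 0; b < bits; ++b) {
        r = (r << 1) | (i & 1);
        i >>= 1;
    }
    return r;
}
```

A container-adaptor would apply this in operator[] and in its
iterators; that per-access loop is exactly why normal iteration gets
slower.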

Regards, Phil

Boost list run by bdawes at, gregod at, cpdaniel at, john at