Boost logo

Boost :

From: Herve Bronnimann (hbr_at_[hidden])
Date: 2002-05-22 17:59:35


On Wed, May 22, 2002 at 05:05:02PM -0400, Herve Bronnimann wrote:
> Dietmar, Dirk, others: are there any plans to take care of the beta
> priority queues?
 
It seems one of the things that were holding Dietmar's library in the
beta state was radix_heap, which were not finished... perhaps this
would be better handled in a library of its own, to keep priority queues
comparison-based.

This brings me to a slightly different topic: There is a whole world of
algorithms and data structures that operate on types of fixed width,
which often replace the inevitable log n for comparison-based algorithms
and data structures, by log log n or even 1.

For instance, has anybody given much thought to counting_sort and radix_sort?
I've heard that some sophisticated implementations of the C++ STL
actually specialize std::sort() for the value types for which it is
possible... but I don't have an example myself.

Specifically, to pick an example, I'd like to write things like:

   typedef <any builtin integral type on one byte> counting_type;
   std::vector< counting_type > v;

and use either:

   counting_sort( v.begin(), v.end(), result ); // or
   inplace_counting_sort( v.begin(), v.end() ); // or
   indirect_counting_sort( v.begin(), v.end(), result );
   // result's value type being an iterator for indirect_sort

or even, if T is any type and project a function object whose result
type is a counting_type:

   counting_sort( v.begin(), v.end(), result, project ); // or
   inplace_counting_sort( v.begin(), v.end(), project ); // or
   indirect_counting_sort( v.begin(), v.end(), result, project );

Similarly, I'd like to have, for the following types:

   typedef <any builtin integral type on one byte> radix_type;
                                 // looked at as a counting_type[4], or
   typedef counting_type radix_type[N]; // or
   typedef boost::array<T,N> radix_type; // or
   typedef boost::tuple<counting_type,...> radix_type; // or
   typedef std::string radix_type; // or
   typedef std::pair<counting_type,counting_type> radix_type;

   std::vector< radix_type > v;

and simply call:

   radix_sort( v.begin(), v.end(), result ); // or
   inplace_radix_sort( v.begin(), v.end() ); // or
   indirect_radix_sort( v.begin(), v.end(), result );

I put a short file in the files section, directory linear_sort,
to illustrate that the approach could work. The results?
Sorting a million elements in a vector<TYPE> (g++ -O3) gets you:

TYPE bool
   std::sort 1.3s
   boost::inplace_counting_sort 0.12s
   boost::counting_sort 0.16s
TYPE char
   std::sort 0.19s
   boost::inplace_counting_sort 0.09s
   boost::counting_sort 0.07s
TYPE signed char
   std::sort 0.15s
   boost::inplace_counting_sort 0.08s
   boost::counting_sort 0.07s
TYPE unsigned char
   std::sort 0.19s
   boost::inplace_counting_sort 0.05s
   boost::counting_sort 0.04s
TYPE char[1]
   std::sort w/ lexico compare 1.16s
   boost::radix_sort 0.11s
TYPE char[3]
   std::sort w/ lexico compare 1.82s
   boost::radix_sort 0.47s
TYPE char[5]
   std::sort w/ lexico compare 2.75s
   boost::radix_sort 1.07s
TYPE unsigned char[1]
   std::sort w/ lexico compare 1.29s
   boost::radix_sort 0.07s
TYPE unsigned char[3]
   std::sort w/ lexico compare 1.72s
   boost::radix_sort 0.35s
TYPE unsigned char[5]
   std::sort w/ lexico compare 2.07s
   boost::radix_sort 0.82s

Any comments, feedback, improvements?

-- 
Hervé

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk