
Boost : 
From: Steven Ross (spreadsort_at_[hidden])
Date: 20080730 14:12:50
On Tue, Jul 29, 2008 at 8:39 PM, Steven Watanabe <watanabesj_at_[hidden]>wrote:
> Steven Ross wrote:
>
>> A functor is required because of the cast operation; I don't see any
>> generic
>> way to code a general cast operation.
>>
>
> How about using a traits template?
That's worth considering.
Can I use is_floating_point to mean that the entire value_type can be safely
cast to an integer?
I could cast a builtin float/double type to a builtin signed integer type
of the same size, but this wouldn't work for userdefined data types being
sorted on a float key (unless I cast the result of their userdefined >>
operator, which would be weird), so effectively it would be a specialcase
for builtin data types.
float_sort works by taking a signedmagnitude floating point number,
casting it to a 2's complement integer, and then reversing the negative
buckets to fix the order.
Since the usage that I can come up with for a functorfree version is so
specialized, I would prefer to skip it, but if the application is more
general than I imagine, then I'll consider it seriously.
I've thought about some prior suggestions for making
integer_sort/float_sort/string_sort apply to any forward_iterator, and came
to this conclusion:
Spreadsort (the underlying algorithm) is optimized on the assumption that
memory allocation is slow and best minimized, random accesses are somewhat
(but not greatly) slower than comparisons, and comparisons are slower than
iterating across the provided iterators.
What should be done if these assumptions are substantially off:
If memory allocation is fast and memory reasonably available, use an LSB
radix sort or the original noninplace variant of Spreadsort.
If random accesses are really slow, an entirely different algorithm is
called for, such as the file sorting version of Spreadsort (not open
source), or Mergesort.
If random accesses are fast, MAX_SPLITS can be eliminated (or made infinite)
and Spreadsort will run in theta(n) time on continuous integrable function
distributions.
If comparisons are extremely slow, use a pure radix sort, such as LSB radix
sort; Spreadsort isn't optimized for this situation; if they're just a
little slow, tune Spreadsort to prefer radix sorting.
If iteration is slow then MAX_SPLITS should be eliminated, and the
noninplace Spreadsort should probably be used.
Based upon these optimization assumptions, I don't think my inplace
Spreadsort algorithm is appropriate for nonrandomaccess sorting in a
situation where a programmer is using an appropriate data structure for
their problem; a different and probably specialized algorithm would be
called for.
The basic idea behind integer_sort/float_sort/string_sort is to provide
hybrid generic algorithms that are substantially faster for most common
sorting problems than std::sort. The assumption is that for more
specialized tasks, a different offtheshelf algorithm will probably be
better.
Do you recommend any particular reference on traits, such as Mr. Abraham's
book?
The standard states that, "For the algorithms to work correctly,
> comp has to induce a strict weak ordering on the values."
> This is not true for NaNs, therefore, IMO, you don't need to worry
> about them.
>
Thanks for checking, Steve, I won't worry about them then.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk