Subject: Re: [boost] sorting floats by casting to integer
From: Steven Ross (spreadsort_at_[hidden])
Date: 2009-07-06 13:20:04
On Mon, Jul 6, 2009 at 8:37 AM, Vladimir Prus <vladimir_at_[hidden]> wrote:
> > With 23-bit random data, all
> > the exponents are the same. With 32-bit random data, there will be 512
> > different values for the exponents,
>
> Are you sure? 32-bit IEEE format has 8 bits for the exponent.

+1 for the sign bit.
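That is, the top 9 bits of a float (sign plus exponent) take 2^9 = 512 distinct values. A minimal sketch of extracting that prefix, assuming IEEE 754 single-precision floats and a 32-bit unsigned int (the function name is just for illustration):

#include <cstdio>
#include <cstring>

// Return the 9-bit sign-and-exponent prefix of a 32-bit float.
// 8 exponent bits plus the sign bit means 2^9 = 512 possible prefixes.
unsigned int sign_exponent_prefix(float f)
{
    unsigned int bits;
    std::memcpy(&bits, &f, sizeof(bits)); // reinterpret the bits without aliasing tricks
    return bits >> 23;                    // drop the 23 mantissa bits
}

int main()
{
    std::printf("%u %u %u\n",
                sign_exponent_prefix(1.0f),   // 127 (the exponent bias)
                sign_exponent_prefix(-1.0f),  // 383 = 256 + 127 (sign bit set)
                sign_exponent_prefix(0.5f));  // 126
    return 0;
}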
> > splitting the identical-exponent
> > problems much smaller, to the point where they are close to the size of
> > the cache (in-cache sorting is extremely fast).
> > With realistic data, there will be multiple exponents, but not as many as
> > there are data elements. More likely, there will be a small range of
> > exponents, and for each of them, many elements with the same exponent.
> > Also, a denial-of-service attack could easily be made against a system
> > that sorts floats by just feeding it floats with the same exponent, but
> > different coefficients.
> > When I test with 26-bit random data, where there are 8 different exponent
> > values (a more realistic case), here are my results:
> > sorting floats as integers (my approach): 1.473s
> > std::sort: 21.511s
> > (I used randomgen 10 16 10000000 to generate the input data)
> >
> > So not 120X, but still a substantial difference.
> >
> > Also worth noting is that when the element count increases, the speed
> > difference increases, even on 32-bit random data.
> > randomgen 16 16 200000000:
>
> What real-world problems require sorting such huge arrays of floats?
Circuit layouts, when someone wants to sort them for various purposes. They
are gigabytes in size, so they take a long time to sort unless the
cast-to-integer approach is used.
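To be concrete about the cast: the usual order-preserving mapping from IEEE 754 floats to unsigned integers flips all the bits of negative values and sets the sign bit of non-negative ones, so that unsigned comparison of the keys matches float comparison. A rough sketch, assuming a 32-bit unsigned int; this is the standard trick rather than the exact library code, and the function names are just for illustration:

#include <algorithm>
#include <cstring>
#include <vector>

// Map an IEEE 754 single-precision float to an unsigned key whose unsigned
// ordering matches the float ordering (NaNs ignored).
inline unsigned int float_to_key(float f)
{
    unsigned int u;
    std::memcpy(&u, &f, sizeof(u));   // reinterpret the bits safely
    if (u & 0x80000000u)
        return ~u;                    // negative: flip everything
    return u | 0x80000000u;           // non-negative: set the sign bit
}

// Inverse transform, so sorted keys can be turned back into floats.
inline float key_to_float(unsigned int k)
{
    unsigned int u = (k & 0x80000000u) ? (k & 0x7FFFFFFFu) : ~k;
    float f;
    std::memcpy(&f, &u, sizeof(f));
    return f;
}

// Sort floats by sorting their integer keys.  std::sort is used here for
// brevity; once the data are plain integers, a radix-style integer sort can
// be applied instead, which is where the real speedup comes from.
void sort_floats(std::vector<float>& v)
{
    std::vector<unsigned int> keys(v.size());
    for (std::size_t i = 0; i < v.size(); ++i)
        keys[i] = float_to_key(v[i]);
    std::sort(keys.begin(), keys.end());
    for (std::size_t i = 0; i < v.size(); ++i)
        v[i] = key_to_float(keys[i]);
}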
Yes, I guess I was using denormalized numbers for the 23-bit case, but it's
interesting that there is such a huge speed difference, and I doubt it's
because they're denormalized; there is no reason for the sort to treat
denormalized numbers specially.
Notably, normalized numbers also show the same problem if their exponent
range is small relative to the number of elements; this can be achieved by
using a small number of different exponents, or simply a large number of
floats (as shown with my large test). What do you consider a reasonable
range? I can easily make a test that excludes denormalized numbers.
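For example, a generator along these lines would produce only normalized numbers drawn from a handful of exponent values (a rough sketch, not the randomgen tool; names are illustrative):

#include <cstdlib>
#include <cstring>
#include <vector>

// Fill a vector with normalized floats whose exponent field takes only
// 'num_exponents' consecutive values starting at the bias (127), so no
// denormals (exponent 0) and no infinities/NaNs (exponent 255) appear
// as long as num_exponents <= 128.
std::vector<float> limited_exponent_floats(std::size_t count,
                                           unsigned int num_exponents)
{
    std::vector<float> result(count);
    for (std::size_t i = 0; i < count; ++i)
    {
        unsigned int sign     = static_cast<unsigned int>(std::rand()) & 1u;
        unsigned int exponent = 127u + static_cast<unsigned int>(std::rand()) % num_exponents;
        unsigned int mantissa = (static_cast<unsigned int>(std::rand()) * 32768u
                                 + static_cast<unsigned int>(std::rand())) & 0x7FFFFFu;
        unsigned int bits = (sign << 31) | (exponent << 23) | mantissa;
        std::memcpy(&result[i], &bits, sizeof(float));
    }
    return result;
}

Calling limited_exponent_floats(10000000, 8) would give data roughly analogous to the 8-exponent case above.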
Even if the full exponent range is in use (which isn't realistic), wouldn't
it be better to use an algorithm that doesn't blow up when n gets much
larger than 10 million elements, and is still faster with a smaller number
of elements?