|
Boost : |
From: Rene Rivera (grafik666_at_[hidden])
Date: 2002-07-31 09:47:52
[2002-07-31] Gennaro Prota wrote:
>On Tue, 30 Jul 2002 23:05:58 -0500, Rene Rivera
><grafik666_at_[hidden]> wrote:
>
>
>>
>>In game programming bit counting has been reinvented many times and so far
>>the fastest, most portable way to do this doesn't require any tables and
is
>>a fixed set of operations per bit (or byte, or word, depending on the
>>granularity you think in).
>>
>>This counts the 1 bits on an 8 bit byte:
>>
>>byte x = <some_bit_value>;
>>byte a = (x & 0xAA) >> 1;
>>byte b = (x & 0x55);
>>x = a+b;
>>a = (x & 0xCC) >> 2;
>>b = (x & 0x33);
>>x = a+b;
>>a = (x & 0xF0) >> 4;
>>b = (x & 0x0F);
>>x = a+b;
>>
>>x == <the bit count>
>>
>>At 12 operations per byte, or 3/2 per bit.
>>
>
>
>Well, it can be surprising that *on many processors* it is *on
>average* (slightly) faster than the classical (but more general)
>
>int my_count(byte n) {
>
> int i = 0;
> int num = n;
> while (num) {
> num &= (num-1);
> ++i;
> }
> return i;
>
>}
>
>
>but, frankly, I don't see how it could be faster than a plain
>
>int my_count(byte n) {
> return table[n];
>}
As Anatoliy pointed out it's only faster when extended to more bits in
parallel. But I've found that it's faster in the 32 bit case and up.
The speedup is mostly due to keeping the computation in the ALU, as opposed
to doing memory access.
But if you are forced to do it a byte at a time, the table would be faster.
-- grafik - Don't Assume Anything
-- rrivera_at_[hidden] - grafik_at_[hidden]
-- 102708583_at_icq - Grafik666_at_AIM - Grafik_at_[hidden]
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk