|
Boost : |
From: Vesa Karvonen (vesa_karvonen_at_[hidden])
Date: 2003-01-21 04:56:31
Gennaro Prota:
>Vesa Karvonen:
> > enum {c = (1 << n) <= x};
The above should preferably read:
enum {c = (1ul << n) <= c};
>Wrong. The left-shift may give undefined behavior, because n can be
>greater than or equal to the number of value bits in an unsigned long
[...]
>The problem is that the C standard (and so C++) provides a xx_BIT macro
>only for the type char. You don't have e.g. INT_BIT or LONG_BIT, and you
>can't use CHAR_BIT*sizeof(int) and CHAR_BIT*sizeof(long) in their place,
>at least not portably, because those expressions give you the number of
>bits in the object representation, and that's not necessarily the same
>as the number of bits used to represent values.
In this case it is not necessary to know the exact number of value bits.
What is required of n is that it should be a power of two (not guaranteed by
the simple formula used) and (1ul << n) should be less than ULONG_MAX and
((1ul << n) << n) should be less than ULONG_MAX. A value that would fulfill
these requirements can be computed rather easily (and also quite
efficiently, as you don't have to start guessing from 0 or even 1). I'll
leave the construction of such a metafunction as an exercise for the reader.
>PS: Another problem with your implementation I'll leave it as an
>exercise ;-) This is a hint:
As you seem to have discovered, there is no such problem.
...
The problem with the naïve algorithm is that it requires as many recursions
as the position of the highest bit (+1) in the operand. For 32 bit numbers,
you may need 32 recursions. The less naïve algorithm only requires 5 =
log2(32) recursions. (Now, repeat the reasoning for 64 bit numbers.) This
can make a big difference in more complicated template metaprograms, because
the maximum template recursion depth is often limited. So, there are
situations in which the naïve algorithm would simply fail (to compile at
all), but the less naïve would work. This is the reason why the naïve
algorithm should not be used. It has almost nothing to do with the
(compile-)time it takes to compute the function.
...
Also, I must point out that timings from one-shot compilations are highly
unreliable. You should average the timings from multiple runs:
#!/bin/bash
function test {
g++ -I ../boost -DIMPL=$1 test.cpp
g++ -I ../boost -DIMPL=$1 test.cpp
g++ -I ../boost -DIMPL=$1 test.cpp
g++ -I ../boost -DIMPL=$1 test.cpp
g++ -I ../boost -DIMPL=$1 test.cpp
g++ -I ../boost -DIMPL=$1 test.cpp
g++ -I ../boost -DIMPL=$1 test.cpp
g++ -I ../boost -DIMPL=$1 test.cpp
g++ -I ../boost -DIMPL=$1 test.cpp
g++ -I ../boost -DIMPL=$1 test.cpp
}
time test 1
time test 2
time test 3
You should also be very careful about what you are measuring. In your test,
you repeated the same assertions twice. You should be aware that the second
time around, on most reasonable c++ compilers, none of those templates are
really recursed for the second time, because the fully constructed classes
have been cached.
Another problem is that the usage pattern employed in your test is not very
reasonable. static_log2<> is a metafunction that will be used in the
implementation of other metafunctions. Most of the time it is not used in
the top-level and the parameter given to it is dependent on some other
template parameter. Try using the following tester template (where LOG2 is a
macro expanding to the name of the tested static_log2 implementation):
template<int i>
struct tester {
BOOST_STATIC_CONSTANT(int,
value =
LOG2<i+(1<<0)>::value +
LOG2<i+(1<<1)>::value +
LOG2<i+(1<<2)>::value +
LOG2<i+(1<<3)>::value +
LOG2<i+(1<<4)>::value +
LOG2<i+(1<<5)>::value +
LOG2<i+(1<<6)>::value +
LOG2<i+(1<<7)>::value +
LOG2<i+(1<<8)>::value +
LOG2<i+(1<<9)>::value +
LOG2<i+(1<<10)>::value +
LOG2<i+(1<<11)>::value +
LOG2<i+(1<<12)>::value +
LOG2<i+(1<<13)>::value +
LOG2<i+(1<<14)>::value +
LOG2<i+(1<<15)>::value +
LOG2<i+(1<<16)>::value +
LOG2<i+(1<<17)>::value +
LOG2<i+(1<<18)>::value +
LOG2<i+(1<<19)>::value +
LOG2<i+(1<<20)>::value +
LOG2<i+(1<<21)>::value +
LOG2<i+(1<<22)>::value +
LOG2<i+(1<<23)>::value +
LOG2<i+(1<<24)>::value +
LOG2<i+(1<<25)>::value +
LOG2<i+(1<<26)>::value +
LOG2<i+(1<<27)>::value +
LOG2<i+(1<<28)>::value +
LOG2<i+(1<<29)>::value);
};
On g++ 2.96, by the way, the expression
tester<1>::value + tester<2>::value,
fails to compile on default settings with the naïve algorithm, because it
exceeds the maximum template recursion depth. The other two algorithms work
correctly.
-Vesa Karvonen
_________________________________________________________________
Help STOP SPAM: Try the new MSN 8 and get 2 months FREE*
http://join.msn.com/?page=features/junkmail
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk