|
Boost : |
From: Gennaro Prota (gennaro_prota_at_[hidden])
Date: 2003-01-21 11:01:26
Ok. We've really put a lot of irons in the fire now, so please let me
explain.
First of all I want to point out the purpose of my first post: though
it is specifically addressed static_log2 it is actually a reaction to
a general trend of boost, i.e. that of having complicated solutions
for simple problems. I don't say this is the case for all libraries
(also because I do not know every single boost library, and I didn't
even know the Integer library until Dave indicated it to me a few days
ago) but in many cases boost has implementations enormously more
complicated than necessary.
This state of affairs makes me often revert to my own code for things
that are already into boost. Speaking of static_log2 I really didn't
understand why we should have such a complicated implementation, and
that's why I made a provocation with the stupid snippet I posted
initially. The point was that even such a stupid implementation is not
only simpler but competes in performances with the current one. I'm
well aware that mine is "bad", but the point was exactly that I don't
see how the current one is better (also considering in how much
trouble it goes due to its use of pts, etc..etc..). Also I want to
make clear that this is not an attack to Daryle or to anyone. It may
well be that there's a good reason to have the implementation we have
and that *I* am just missing it. If not, however, I think we should
change it. I see often that people who are very expert in a given
field tend to find solutions more complicated than inexpert people,
even when the implied complexity buys nothing. To me instead
simplicity is important as much as other aspects, and should be
abandoned _only_ when it comes, for instance, at correctness' or
completeness' cost.
I'm under the impression that the current implementation has been
chosen a little ingenuously, without thinking how overkilling it is
for such a simple task. Of course, and I repeat myself, my
"implementation" is the shame of any C++ programmer, but it was just a
provocation. Now here are the results of your shell script on my
machine:
boost::static_log2.hpp
real 0m49.182s
user 0m23.610s
sys 0m22.621s
vesa impl (starting from n = 16):
real 0m33.693s
user 0m10.280s
sys 0m22.140s
"my" trivial implementation:
real 0m31.667s
user 0m8.040s
sys 0m22.120s
Of course there are the result on a particular compiler on a
particular platform, but they make us think.
Now I'll reply to the other points, which are, to say, a digression
from the main topic, because they simply address some aspects of the
specific implementation you posted.
On Tue, 21 Jan 2003 11:56:31 +0200, "Vesa Karvonen"
<vesa_karvonen_at_[hidden]> wrote:
>Gennaro Prota:
>>Vesa Karvonen:
>> > enum {c = (1 << n) <= x};
>
>The above should preferably read:
>
> enum {c = (1ul << n) <= c};
I think you are missing something here, maybe the important fact that
an integer type (unsigned long included) may have padding bits or the
rules about shift operators. In this case you can still have undefined
behavior on an implementation where unsigned long occupies 64 bits but
32 of them are padding (i.e. when unsigned long has a precision of
32). I do know that it is unlikely but this is boost. Please read
paragraph 5.8/1 of the standard:
"The behavior is undefined if the right operand is negative,
or greater than or equal to the length in bits of the promoted
left operand."
The meaning of "length in bits" isn't defined in the standard (hence
the DR I hinted at) but, believe me, it is the number of bits that
contribute to the value, not the number of bits in the object
representation.
>>Wrong. The left-shift may give undefined behavior, because n can be
>>greater than or equal to the number of value bits in an unsigned long
>[...]
>>The problem is that the C standard (and so C++) provides a xx_BIT macro
>>only for the type char. You don't have e.g. INT_BIT or LONG_BIT, and you
>>can't use CHAR_BIT*sizeof(int) and CHAR_BIT*sizeof(long) in their place,
>>at least not portably, because those expressions give you the number of
>>bits in the object representation, and that's not necessarily the same
>>as the number of bits used to represent values.
>
>In this case it is not necessary to know the exact number of value bits.
>What is required of n is that it should be a power of two (not guaranteed by
>the simple formula used) and (1ul << n) should be less than ULONG_MAX and
>((1ul << n) << n) should be less than ULONG_MAX. A value that would fulfill
>these requirements can be computed rather easily (and also quite
>efficiently, as you don't have to start guessing from 0 or even 1). I'll
>leave the construction of such a metafunction as an exercise for the reader.
>
>>PS: Another problem with your implementation I'll leave it as an
>>exercise ;-) This is a hint:
>
>As you seem to have discovered, there is no such problem.
Proof please? There's such a problem When the width w (i.e. the number
of value bits) of unsigned long is odd, there are no padding bits, and
n is of the form (2**w)-1: since you start from n/2 == (w-1)/2 then
you reach n == 0 *before* reaching x == 0. I know no C++
implementation where width is odd but, again, this is *boost*. We must
not make unportable assumptions.
>[...]
>Also, I must point out that timings from one-shot compilations are highly
>unreliable. You should average the timings from multiple runs:
Done (but just for the sake of discussion, as I said)
>[...]
>You should also be very careful about what you are measuring. In your test,
>you repeated the same assertions twice.
I made that intentionally, because I thought someone would have
objected that, in real code, it's unlikely that so many different
instantiations are made and/or it's unlikely that each of them is made
once.
>You should be aware that the second
>time around, on most reasonable c++ compilers, none of those templates are
>really recursed for the second time, because the fully constructed classes
>have been cached.
I know. Either because I've experimented myself and because I've read
Dave's paper in the boost files ;-)
>Another problem is that the usage pattern employed in your test is not very
>reasonable. static_log2<> is a metafunction that will be used in the
>implementation of other metafunctions. Most of the time it is not used in
>the top-level and the parameter given to it is dependent on some other
>template parameter. Try using the following tester template (where LOG2 is a
>macro expanding to the name of the tested static_log2 implementation):
>
>template<int i>
>struct tester {
> BOOST_STATIC_CONSTANT(int,
> value =
> LOG2<i+(1<<0)>::value +
> LOG2<i+(1<<1)>::value +
> LOG2<i+(1<<2)>::value +
> LOG2<i+(1<<3)>::value +
> LOG2<i+(1<<4)>::value +
> LOG2<i+(1<<5)>::value +
> LOG2<i+(1<<6)>::value +
> LOG2<i+(1<<7)>::value +
> LOG2<i+(1<<8)>::value +
> LOG2<i+(1<<9)>::value +
> LOG2<i+(1<<10)>::value +
> LOG2<i+(1<<11)>::value +
> LOG2<i+(1<<12)>::value +
> LOG2<i+(1<<13)>::value +
> LOG2<i+(1<<14)>::value +
> LOG2<i+(1<<15)>::value +
> LOG2<i+(1<<16)>::value +
> LOG2<i+(1<<17)>::value +
> LOG2<i+(1<<18)>::value +
> LOG2<i+(1<<19)>::value +
> LOG2<i+(1<<20)>::value +
> LOG2<i+(1<<21)>::value +
> LOG2<i+(1<<22)>::value +
> LOG2<i+(1<<23)>::value +
> LOG2<i+(1<<24)>::value +
> LOG2<i+(1<<25)>::value +
> LOG2<i+(1<<26)>::value +
> LOG2<i+(1<<27)>::value +
> LOG2<i+(1<<28)>::value +
> LOG2<i+(1<<29)>::value);
>};
>
>On g++ 2.96, by the way, the expression
>
> tester<1>::value + tester<2>::value,
>
>fails to compile on default settings with the naïve algorithm, because it
>exceeds the maximum template recursion depth.
I know. In some tests that I did with g++ 2.95 I tried either bumping
the maximum with -ftemplate-depth and adding specializations at
"regular intervals", i.e.
template <> struct log2<16> ..
template <> struct log2<256> ...
template <> struct log2<4096> ...
In both cases the timings were practically the same (g++ 2.95 under
Cygwin)
Genny.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk