From: Beman Dawes (bdawes_at_[hidden])
Date: 2001-09-07 10:17:48


At 09:00 AM 9/7/2001, williamkempf_at_[hidden] wrote:

>> I'll be interested in your research. MS people that know the details
>> claim that a critical section is incredibly optimized and should beat
>> a mutex in any reasonable scenario.
>
>It wasn't my research. It was research that Alexander Terekhov
>(spelled from memory, sorry if I butchered it) found on the net. At
>some point I'll find the time to find the link and post it here for
>everyone to check out. The findings surprised me greatly, because
>I've read the MS claims and seen several instances where the
>performance boost of critical sections was easily observable.

We need to be VERY careful about interpreting Win32 timings. My personal
observations:

* Timings on Win 9X often differ a lot from the same tests on Win
NT/2K. Presumably we should give much more weight to NT/2K timings as it
is supposed to be the surviving code base.

* The compiler sometimes matters. Why this should be so for native Win32
API calls is not obvious to me, but at least in some tests a few years ago
it has mattered a great deal for some calls.

And of course for timings on any platform:

* The runtime library sometimes matters. For Boost.Threads this isn't an
issue (since presumably the thread-safe versions of the runtime libraries
are always used) but it is something to keep in mind for general timings.

* For many compilers and libraries, there is a vast difference between
Release and Debug build timings. That seems so obvious as to not be worth
stating, but there may be less obvious issues. For example, the Metrowerks
compiler in release mode doesn't cause NDEBUG to be defined! If you don't
realize this, or forget to define NDEBUG yourself, performance can really
suffer.

* The size of the test needs to be typical, or better yet repeated for
differing sizes. Bjarne Stroustrup says he distrusts timings that don't
show a knee in the performance curve for larger tests; it may mean all data
was in the cache.

* Better timings may not matter. If a given function is always used in
real programs in conjunction with other code that takes 100 times longer to
execute, how much faster you make that function doesn't matter at all.

* The number of CPUs in relation to the number of threads involved may
totally alter timings. At the very least I'd hope any timing tests used to
make important decisions were run on both single-processor and
dual-processor machines, and with differing numbers of threads contending
for a shared resource.
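
As a rough illustration of the points above, a timing harness might look
something like the sketch below. It is only a sketch under stated
assumptions: it uses standard C++ std::thread and std::mutex as stand-ins
rather than Boost.Threads or the Win32 primitives under discussion, and the
names built_with_ndebug, ContentionResult, and time_contended_lock are made
up for the example. It reports whether NDEBUG was defined for the build and
lets the same measurement be repeated for different thread counts and
iteration counts.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: true when this translation unit was compiled with
// NDEBUG defined, i.e. assert() is compiled out. Worth printing next to
// every timing so Debug-style builds are not mistaken for Release builds.
inline bool built_with_ndebug() {
#ifdef NDEBUG
    return true;
#else
    return false;
#endif
}

// Hypothetical result type for one run of the harness.
struct ContentionResult {
    long long counter;  // final value of the shared counter
    double seconds;     // wall-clock time, including thread start-up and join
};

// Hypothetical harness: `threads` threads each take a shared mutex
// `iterations` times and increment a shared counter while holding it.
inline ContentionResult time_contended_lock(unsigned threads, long iterations) {
    assert(threads > 0);

    std::mutex m;
    long long counter = 0;
    std::vector<std::thread> pool;
    pool.reserve(threads);

    const auto start = std::chrono::steady_clock::now();
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&m, &counter, iterations] {
            for (long i = 0; i < iterations; ++i) {
                std::lock_guard<std::mutex> guard(m);
                ++counter;
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }
    const auto stop = std::chrono::steady_clock::now();

    return {counter, std::chrono::duration<double>(stop - start).count()};
}
```

Calling time_contended_lock(n, iterations) for, say, n = 1, 2, 4, 8 and for
several iteration counts, on both a single-processor and a dual-processor
machine, would give the kind of table that is worth comparing. The counter
field is there only as a sanity check that every increment happened.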

Win 2K question: Is there an easy way on a multi-processor machine to
restrict all threads in a program to the same CPU?

--Beman

