From: Andy Glew (glew_at_[hidden])
Date: 1999-07-01 20:48:25


As somebody who has worked on the architectural definition of the Intel
timestamp counter and RDTSC instructions, mind if I respond/add to Reid's
statement about having a timer library invoke the hardware directly?

(0) Using RDTSC as an implementation layer alternative to system calls
is probably overall a good thing.

(0.1) Further, on compilers that support it, using inline assembly code for
this is even more of a good thing.

(1) However, there probably needs to be an implementation dependent
configuration parameter selecting which to use - syscall or RDTSC - for
some of the following reasons. Maybe
        #ifdef BOOST_TIMER_RDTSC
?
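As a rough sketch of what such a switch might look like: the macro name BOOST_TIMER_RDTSC is the one suggested above, but read_ticks() and TIMER_SKETCH_USES_RDTSC are hypothetical names made up for illustration, the fallback uses std::chrono::steady_clock as a stand-in for "the system call", and the __rdtsc() compiler intrinsic (GCC/Clang on x86 only) stands in for hand-written inline assembly.

```cpp
#include <chrono>
#include <cstdint>

// Use RDTSC only when explicitly requested AND the toolchain/target is known
// to provide the intrinsic; otherwise fall back to the portable clock.
#if defined(BOOST_TIMER_RDTSC) && \
    (defined(__i386__) || defined(__x86_64__)) && \
    (defined(__GNUC__) || defined(__clang__))
#include <x86intrin.h>
#define TIMER_SKETCH_USES_RDTSC 1
#else
#define TIMER_SKETCH_USES_RDTSC 0
#endif

// Hypothetical helper: returns a raw tick count from whichever source was
// selected at build time. The unit differs between the two sources
// (cycles vs. steady_clock ticks), so callers must not mix them.
inline std::uint64_t read_ticks() {
#if TIMER_SKETCH_USES_RDTSC
    return __rdtsc();
#else
    return static_cast<std::uint64_t>(
        std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}
```

Building without -DBOOST_TIMER_RDTSC gives the portable path, which is the "way out" the caveats below argue for.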

(1.1) Architecturally, RDTSC is defined to return a timestamp, not a time value.
On IBM 360s there were implementations of the same function that returned
something like seconds or milliseconds of real time, and just incremented a counter
for the low-order bits, to return a unique, ordered value. Intel would like to have
the option of doing the same.
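To make the "timestamp, not time value" distinction concrete, here is a minimal sketch of that kind of scheme. It is only an illustration of the idea, not the actual IBM or Intel design; the class name, the 16-bit counter width, and the millisecond input are all assumptions.

```cpp
#include <cstdint>

// Hypothetical illustration: coarse real time in the high-order bits, with
// the low-order bits used only to keep successive values unique and ordered.
class OrderedTimestamp {
public:
    static constexpr unsigned kCounterBits = 16;  // assumed width

    // coarse_ms: real time in milliseconds, supplied by some other clock.
    std::uint64_t next(std::uint64_t coarse_ms) {
        std::uint64_t candidate = coarse_ms << kCounterBits;
        // If the coarse clock has not advanced, just bump the previous value.
        if (candidate <= last_) {
            candidate = last_ + 1;
        }
        last_ = candidate;
        return candidate;
    }

private:
    std::uint64_t last_ = 0;
};
```

Differences between two such values are ordered but say very little about elapsed time at fine granularity, which is exactly why code that treats RDTSC as a cycle-accurate timer depends on more than the architecture promises.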

I wrote some of the language that tried to explain this, in the internal manuals
for P6. However, after it got watered down for the public, ... well, it is basically
a lost cause. The real problem is that there is a real need for a good reliable
low overhead timer, that need is not satisfied anywhere else, so RDTSC is used
de facto by thousands of programmers.

Except...

(1.2) RDTSC is denominated in cycles. Now, as you may know, Intel has announced
Geyserville technology that changes the clock rate of a chip depending on whether
it is plugged in, and how fast you want the batteries to go down. So, a clock cycle
may sometimes be 2ns, and sometimes 4 or 8ns. Now, *you* probably aren't doing
performance measurements in such an environment, but some people probably
want to.
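The arithmetic is trivial, but it shows the problem. In the sketch below, cycles_to_ns() is a hypothetical helper, and it assumes the clock period stayed constant over the whole measured interval, which is precisely the assumption a variable clock rate breaks.

```cpp
#include <cstdint>

// Converts a cycle count to nanoseconds, given the clock period in ns.
// Only valid if the period did not change during the interval.
constexpr std::uint64_t cycles_to_ns(std::uint64_t cycles,
                                     std::uint64_t period_ns) {
    return cycles * period_ns;
}
```

The same RDTSC delta of 1,000,000 cycles is 2 ms of real time at 2ns per cycle and 8 ms at 8ns per cycle, so a library that reports seconds from cycles needs to know, or measure, the rate that actually applied.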

Further, some non-Intel x86 processors (TI?) actually dynamically varied the clock rate
while running.

It took some doing, but we ensured that the TSC timestamp counter does not get powered
down by the P6 core, even when it goes into a power saving mode. But while I think that
is a good thing, I am fairly confident that one day we will have a power savings mode
where TSC does change rate, or even goes away.

(1.3) Further, RDTSC can be disabled by the OS. It turns out that any high resolution
timer is a security hole, for high level (higher than NT and UNIX) secure systems.
Covert timing channels. The best way to fix that is to mask off low order bits in
the hardware, but so far all we have done is just allow the OS to disable it if security
is really a concern.
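For what "masking off low order bits" would mean to a user of the counter, here is a software illustration. The hardware does not do this today, as noted above; coarsen() is a hypothetical helper and the number of bits masked is an arbitrary parameter.

```cpp
#include <cstdint>

// Clears the low_bits least significant bits of a timestamp, reducing its
// resolution by a factor of 2^low_bits.
constexpr std::uint64_t coarsen(std::uint64_t ts, unsigned low_bits) {
    return low_bits >= 64
        ? 0
        : ts & ~((std::uint64_t{1} << low_bits) - 1);
}
```

With 8 bits masked, any two readings less than 256 cycles apart within the same 256-cycle window become indistinguishable, which is what closes the fine-grained timing channel and also what a timer library would have to tolerate.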

(1.4) RDTSC is not guaranteed to be consistent between CPUs in a multiprocessor
system. It actually happens that it is, to within a small factor - strictly speaking a good
timing utility would estimate that factor so that values could be corrected - but, again, that's
not guaranteed.

Similarly, nobody here is likely to be using such a system, but systems that allow CPUs
to be hot-inserted and removed likewise do not guarantee RDTSC synchronization,
unless the OS has taken special care.

(1.5) Finally, I have worked on OSes where the TSC was context switched with the
process - providing fine grain process time, as opposed to real time.

So just beware.

Basically, all of these caveats amount to saying "Yes, RDTSC does what you want now,
and probably will do so in the future just because the demand is high, even though such
behaviour is not guaranteed by the original or current definitions."

But please provide a way of deselecting RDTSC if desired.

>Just a minor cavil from someone who does most of his programming on Wintel
>platforms: the C and system timer functions aren't very reliable, because
>they're at the mercy of the time-share schemes (which are different in every
>version of Windows), and don't return very reliable numbers. On Pentium and
>better Intel processors, though, the Time-stamp instructions are available,
>which directly read an _Int64 counter in the CPU that counts cycles since
>start-up or reset. The only inaccuracies in using these instructions are
>those associated with the function-call overhead, assuming you've handled
>thread-locking and such correctly. Admittedly, this only works on a subset
>of one brand of processor, but realistically it accounts for a large chunk
>of market share. I'd rather see such a version of the implementation layer
>for Pentium systems than one that uses the standard system calls.
