From: Carl Daniel (cpdaniel_at_[hidden])
Date: 2001-10-31 17:44:42
From: "Beman Dawes" <bdawes_at_[hidden]>
To: <boost_at_[hidden]>; "boost" <boost_at_[hidden]>
Sent: Wednesday, October 31, 2001 2:07 PM
Subject: Re: [boost] boost.timer
> At 10:54 AM 10/31/2001, Toon Knapen wrote:
[snip]
>
> I'm inclined to think the solution may not be to try to fix timer, but
> to specify a new timing library with much more precise semantics.
>
In a recent past life, I implemented a web-server-based image processing component. We were very interested in where
time was being spent, but it's not an environment that lends itself well to standard profilers. Sampling profilers like
Intel's VTune can be used, but in my application they wouldn't give a very useful result unless I could cook up hundreds
(or thousands) of suitable requests to the web server.
Instead, I came up with an invasive profiling library which proved to be very useful. It consisted of a few basic
components, but is easier to explain by example than exposition:
In a "main"-like function, you'd write:

    int main_like()
    {
        CONTEXT_TIMER_ROOT("Descriptive name of timer");
        // code which calls arbitrarily complex code.
    }

and in any function where timing was of interest, you'd write:

    void some_function_that_takes_time()
    {
        CONTEXT_TIMER("description of timer");
        // usually I just used the function name.
        {
            CONTEXT_TIMER("block within function");
            // handy with multi-part functions (that really should be split up, but that's another story).
        }
    }

Behind the scenes, CONTEXT_TIMER_ROOT created an object and stored a pointer to it in thread-local storage. Each
CONTEXT_TIMER created a local object that measured its own lifetime and pushed that information onto the tail of a
list (really a tree) in the thread-specific root timer. The root was thread-specific because my component could be
entered by many web server threads simultaneously to service several requests.
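
To make the mechanism above concrete, here is a minimal sketch of how such macros could be built today. It is not the original code: only the two macro names come from the description above, while TimingNode, RootTimer, ScopedTimer, g_current and the CT_CONCAT helpers are hypothetical names, and C++11 thread_local plus std::chrono::steady_clock stand in for the platform-specific thread-local storage and clocks.

```cpp
#include <cassert>
#include <chrono>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using Clock = std::chrono::steady_clock;

// One node per timed scope; children are the timers opened inside it.
struct TimingNode {
    std::string name;
    Clock::duration elapsed{};
    TimingNode* parent = nullptr;
    std::vector<std::unique_ptr<TimingNode>> children;
};

// Innermost open timer on this thread (assumption: one root per thread).
thread_local TimingNode* g_current = nullptr;

class RootTimer {
public:
    explicit RootTimer(std::string name) : start_(Clock::now()) {
        assert(g_current == nullptr && "a root timer is already active");
        root_.name = std::move(name);
        g_current = &root_;
    }
    ~RootTimer() {
        root_.elapsed = Clock::now() - start_;
        g_current = nullptr;
        // A real version would hand the finished tree to a logger here.
    }
    RootTimer(const RootTimer&) = delete;
    RootTimer& operator=(const RootTimer&) = delete;
    const TimingNode& tree() const { return root_; }

private:
    TimingNode root_;
    Clock::time_point start_;
};

class ScopedTimer {
public:
    explicit ScopedTimer(const char* name) : start_(Clock::now()) {
        if (g_current == nullptr) return;  // no root on this thread: no-op
        std::unique_ptr<TimingNode> node(new TimingNode);
        node->name = name;
        node->parent = g_current;
        node_ = node.get();
        g_current->children.push_back(std::move(node));
        g_current = node_;  // nested timers now attach below this node
    }
    ~ScopedTimer() {
        if (node_ == nullptr) return;
        node_->elapsed = Clock::now() - start_;
        g_current = node_->parent;  // pop back to the enclosing timer
    }
    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    TimingNode* node_ = nullptr;
    Clock::time_point start_;
};

// __LINE__ gives each timer object a unique variable name within a scope.
#define CT_CONCAT2(a, b) a##b
#define CT_CONCAT(a, b) CT_CONCAT2(a, b)
#define CONTEXT_TIMER_ROOT(name) RootTimer CT_CONCAT(ct_root_, __LINE__)(name)
#define CONTEXT_TIMER(name) ScopedTimer CT_CONCAT(ct_scope_, __LINE__)(name)
```

In this sketch a CONTEXT_TIMER used on a thread without an active root silently does nothing, which is one possible design choice; the original library may have behaved differently.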
When the root timer went out of scope, it pushed the tree of subroutine times onto a queue, which was read by a
low-priority background thread that wrote the time trees out to a log file. Unfortunately, I can't find a sample
of the log produced or I'd post one for all to see.
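
The hand-off to the background thread could look roughly like the sketch below. It is an illustration under assumptions, not the original code: ReportLogger, submit and demo_log are hypothetical names, each finished time tree is reduced to an already formatted string, and the thread runs at default priority because standard C++ has no portable way to lower it.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>
#include <utility>

// Request threads call submit(); a single worker thread does the slow I/O.
class ReportLogger {
public:
    explicit ReportLogger(std::ostream& out)
        : out_(out), worker_([this] { run(); }) {}

    ~ReportLogger() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        ready_.notify_one();
        worker_.join();  // the worker drains the queue before exiting
    }

    // Would be called from the root timer's destructor in a full version.
    void submit(std::string report) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(report));
        }
        ready_.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            ready_.wait(lock, [this] { return done_ || !queue_.empty(); });
            while (!queue_.empty()) {
                std::string report = std::move(queue_.front());
                queue_.pop_front();
                lock.unlock();  // write without blocking the submitters
                out_ << report << '\n';
                lock.lock();
            }
            if (done_) return;
        }
    }

    std::ostream& out_;
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<std::string> queue_;
    bool done_ = false;
    std::thread worker_;  // declared last so it starts after the other members
};

// Usage: submit two reports, destroy the logger, then read the log back.
std::string demo_log() {
    std::ostringstream log;
    {
        ReportLogger logger(log);
        logger.submit("request A");
        logger.submit("request B");
    }
    return log.str();
}
```

The point of the queue is that the request threads only pay for a lock and a push; formatting and file I/O happen off the request path.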
All told, this took about 700 lines of code for a Windows/Linux portable version. Under Windows, the timer objects used
QueryPerformanceCounter, which on many systems has CPU-clock cycle accuracy. In all cases it has at least microsecond
accuracy. Under Linux, the timer used gettimeofday, which appeared to have microsecond resolution (although admittedly
we did the vast majority of our timing under Windows).
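
For reference, a thin wrapper over the two clock sources named above might look like this sketch. The function name now_microseconds is hypothetical, and the split multiplication on the Windows side is just one way to avoid overflowing 64 bits when converting ticks to microseconds.

```cpp
#include <cassert>
#include <cstdint>
#ifdef _WIN32
#  include <windows.h>
#else
#  include <sys/time.h>
#endif

// Current time in microseconds since an arbitrary, platform-specific origin.
std::int64_t now_microseconds() {
#ifdef _WIN32
    LARGE_INTEGER freq, count;
    QueryPerformanceFrequency(&freq);  // counter ticks per second
    QueryPerformanceCounter(&count);
    // Convert whole seconds and the remainder separately to avoid overflow.
    return (count.QuadPart / freq.QuadPart) * 1000000
         + (count.QuadPart % freq.QuadPart) * 1000000 / freq.QuadPart;
#else
    timeval tv;
    gettimeofday(&tv, nullptr);  // wall-clock time, microsecond fields
    return static_cast<std::int64_t>(tv.tv_sec) * 1000000 + tv.tv_usec;
#endif
}
```

Note that gettimeofday reports wall-clock time, so an interval measured with it can be distorted if the system clock is adjusted mid-measurement.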
If there's interest, I could probably clean up & post an implementation of this profiling code. I don't have the time
to do any maintenance/upgrades, though, so it'd be up to someone else to adapt it to whatever is needed today.
It's not the fanciest solution, but it proved very useful in my application.
-cd