From: Josh Napoli (jnapoli_at_[hidden])
Date: 2006-01-06 10:03:06
If you need more accuracy on modern x86 CPUs, it is relatively easy to
make an implementation of timer that uses the CPU's time-stamp counter
(read with the RDTSC instruction).
Here is how to read the counter and get its period on Windows:
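period()
{
    DWORD BufSize = sizeof(DWORD); // size of the output buffer, in bytes
    DWORD dwMHz = 0;
    HKEY hKey;
    // open the key where the proc speed is hidden:
    long lError = RegOpenKeyEx(HKEY_LOCAL_MACHINE,
        TEXT("HARDWARE\\DESCRIPTION\\System\\CentralProcessor\\0"),
        0,
        KEY_READ,
        &hKey);
    if (lError == ERROR_SUCCESS)
    {
        // query the key:
        RegQueryValueEx(hKey, TEXT("~MHz"), NULL, NULL,
            (LPBYTE) &dwMHz, &BufSize);
        RegCloseKey(hKey);
    }
    // seconds per tick; multiplying by 1e6 in floating point avoids
    // DWORD overflow above ~4294 MHz; 0 means the speed could not be read
    value_ = dwMHz ? 1. / (dwMHz * 1e6) : 0.;
}
};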
__forceinline
static ULARGE_INTEGER readTimer()
{
    ULARGE_INTEGER retval;
#ifdef _WIN64
    // no inline asm in 64-bit MSVC; use the intrinsic from <intrin.h>.
    // __rdtsc() returns a 64-bit integer, so store it in QuadPart.
    retval.QuadPart = __rdtsc();
#else
    unsigned int LowPart;
    unsigned int HighPart;
    __asm{
        xor eax, eax
        xor ebx, ebx
        xor ecx, ecx
        xor edx, edx
        _emit 0x0f // RDTSC
        _emit 0x31
        mov [LowPart], eax  // low 32 bits of the tick count
        mov [HighPart], edx // high 32 bits of the tick count
    }
    retval.LowPart = LowPart;
    retval.HighPart = HighPart;
#endif
    return retval;
}
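To turn two counter readings into elapsed time, multiply the tick difference by the period. Here is a minimal, portable sketch of that arithmetic. ticks_to_seconds is a hypothetical helper, not part of the code above, and it assumes the counter ticks at a constant rate equal to the ~MHz value, which may not hold on CPUs that change frequency.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: convert two tick-counter readings to elapsed seconds.
// Assumption: the counter advances at a constant `mhz` million ticks per second.
inline double ticks_to_seconds(std::uint64_t start, std::uint64_t stop, unsigned mhz)
{
    assert(mhz != 0);
    const double period = 1.0 / (mhz * 1e6); // seconds per tick
    return static_cast<double>(stop - start) * period;
}
```

With readTimer() above, start and stop would be the QuadPart members of two readings.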
-----Original Message-----
From: boost-bounces_at_[hidden]
[mailto:boost-bounces_at_[hidden]] On Behalf Of Howard Hinnant
Sent: Tuesday, January 03, 2006 10:19 PM
To: boost_at_[hidden]
Subject: Re: [boost] Performance test and memory usage
On Jan 3, 2006, at 5:12 PM, axter wrote:
> What boost libraries are available for running code performance test
There is timer:
http://www.boost.org/libs/timer/timer.htm
which is very handy, but in my experience not a complete solution for
performance testing on modern machines because of the variability of
the results. I've had good experience with the following timer driver:
#include <algorithm>
#include <numeric>
#include <vector>

template <class F>
float
test(F f)
{
    std::vector<float> t;
    int i;
    for (i = 0; i < 10; ++i)
        t.push_back(f());
    double total_time = std::accumulate(t.begin(), t.end(), 0.0);
    while (i < 1000 && total_time < 1.0)
    {
        t.push_back(f());
        total_time += t.back();
        ++i;
    }
    std::sort(t.begin(), t.end());
    t.resize(t.size() * 9 / 10);
    return (float)(std::accumulate(t.begin(), t.end(), 0.0) / t.size());
}
Which can be used like:
#include <ctime>
#include <iostream>

float
time_something()
{
    // whatever set up needs to happen
    clock_t t = clock(); // or boost::timer, or whatever
    {
        // time this
    } // scope to time any destructors as appropriate
    t = clock() - t; // or whatever
    // whatever tear down needs to happen
    // return seconds (or whatever) as a float
    return (float)(t/(double)CLOCKS_PER_SEC);
}
int main()
{
    std::cout << test(time_something) << '\n';
}
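clock() has fairly coarse resolution, so when the timed work is very short, one common option is to repeat it inside the timed scope and divide by the repetition count. A rough sketch of that variant follows. time_repeated and its reps parameter are hypothetical names, not part of the driver above.

```cpp
#include <cassert>
#include <ctime>

// Hypothetical variant of time_something: runs `work` `reps` times inside
// one timed region and returns the average seconds per call.
template <class Work>
float time_repeated(Work work, int reps)
{
    assert(reps > 0);
    std::clock_t t = std::clock();
    for (int r = 0; r < reps; ++r)
        work();
    t = std::clock() - t;
    return (float)(t / (double)CLOCKS_PER_SEC / reps);
}
```

The trade-off is that a single interruption now affects the whole batch, so reps is best kept just large enough to exceed the timer's resolution.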
The basic idea of "test" is to call the function to be timed many
times (say up to 1000), throw away the slowest 10% of those times,
and average the rest. Why? <shrug> shoot-from-the-hip "theory" and
experience that its return is fairly consistent to 2, maybe 3
digits. The test function has a "time out" feature where it quits if
the accumulated measured time grows beyond 1 second (which may not be
appropriate for all tests). But it is easy to get bored when you
unexpectedly find yourself waiting 30 minutes for a timing result, so
that's why it's there. Otoh, I put in a minimum repetition of at
least 10, no matter how long the measured time is, so you get at
least some accuracy (tweak knobs as you like). Note that the
accumulation/averaging happens in double, even though the data and
answer are float (just paranoia really). Weakness: If you're timing
something that is quicker than the minimum resolution of your timer,
this doesn't work. But otherwise, this is better than the
traditional loop inside the timer as it throws away those results
that happen to get interrupted by your email checker running. :-)
Anyway, if it is useful to you, use it. If not, no sweat.
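
To see why dropping the slowest 10% helps, here is the trimming step on its own, as a small sketch. trimmed_mean is a hypothetical helper that mirrors the sort/resize/accumulate lines of test. It is not a separate function in the original.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper isolating the trimming step of test():
// sort ascending, keep the fastest 90% of samples, average them in double.
inline double trimmed_mean(std::vector<float> t)
{
    assert(t.size() >= 2); // with one sample, 9/10 would truncate to zero kept
    std::sort(t.begin(), t.end());
    t.resize(t.size() * 9 / 10);
    return std::accumulate(t.begin(), t.end(), 0.0) / t.size();
}
```

For example, with nine samples of 1.0 and one interrupted run of 100.0, the plain mean is 10.9, while the trimmed mean discards the outlier and returns 1.0.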
> and
> memory usage (if any)?
If you limit it to new/delete (neglecting malloc), the following
would be easy to adapt:
http://groups.google.com/group/codewarrior.mac/browse_frm/thread/efb176352c161b08/0faec33468b1d6bd?lnk=st&q=insubject%3Aworld's+insubject%3Asimplest+author%3Ahinnant&rnum=2&hl=en#0faec33468b1d6bd
(some day I'm going to have to start using tinyurl (or whatever it is)).
The above is just an overloaded new using hash_map<void*, mem_info,
malloc_allocator<appropriate pair> >. (map could just as easily be
used).
You could collect statistics from the container, and from mem_info.
The malloc_allocator is necessary so that the container itself doesn't
pollute your measurement.
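
For illustration, here is a minimal sketch of that idea. It is not the code from the linked post: it uses std::map instead of hash_map, stores only the size rather than a mem_info record, and every name in it (malloc_allocator, mem_stats, stats, live_allocations) is hypothetical. It is not thread-safe and ignores aligned new.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <map>
#include <new>
#include <utility>

// Hypothetical allocator that goes straight to malloc/free, so the
// bookkeeping container never re-enters the replaced operator new.
template <class T>
struct malloc_allocator
{
    typedef T value_type;
    malloc_allocator() {}
    template <class U> malloc_allocator(const malloc_allocator<U>&) {}
    T* allocate(std::size_t n)
    {
        void* p = std::malloc(n * sizeof(T));
        if (!p) throw std::bad_alloc();
        return static_cast<T*>(p);
    }
    void deallocate(T* p, std::size_t) { std::free(p); }
};
template <class T, class U>
bool operator==(const malloc_allocator<T>&, const malloc_allocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const malloc_allocator<T>&, const malloc_allocator<U>&) { return false; }

typedef std::map<void*, std::size_t, std::less<void*>,
                 malloc_allocator<std::pair<void* const, std::size_t> > > alloc_map;

struct mem_stats { std::size_t live_bytes, peak_bytes, total_allocs; };

inline mem_stats& stats() { static mem_stats s = {0, 0, 0}; return s; }

inline alloc_map& live_allocations()
{
    // Built once in malloc'ed storage and never destroyed, so it stays
    // usable while other static objects are torn down at exit.
    // (A failed malloc here is not handled in this sketch.)
    static alloc_map* m = new (std::malloc(sizeof(alloc_map))) alloc_map;
    return *m;
}

// Replacement global operator new: record each block's size.
void* operator new(std::size_t n)
{
    void* p = std::malloc(n ? n : 1);
    if (!p) throw std::bad_alloc();
    live_allocations()[p] = n;
    mem_stats& s = stats();
    s.live_bytes += n;
    ++s.total_allocs;
    if (s.live_bytes > s.peak_bytes) s.peak_bytes = s.live_bytes;
    return p;
}

// Replacement global operator delete: forget the block and free it.
void operator delete(void* p) noexcept
{
    if (!p) return;
    alloc_map& m = live_allocations();
    alloc_map::iterator it = m.find(p);
    if (it != m.end())
    {
        stats().live_bytes -= it->second;
        m.erase(it);
    }
    std::free(p);
}

void operator delete(void* p, std::size_t) noexcept { operator delete(p); }
```

Reading stats() before and after the code under test gives the live and peak byte counts, and anything still in live_allocations() at the end is a candidate leak.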
-Howard
_______________________________________________
Unsubscribe & other changes:
http://lists.boost.org/mailman/listinfo.cgi/boost