From: Gregory Colvin (gregory.colvin_at_[hidden])
Date: 2003-05-08 10:32:42
My experience tuning our Java VM is similar, and it runs on a lot of
different CPUs. Still, there is reason to be suspicious of very small
changes, which might be repeatable for our benchmark set, yet have no
real meaning for normal use. And there is reason to be careful not to
waste time pursuing 3% tweaks instead of going for 100% breakthroughs.
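
One way to act on that suspicion is to compare the measured difference against the run-to-run noise before believing it. The following is only a minimal sketch, not how any of the projects mentioned here actually do it: the function names are made up for illustration, the timings are assumed to be repeated wall-clock measurements of the same benchmark, and the two-standard-error threshold is an arbitrary rule of thumb.

```python
import statistics


def relative_change(baseline, candidate):
    """Change in mean time as a fraction of the baseline mean (negative = faster)."""
    base = statistics.mean(baseline)
    return (statistics.mean(candidate) - base) / base


def exceeds_noise(baseline, candidate, k=2.0):
    """True if the gap between the two means is larger than k standard errors.

    The standard error of the gap is estimated from the sample variances of
    the two sets of timings (hypothetical helper, rule-of-thumb threshold).
    """
    gap = abs(statistics.mean(candidate) - statistics.mean(baseline))
    std_err = (statistics.variance(baseline) / len(baseline)
               + statistics.variance(candidate) / len(candidate)) ** 0.5
    return gap > k * std_err


if __name__ == "__main__":
    # Made-up timings in seconds, for illustration only (not measurements).
    quiet_before = [10.00, 10.02, 9.98, 10.01, 9.99]
    quiet_after = [9.90, 9.92, 9.88, 9.91, 9.89]
    print(f"{relative_change(quiet_before, quiet_after):.1%}",
          exceeds_noise(quiet_before, quiet_after))
```

With tight timings like the made-up ones above, a 1% change stands out clearly. With the same 1% gap and a few percent of scatter between runs, the same check fails. Note that this only tests repeatability on one benchmark set, not whether the change matters in normal use.
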
On Thursday, May 8, 2003, at 09:11 America/Denver, Darin Adler wrote:
> On Thursday, May 8, 2003, at 07:04 AM, Beman Dawes wrote:
>> A 2-3% timing difference probably isn't reliably repeatable in real
>> How code and data happen to land in hardware caches can easily swamp
>> out such a small difference. The version-to-version or step-to-step
>> differences in CPUs, memory, compilers, or operating systems can
>> cause that much difference in a given program. Differences need to
>> get up into the 20-30% range before they are likely to be reliably
>> repeatable across different systems.
>> At least that's been my experience.
> That has not been my recent experience. While working on my current
> project (the Safari web browser), we have routinely made 1% speedups
> that are measurable and have an effect across multiple machines and
> compilers (same basic CPU type and operating system), and we have also
> detected 1% slowdowns when we inadvertently introduced them.
> They add up. Ten 1% speedups result in a 9.5% speedup.
> It's true that differences in CPUs, memory, compilers, and operating
> systems can cause huge differences, but that does not mean that
> changes that make such small increases in performance are therefore
> not worthwhile.
> In our project, a 3% speed increase is considered a cause for
> I'm not sure, though, if this negates your point, Beman. Something
> that gives a 2-3% speedup for one Boost user might not be worth any
> level of obfuscation unless we can prove it provides a similar speedup
> for other Boost users.
> -- Darin
> PS: On the occasions where you can fix an algorithm in a way that
> gives a 10x speed increase, or a 25% one, that's even more exciting.
> To give you an idea of what I'm talking about, here's a log from a
> Monday, November 18, 2002
> and reducing the number of UString allocations
> object instead of a two level abstraction
> hash table and improving String instance handling
> ObjectImp and doing less ref/unref
> strings with custom code rather than sprintf
> Tuesday, November 19, 2002
> in the property map hash table code
> the "perfect hashing" hash tables used for static properties
> the UString representation so we don't recompute them
> list each time during sorting
> Wednesday, November 20, 2002
> ref/deref done by changing interfaces so they can deal directly with
> only on demand rather than for each function call
> Thursday, November 21, 2002
> objects on the stack rather than in the garbage-collected heap
> singly-linked list that shares tails rather than a non-sharing
> doubly-linked list with subtly different semantics
> Friday, November 22, 2002
> linked list to a vector, and using a pool of instances
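
A quick check of the arithmetic quoted above ("Ten 1% speedups result in a 9.5% speedup"). This is a sketch under one assumption: that each "1% speedup" removes 1% of the running time remaining at that point, so the changes compound multiplicatively. The function names are mine, for illustration only.

```python
def combined_time_reduction(per_step, steps):
    """Fraction of the original running time removed after `steps` changes,
    each of which removes `per_step` of the time remaining at that point."""
    return 1.0 - (1.0 - per_step) ** steps


def combined_throughput_gain(per_step, steps):
    """Same compounding, read the other way: each change raises throughput
    (work per unit time) by `per_step`."""
    return (1.0 + per_step) ** steps - 1.0


if __name__ == "__main__":
    print(f"{combined_time_reduction(0.01, 10):.2%}")   # 9.56%
    print(f"{combined_throughput_gain(0.01, 10):.2%}")  # 10.46%
```

Read as reductions in running time, ten 1% steps come to about 9.56%, which agrees with the quoted figure of roughly 9.5%. Read as throughput gains, they come to about 10.46%. Either way the small steps add up to something close to 10%.
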
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk