Boost logo

Boost :

From: Darin Adler (darin_at_[hidden])
Date: 2003-05-08 10:11:36

On Thursday, May 8, 2003, at 07:04 AM, Beman Dawes wrote:

> A 2-3% timing difference probably isn't reliably repeatable in real
> code.
> How code and data happens to land in hardware caches can easily swamp
> out such a small difference. The version-to-version or step-to-step
> differences in CPU's, memory, compilers, or operating systems can
> cause that much difference in a given program. Differences need to get
> up into the 20-30% range before they are likely to be reliably
> repeatable across different systems.
> At least that's been my experience.

That has not been my recent experience. While working on my current
project (the Safari web browser), we have routinely made 1% speedups
that are measurable and have an effect across multiple machines and
compilers (same basic CPU type and operating system), and we have also
detected 1% slowdowns when we inadvertently introduced them.

They add up. Ten 1% speedups result in a 9.5% speedup.

It's true that differences in CPUs, memory, compilers, and operating
systems can cause huge differences, but that does not mean that changes
that make such small increases in performance are therefore not

In our project, a 3% speed increase is considered a cause for

I'm not sure, though, if this negates your point, Beman. Something that
gives a 2-3% speedup for one Boost user might not be worth any level of
obfuscation unless we can prove it provides a similar speedup for other
Boost uses.

     -- Darin

PS: On the occasions where you can fix an algorithm in a way that gives
a 10x speed increase, or a 25% one, that's even more exciting. To give
you an idea of what I'm talking about, here's a log from a week I spent
increasing the speed of our JavaScript library:

Monday, November 18, 2002
- sped up JavaScript iBench by 70% by using a better sort algorithm and
reducing the number of UString allocations
- sped up JavaScript iBench by 6% by turning ExecState into a simple
object instead of a two level abstraction
- sped up JavaScript iBench by 7% by turning the property map into a
hash table and improving String instance handling
- sped up JavaScript iBench by 6% by hoisting the property map into
ObjectImp and doing less ref/unref
- sped up JavaScript iBench by 1.5% by converting integers into strings
with custom code rather than sprintf

Tuesday, November 19, 2002
- sped up JavaScript iBench by 2% by using masking instead of modulus
in the property map hash table code
- sped up JavaScript iBench by 3% by improving the implementation of
the "perfect hashing" hash tables used for static properties
- sped up JavaScript iBench by 1.5% by storing computed hash values in
the UString representation so we don't recompute them
- sped up JavaScript iBench by 6.5% by atomizing property identifiers
- sped up JavaScript iBench by 1.5% by not clearing and rebuilding the
list each time during sorting

Wednesday, November 20, 2002
- sped up JavaScript iBench by 5% by decreasing the amount of ref/deref
done by changing interfaces so they can deal directly with ValueImp
- sped up JavaScript iBench by 1% by making lists ref/unref less
- sped up JavaScript iBench by 7.5% by creating argument list objects
only on demand rather than for each function call

Thursday, November 21, 2002
- sped up JavaScript iBench by 3% by allocating the ActivationImp
objects on the stack rather than in the garbage-collected heap
- sped up JavaScript iBench by 11% by turning the scope chain into a
singly-linked list that shares tails rather than a non-sharing
doubly-linked list with subtly different semantics

Friday, November 22, 2002
- sped up JavaScript iBench by 10% by changing the List class from a
linked list to a vector, and using a pool of instances

Boost list run by bdawes at, gregod at, cpdaniel at, john at