From: John Maddock (john_at_[hidden])
Date: 2007-08-12 05:01:52
Lassi Tuura wrote:
>> It's not the first time I've seen this, and I still do not understand
>> what the belief of template meta-programming causing bloat is based
> Maybe I can help with that. For simple toys and well-contained
> programs, the compiler can do a lot to inline, as was kindly shown by
> someone. But take an application of ours as an example from somewhat
> more real software world.
> An average run pulls in several hundred shared libraries and has a
> memory footprint of about 500 MB. Of the 500 MB, ~100 MB is for
> text segments (machine code). Of the 100 MB, several tens of
> megabytes is redundant duplicate symbols, for example template
> instantiations and out-of-line versions of inline functions. There's
> 1 MB of code for (numerous copies of) a single template function.
> For a more real measure of redundancy, you'd have to add on top code
> that was actually inlined.
> Put another way, we estimate 10-25% of the entire program memory
> footprint is various degrees of _code_ redundancy.
Right, but here's the thing: had those shared libraries been written in C or
FORTRAN or some other language without templates there would still be code
redundancy, but they would have been implemented with cut-and-paste or
(shudder) macros. This is far more fragile than reusing templates -
especially heavily tested Boost or std ones.
To give you a real world example: I recently had cause to dig into a math
library written in a mixture of C and FORTRAN. In four different places the
same identical code cropped up (a cut and paste job). Had someone spotted a
bug in that code in one of those places, would they have updated all of them?
Maybe, maybe not. In contrast the Boost Math Toolkit implements similar
functionality as a template. One definition, easier to maintain, test etc.
I'm not trying to make a "mine's better than yours" argument here, only that
properly used C++ is far easier to write and maintain than languages that
Of course if that template is used with multiple distributions you get
multiple instances. Is this really code bloat? Or is it simply that
C/FORTRAN hide their bloat better?
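
To make the "one definition, several instances" point concrete, here is a minimal sketch. The names evaluate_polynomial, quadratic_f and quadratic_d are made up for illustration and are not taken from Boost.Math; the only claim is that a single template body yields one instantiation per distinct type it is used with.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not Boost.Math code): evaluate a polynomial with
// Horner's rule. coeffs[0] is the constant term.
template <typename T, std::size_t N>
T evaluate_polynomial(const T (&coeffs)[N], T x)
{
    T result = coeffs[N - 1];
    for (std::size_t i = N - 1; i > 0; --i)
        result = result * x + coeffs[i - 1];
    return result;
}

// Two users of the same source: each distinct T gets its own instantiation
// in the binary, but a fix to the template body reaches both.
inline float quadratic_f(float x)
{
    const float c[] = { 1.0f, 2.0f, 3.0f };   // 1 + 2x + 3x^2
    return evaluate_polynomial(c, x);
}

inline double quadratic_d(double x)
{
    const double c[] = { 1.0, 2.0, 3.0 };     // 1 + 2x + 3x^2
    return evaluate_polynomial(c, x);
}

// Quick self-check: both instantiations agree at x = 2.
inline void check_quadratic()
{
    assert(quadratic_f(2.0f) == 17.0f);
    assert(quadratic_d(2.0) == 17.0);
}
```

The float and double versions are separate machine code, which is the same amount of code a cut-and-paste C version would have had, only here it comes from one maintained source.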
As others have said already - template meta-programming is a different beast
altogether - it generates neither code nor data - and can often be used to
reduce code bloat by directing equivalent code to the same actual template
instance. Of course whether this actually happens is a question of code
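
As a rough illustration of directing equivalent code to the same template instance, the sketch below uses a small compile-time type map. All names (uint_of_size, count_bits_impl, count_bits) are hypothetical, and it assumes a C++11 compiler for <cstdint>, <type_traits> and static_assert. The type map itself produces no code or data; it only picks which instantiation the thin front end forwards to.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Compile-time map from a size in bytes to one canonical unsigned type.
// This is pure metaprogramming: nothing here ends up in the binary.
template <std::size_t Bytes> struct uint_of_size;
template <> struct uint_of_size<1> { typedef std::uint8_t  type; };
template <> struct uint_of_size<2> { typedef std::uint16_t type; };
template <> struct uint_of_size<4> { typedef std::uint32_t type; };
template <> struct uint_of_size<8> { typedef std::uint64_t type; };

// The real work: instantiated once per canonical type only.
template <typename U>
unsigned count_bits_impl(U value)
{
    unsigned n = 0;
    for (; value != 0; value &= static_cast<U>(value - 1))
        ++n;                      // each step clears the lowest set bit
    return n;
}

// Thin front end: int, unsigned and any other integral type of the same
// size all forward to the same count_bits_impl instantiation.
template <typename T>
inline unsigned count_bits(T value)
{
    static_assert(std::is_integral<T>::value, "integral types only");
    typedef typename uint_of_size<sizeof(T)>::type canonical;
    return count_bits_impl<canonical>(static_cast<canonical>(value));
}

// int and unsigned resolve to the same canonical type at compile time.
static_assert(std::is_same<uint_of_size<sizeof(int)>::type,
                           uint_of_size<sizeof(unsigned)>::type>::value,
              "int and unsigned share one implementation");

inline void check_count_bits()
{
    assert(count_bits(5) == 2u);
    assert(count_bits(5u) == 2u);
}
```

Whether the forwarding layer really disappears depends on the compiler inlining count_bits, so this is a sketch of the idea rather than a guarantee.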
> But it gets worse. The "useful" and "useless redundant code"
> interleave in the shared libraries. For each duplicated symbol the
> dynamic linker picks a "representative" symbol; the others will never
> be used. You won't know which ones the representative ones will be
> until at run time. The result: in memory code is like Swiss cheese,
> some hundred bytes of used code, some hundred bytes of unused code,
> used code, unused code. Not only do we have lots of memory committed
> to code, we make terrible use of it and have poor page locality.
That's not good, especially for those of us used to the VC++ linker that
discards duplicate symbols at link time :-)
> And surprise, some of the bigger bottlenecks plaguing our programs
> are L1 instruction cache pressure and ITLB misses. 40% of L2 cache
> accesses are for instructions -- and each and every L1 instruction
> cache miss stalls the CPU. Solving the code bloat is obviously
> making its way up on our performance optimisation priorities.
> Haphazard use of templates and inlining definitely make things
> worse. Hope this helps understand why.
Honestly, I believe this bloat issue is true of any language.
There is, however, one downside to templates I'd admit to :-) They make it
too easy to generate code in an ad-hoc fashion without thought of the
The difference is that C++ lets you spot the duplication: for example if you
discover that your application is using vector<int>, vector<unsigned>,
vector<long> and vector<heaven-knows-what-else> then you should definitely
be asking why they can't all be using the same instantiation!
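
One long-established way to get many element types onto a single instantiation is a thin typed wrapper over a shared untyped container. The sketch below does this for pointers rather than for the integer types named above, and ptr_vector is a hypothetical name: every ptr_vector<T> stores its elements in the same std::vector<void*>, so only the trivial forwarding functions are repeated per type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: a type-safe vector of T* backed by std::vector<void*>.
// Sketch only: T must not be const-qualified, since T* has to convert to void*.
template <typename T>
class ptr_vector
{
public:
    void push_back(T* p)               { impl_.push_back(p); }
    T* operator[](std::size_t i) const { return static_cast<T*>(impl_[i]); }
    std::size_t size() const           { return impl_.size(); }

private:
    std::vector<void*> impl_;   // the one instantiation shared by every T
};

inline void check_ptr_vector()
{
    int a = 1;
    double b = 2.5;

    ptr_vector<int> vi;         // both wrappers reuse std::vector<void*>
    ptr_vector<double> vd;
    vi.push_back(&a);
    vd.push_back(&b);

    assert(vi.size() == 1 && *vi[0] == 1);
    assert(vd.size() == 1 && *vd[0] == 2.5);
}
```

The wrapper functions are small enough that a compiler will normally inline them, leaving one copy of the container code however many pointer types are in use.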
> While we all hope compiler technology to improve, the reality check
> is that the impact factor of 20 years of compiler research is
> trivially measured: it's the difference between -O0 vs. -O3. If your
> program falls into the class where that makes a world of difference,
> good for you. Most programs in this world are not in that class, and
> you start engineering and making trade-offs.
You don't say which compiler you're using (gcc?), but there are many
commercial compilers that will automatically spot genuine duplication and
remove duplicates from the linked executable (shared libraries always make
this harder though I admit). After that, I'm afraid it's down to good code
management: taking care that your developers don't get "instantiation
happy", and do coordinate which template instances they use between them.
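
One way to coordinate which instances get generated, and where, is explicit instantiation. The sketch below is a single file for brevity, with comments marking what would live in a shared header and what would live in exactly one source file. running_sum is a hypothetical class, and the "extern template" form is assumed to be available (it is standard since C++11 and was a compiler extension before that).

```cpp
#include <cassert>

// ---- would live in a shared header ----
template <typename T>
class running_sum
{
public:
    running_sum();
    void add(T x);
    T total() const;

private:
    T total_;
};

// Members are defined outside the class so they are not implicitly inline.
template <typename T> running_sum<T>::running_sum() : total_() {}
template <typename T> void running_sum<T>::add(T x) { total_ += x; }
template <typename T> T running_sum<T>::total() const { return total_; }

// Explicit instantiation declaration: translation units that see this do
// not implicitly instantiate running_sum<double>'s members themselves.
extern template class running_sum<double>;

// ---- would live in exactly one source file ----
// Explicit instantiation definition: the single agreed-upon copy.
template class running_sum<double>;

inline void check_running_sum()
{
    running_sum<double> s;
    s.add(1.5);
    s.add(2.5);
    assert(s.total() == 4.0);
}
```

Listing the agreed instances in one header also gives a single place to review when someone wants to add a new one.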