Subject: Re: [boost] [Booster] Or boost is useless for library developers
From: John Maddock (boost.regex_at_[hidden])
Date: 2010-05-20 13:57:59


>>> - Inline functions are the best way to improve performance.
>> I've found this to be true in my own work. So have lots of other people.
>> Prove us wrong.
>
> Let me have a go at describing some bloat. It's not in my interest to
> prove anything. I'll just tell you the cost we see in real-life applications.

Thanks for providing some real numbers - much better than endless
speculation!

If there are any big offenders that are Boost libraries then let's hear about
it: I'm sure we can take it ;-)

> On Linux, GCC 4.5.x, x86_64, we have executables which load 598 DSO images
> mapped in 611 memory regions, corresponding to:
>
> - 263'431'108 100.0% bytes mapped from shared libraries
> - 261'638'634 99.3% bytes total allocated to sections
> - 1'792'474 0.7% bytes padding not in any section (= rounding to page)
>
> The breakdown by sections that are actually loaded into memory:
>
> - 114'086'965 43.3% code (.text)

Ouch, that's one big application.

> - 65'205'187 24.8% dynamic symbols and related tables
> - 26'825'486 10.2% unwind tables
> - 24'356'624 9.2% plt + relocations + related tables
> - 19'000'712 7.2% global data
> - 11'325'928 4.3% global common data (.bss)
> - 749'576 0.3% various shared library headers
> - 82'138 0.0% global constructors and destructors
> - 5'890 0.0% glibc memory management voodoo
> - 128 0.0% thread-specific data
>
> That's ~55% "real stuff", ~25% of symbol tables, ~10% unwind tables, ~10%
> relocations and PLT. The application virtual memory size is about a
> gigabyte, so this is a major fraction of the overall footprint.
>
> There are 544'533 symbols which represent 142'548'100 bytes. Of this there
> are 272'190 weak symbols, or 43'565'063 bytes.
>
> A significant fraction of those weak symbols represent template
> duplication across libraries, but that's not the only form of bloat we
> see. There are 2'599 symbols with at least 10 duplicates, total 5'832'419
> bytes, and 118 vtables with at least 10 duplicates (about 300k).

Are you able to use explicit template instantiation in a separate source
file to reduce duplication?
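
For what it's worth, here is a rough sketch of what I mean by instantiating a
template in one place only. Everything in it is made up for illustration: the
Accumulator class, the file names in the comments, and sum_of_three are all
hypothetical. It also assumes a compiler that accepts "extern template"
(a GCC extension today, and part of C++0x). It is squashed into one file so it
compiles on its own; in a real build the three parts would live in separate
files.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// ---- would live in accumulator.hpp (hypothetical header) ----
template <typename T>
class Accumulator {
public:
    void add(const T& value) { items_.push_back(value); }

    T total() const {
        T sum = T();
        for (std::size_t i = 0; i < items_.size(); ++i) sum += items_[i];
        return sum;
    }

    std::size_t count() const { return items_.size(); }

private:
    std::vector<T> items_;
};

// Explicit instantiation declaration: asks every translation unit that
// includes the header not to instantiate Accumulator<double> implicitly.
// The compiler may still inline the member functions where it sees fit.
extern template class Accumulator<double>;

// ---- would live in accumulator.cpp (hypothetical source file) ----
// Explicit instantiation definition: the out-of-line copies of the
// members of Accumulator<double> are emitted here.
template class Accumulator<double>;

// ---- would live in client code, possibly in another shared library ----
double sum_of_three(double a, double b, double c) {
    Accumulator<double> acc;
    acc.add(a);
    acc.add(b);
    acc.add(c);
    assert(acc.count() == 3);
    return acc.total();
}
```

Whether this helps in your case depends on how many of the duplicated symbols
come from a small set of commonly used instantiations; I haven't measured it
on anything of that size.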

> So over half of the symbols and about a third of the size are
> ill-advisedly generated inline functions, virtual function tables (19'043
> vtables = 2'802'928 bytes) and type info objects and names (45'961
> typeinfo objs + names = 3'142'851 bytes). This goes with accompanying
> symbol tables, PLTs, unwind tables, and so on.
>
> A significant fraction of the 60+ MB symbol tables is obviously for long
> mangled names.

That is one big problem with templates that I'll freely admit to. Compiler
vendors are certainly aware of this, and some of them have reduced mangled
name size over the years, but I guess it'll always be an issue :-(
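
To get a feel for how quickly the names grow, something along these lines can
be used. It is only a sketch: the Nested typedef and the function names are
invented for the example, and it relies on typeid(T).name() returning the
mangled type name, which GCC does but which the standard leaves
implementation-defined (other compilers may return a readable name instead).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <map>
#include <string>
#include <typeinfo>
#include <utility>
#include <vector>

// Length of the implementation-defined name reported for T.
// On GCC this is the mangled encoding of the type, which is also
// embedded in the symbol of every function that mentions the type.
template <typename T>
std::size_t type_name_length() {
    const char* name = typeid(T).name();
    assert(name != 0);
    return std::strlen(name);
}

// A modestly nested type (hypothetical, for illustration only).
typedef std::map<std::string,
                 std::vector<std::pair<std::string, int> > > Nested;

std::size_t nested_name_length() { return type_name_length<Nested>(); }
std::size_t int_name_length() { return type_name_length<int>(); }
```

Comparing nested_name_length() with int_name_length() shows the difference
for a single type; a symbol for a function taking several such arguments
carries an encoding for each of them.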

Regards, John.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk