Subject: Re: [boost] [Booster] Or boost is useless for library developers
From: Lassi Tuura (lat_at_[hidden])
Date: 2010-05-20 14:22:28


Hi,

> If there are any big offenders that are Boost libraries then let's hear about it: I'm sure we can take it ;-)

Let's see, let me start with these. They get instantiated repeatedly everywhere and need out-of-line versions:

 - boost::noncopyable_::noncopyable -> need out-of-line typeinfo
 - boost::detail::sp_counted_base::destroy()
 - boost::detail::sp_counted_base -> need out-of-line dtor, vtable, typeinfo
 - boost::detail::shared_count -> need out-of-line dtor, vtable, typeinfo
 - boost::system::system_error -> need out-of-line dtor, vtable, typeinfo
 - boost::system::error_category -> need out-of-line dtor, vtable, typeinfo
 - boost::system::error_category::equivalent()
 - boost::system::error_category::default_error_condition()
 - boost::system::error_code::unspecified_bool_true()
 - boost::system::system_error::what() const
 - boost::exception_detail::clone_base -> need out-of-line dtor, vtable, typeinfo
 - boost::thread_exception -> need out-of-line dtor, vtable, typeinfo
 - boost::lock_error -> need out-of-line dtor, vtable, typeinfo
 - boost::thread_resource_error -> need out-of-line dtor, vtable, typeinfo
 - boost::bad_function_call::bad_function_call()

Need to eliminate objects generated into every source file:

 - boost::system::system_category
 - boost::system::posix_category
 - boost::system::native_ecat
 - boost::system::generic_category
 - boost::system::errno_ecat
 - boost::tuples::ignore
 - boost::lambda::detail::(anonymous namespace)::constant_null_type

If your reaction is "surely those will be generated inline / elided by the compiler", well, no, they weren't in our code.

> Ouch, that's one big application.

Actually that was a small one :-) It only pulls in ~600 DSOs. We have 3000.

>> A significant fraction of those weak symbols represent template duplication across libraries, but that's not the only form of bloat we see. There are 2'599 symbols with at least 10 duplicates, total 5'832'419 bytes, and 118 vtables with at least 10 duplicates (about 300k).
>
> Are you able to use separate file template instantiation to reduce duplication?

This is duplication across shared libraries, not within a single library. I've tried to push us to build a handful of large libraries instead of 3000 mostly small ones, but given other constraints that will be a multi-year project. (Not your fault of course.)

>> So over half of the symbols and about a third of the size are ill-advisedly generated inline functions, virtual function tables (19'043 vtables = 2'802'928 bytes) and type info objects and names (45'961 typeinfo objs + names = 3'142'851 bytes). This goes with accompanying symbol tables, PLTs, unwind tables, and so on.
>>
>> A significant fraction of the 60+ MB symbol tables is obviously for long mangled names.
>
> That is one big problem with templates that I'll freely admit to. Compiler vendors are certainly aware of this, and some of them have reduced mangled name size over the years, but I guess it'll always be an issue :-(

That's one way to look at it, yes, and I look forward to seeing the fruits of that.

I personally just tend to encourage developers to understand what compilers in the real world do, adapt their code to that, and set conventions to match. It's of course a moving target. But you cannot just splat all the code into headers and hope the compiler does something smart with it; that will just not produce good results in a big system.

The same of course applies to any bad idea, like returning vector<int> by value, constructing a std::map inside a for loop, or having a std::map<..., std::map<..., std::vector<SomethingComplex>>> inside some object that has a compiler-generated copy ctor, with everything declared inline. That's just a recipe for generating 1 MB of machine code and a memory allocation storm whenever someone decides to make a copy.

In general I find people are over-optimistic about what the compiler will really do to your code.

I'll be the first to say there are projects where the above doesn't matter. But there are projects where it does.

Regards,
Lassi

