Subject: Re: [boost] [explore] Library Proposal: Container Streaming
From: Vladimir Prus (vladimir_at_[hidden])
Date: 2009-12-02 02:06:45

David Bergman wrote:

>>>> And that code still has
>>>> to be parsed by the compiler.
>>> This is a different issue. Is that your problem, that you think compilers have a problem with
>>> some complex constructs inside the Boost.Serialization library? Does the compiler fail? Consume
>>> too much resources? Too much time?
>> Yes, too much time and resources.
> How much time? I am not asking to be cute, but just curious as to how much extra building time is
> needed for Boost.Serialization. I have used it heavily in a lot of projects, and have not been
> disturbed by it.
> When you say resources, do you mean that the compiler (or linker) uses a lot of memory to handle
> Boost.Serialization?
>> When building the debug variant, a program that does nothing
>> but create a std::vector<int> is a 53K object file. When I add serialization, I get:
>> - a useless warning shown below, and no, I am not willing to apply const_cast or declare things
>> const just because Boost.Serialization has a strange position about what const really means
> I did not follow you here; could you explain what that strange position is?

It's strange that you used Boost.Serialization in a lot of projects without running into that.

Note that 1.41 appears to have removed that bit of docs, and replaced assertion with the warning
I have posted.

>> - a 387K object file
> Let me check this. I am using -O2

That's not a debug build, for sure.

> on a Snow Leopard box with Apple's GCC 4.2.1, with the demo
> file. With *no* inclusion of Boost.Serialization headers nor reference to that library, I get a
> 10kB executable (basically a simple Hello World app with some vector manipulations). I append the
> source code at the end of this post for self-containment purposes :-)
> Let us introduce Boost.Serialization, and see what happens with a varying number of such
> invocation locations (log << vi) in our single demo file, compiling to an executable:
> no Boost.Serialization: 10kB
> N=0 (i.e., just including Boost.Serialization headers and linking with library): 27kB
> N=1: 51kB
> N=2: 55kB
> N=10: 55kB
> N=100: 55kB
> N=500: 65kB
> N=1000: 73kB
> N=4000: 118kB
> Performing a mental linear regression, and using N=1 as a fixed constant, we get a linear function
> in extra (disk) space of
> 41kB + N * 20B
> Note that this extra space is relative to code containing no Boost references at all.
> So, basically 20 bytes for each such invocation location. If you think that is strenuous, you
> could always wrap the invocation in a function (and force it to not be inlined), in which case you
> get a constant space addition of 41kB.

First, I explicitly did not measure per-invocation overhead, as it does not seem to be of
much importance for a specific application. I don't plan to have debug output all over.
Second, it's not clear to me why you think that wrapping invocation in a function will entirely
remove the 20B per call overhead. You need some instructions to call a function.
Third, I explicitly measured debug build because it is the variant whose size is most affected by
dependencies, and the size of debug builds obviously matters, as anybody who ever used the GDB
debugger can attest.
Fourth, I explicitly did not measure binary size. If we go that route, we'll have to factor
in the size of And if we do, we'd have to count relocations that it
contains and the effect of those relocations on application startup time.
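
To make the second point concrete, here is a minimal sketch of such a non-inlined wrapper. It rests on assumptions: the names `output`, `DEMO_NOINLINE` and `demo` are illustrative (the `output` template from the quoted post is not reproduced here), the attribute spellings are the GCC/Clang and MSVC ones, and a std::ostringstream stands in for the Boost archive so that the example needs no Boost headers.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Compiler-specific "do not inline" marker; expands to nothing elsewhere.
#if defined(_MSC_VER)
#  define DEMO_NOINLINE __declspec(noinline)
#elif defined(__GNUC__)
#  define DEMO_NOINLINE __attribute__((noinline))
#else
#  define DEMO_NOINLINE
#endif

// Hypothetical wrapper: the formatting code is instantiated once per
// (Stream, T) pair instead of being expanded at every call site.
template <class Stream, class T>
DEMO_NOINLINE void output(Stream& out, const std::vector<T>& v) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (i != 0) out << ' ';
        out << v[i];
    }
    out << '\n';
}

// Two call sites sharing the single out-of-line instantiation. Each one
// still has to set up its arguments and emit a call instruction.
inline std::string demo() {
    std::vector<int> vi;
    vi.push_back(1);
    vi.push_back(2);
    vi.push_back(3);

    std::ostringstream log;
    output(log, vi);  // call site 1
    output(log, vi);  // call site 2
    assert(log.good());
    return log.str();
}
```

The wrapper moves the expensive part into one place, but the per-call cost does not drop to zero: every call site keeps its own argument setup and call instruction.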

> NOTA BENE: more than a few hundred invocation locations in a project seems simply weird.
>> When doing a release build, the source with serialization takes 1.3 second to compile, while
>> the empty one takes 0.3 second.
> Here you have a better case, methinks. Let us do the similar tests as above, but now for
> compilation+linking time (linking time is much lower than compilation time...), where I measure
> the user + system time spent, which should be fairer for extrapolations to bigger builds (and my
> concurrent threads running on my box...):
> no Boost.Serialization: 0.45s
> N=0: 1.59s
> N=1: 1.59s
> N=2: 1.62s
> N=10: 1.62s
> N=100: 1.85s
> N=500: 2.68s
> N=1000: 4.02s
> N=4000: 13.52s
> Again fixating the constant at N=1, and performing some mental linear regression again, we get a
> linear function in extra build time of
> 1.14s + N * 3.4ms
> The biggie here is of course the constant, of an extra 1.14s, as you also reported. But note that
> it is relative to *no* Boost use at all.
> When I wrap the output in the template function 'output' shown in the bottom of this post, and
> force that function not to be inlined (actually using a function pointer), I get
> N=1: 1.64s
> N=1000: 2.11s
> N=4000: 4.15s
> Using mental linear regression again, we get:
> 1.19s + N * 1.2ms
> NOTA BENE: I used no precompiled headers or such "cheating" which is the norm for bigger projects
> :-)
>> It does not seem like this overhead is justifiable for a trivial task of just getting my
>> std::vector printed.
> I agree, after measuring and being somewhat surprised by that large constant. But, again, it is
> relative to no Boost use at all, and no precompiled headers or other compiler/linker tricks. In my
> regular code, the addition of Boost.Serialization does not add much compilation time. After all,
> it is just a few ms per invocation location.
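
The "mental linear regression" on the first timing table quoted above can be redone mechanically. The sketch below is one possible reading, not necessarily the method used in the quoted message: the constant is taken as the N=1 time minus the no-Boost baseline, and the per-invocation cost is a least-squares slope for a line forced through the N=1 sample. The names `Sample`, `Fit`, `fit_pinned` and `fit_quoted_timings` are made up for this example; the numbers are copied from the quoted table.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Sample { double n; double seconds; };   // one row of the table
struct Fit { double constant; double per_call; };

// Fits seconds = first.seconds + per_call * (n - first.n) by least squares,
// i.e. a line pinned to the first sample. The reported constant is the
// extra cost of that first sample over the baseline build.
inline Fit fit_pinned(double baseline, const Sample& first,
                      const std::vector<Sample>& rest) {
    assert(!rest.empty());
    double num = 0.0;
    double den = 0.0;
    for (std::size_t i = 0; i < rest.size(); ++i) {
        const double dn = rest[i].n - first.n;
        num += dn * (rest[i].seconds - first.seconds);
        den += dn * dn;
    }
    assert(den > 0.0);
    Fit f;
    f.constant = first.seconds - baseline;
    f.per_call = num / den;
    return f;
}

// Timing figures from the quoted message (user + system seconds).
inline Fit fit_quoted_timings() {
    const double baseline = 0.45;              // no Boost.Serialization
    const Sample first = {1.0, 1.59};          // N=1
    const Sample data[] = {
        {2.0, 1.62}, {10.0, 1.62}, {100.0, 1.85},
        {500.0, 2.68}, {1000.0, 4.02}, {4000.0, 13.52}
    };
    const std::vector<Sample> rest(data, data + sizeof(data) / sizeof(data[0]));
    return fit_pinned(baseline, first, rest);
}
```

With these inputs the constant is 1.14s and the slope comes out a little under 3ms per invocation. The slope depends on the fitting method, so it need not reproduce the mentally estimated 3.4ms exactly.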

I guess the above discussion has clearly demonstrated one thing: there is no established set of
guidelines for measuring the cost of a dependency library, especially a highly templated
one. There are different options regarding:

- What to measure: size, compilation time, startup time, or debugger startup time?
- What build variant(s) to use?
- Should the overhead of the first use be measured, or that of each successive invocation?
- What is the baseline? Empty file? File not using Boost?
- If we have the numbers for everything above, what to make of them?

Definitely, case studies regarding specific libraries might give interesting results.

>> ../../../boost/mpl/print.hpp: In instantiation of
>> ‘boost::mpl::print<boost::serialization::STATIC_WARNING_LINE<98> >’:
>> ../../../boost/serialization/static_warning.hpp:92: instantiated from
>> ‘boost::serialization::static_warning_test<false, 98>’
>> ../../../boost/archive/detail/check.hpp:98: instantiated from ‘void
>> boost::archive::detail::check_object_tracking() [with T = std::vector<int, std::allocator<int>
>> >]’
>> ../../../boost/archive/detail/oserializer.hpp:313: instantiated from ‘static void
>> boost::archive::detail::save_non_pointer_type<Archive>::invoke(Archive&, T&) [with T =
>> std::vector<int, std::allocator<int> >, Archive = boost::archive::text_oarchive]’
>> ../../../boost/archive/detail/oserializer.hpp:525: instantiated from ‘void
>> boost::archive::save(Archive&, T&) [with Archive = boost::archive::text_oarchive, T =
>> std::vector<int, std::allocator<int> >]’
>> ../../../boost/archive/detail/common_oarchive.hpp:69: instantiated from ‘void
>> boost::archive::detail::common_oarchive<Archive>::save_override(T&, int) [with T =
>> std::vector<int, std::allocator<int> >, Archive = boost::archive::text_oarchive]’
>> ../../../boost/archive/basic_text_oarchive.hpp:80: instantiated from ‘void
>> boost::archive::basic_text_oarchive<Archive>::save_override(T&, int) [with T = std::vector<int,
>> std::allocator<int> >, Archive = boost::archive::text_oarchive]’
>> ../../../boost/archive/detail/interface_oarchive.hpp:64: instantiated from ‘Archive&
>> boost::archive::detail::interface_oarchive<Archive>::operator<<(T&) [with T = std::vector<int,
>> std::allocator<int> >, Archive = boost::archive::text_oarchive]’
>> d.cpp:13: instantiated from here
>> ../../../boost/mpl/print.hpp:55: warning: comparison between signed and unsigned integer
>> expressions
>> <d.cpp><d_empty.cpp>
> I have not seen this before.

Presumably, because in 1.40 this would have been a static assertion.

- Volodya
