Subject: Re: [boost] [containers] Clarifications on incomplete type support?
From: Mikael Persson (mikael.s.persson_at_[hidden])
Date: 2014-09-24 19:26:13
> Do you notice this overhead in C++03 or C++0x compilers, in both modes?
Can't say really, the move from STL containers to Boost.Containers was a
while ago (approaching 2 years, maybe), so I don't really remember, but I
would assume the problem is as much of a problem in either case. In my
experience the pre-processor tricks that were common in C++03, which were
replaced by the enhancements of C++11, are not really much worse (or
better) than their C++11 equivalent. This is because what consumes a lot of
time for compilation is the instantiations of templates (not the amount of
code or the amount of overloads), and whether multiple templates are
generated with the pre-processor or whether they are generated with
variadic templates or some combination of specializations and Sfinae-style
tricks, the real problems are with the combinatorial explosion of template
Some (possibly obvious) tips:
Limit the depth of TMP recursive meta-evaluations.
Never use more than the bare minimum of template arguments in the
meta-functions or other helper templates.
Things like Sfinae can be a significant problem for compilation times,
because they require context-aware complex template instantiations to
evaluate each overload, prefer to use tag-dispatching or regular overloads.
Variadic templates are also really dangerous for compilation times. It is
very easy to create deep recursions and O(N^2) template instantiations in
cases where the C++03-style of template would have N templates of O(N)
instantiation complexity, which is much better.
> However I haven't diagnosed where the problem lies and I have no much
time and knowledge to investigate it.
Yeah. I hear you. This is a really hard problem to tackle, and it's never
very obvious what causes the most issues.
> In any case without measurements, we don't know where we waste the
> I don't know if measuring this overhead reliably is possible.
It's sort of possible.
Some time ago I did some crude tests like that on a project and it was very
tedious. I basically enabled compilation profiling on GCC and created small
cpp files that contained explicit template instantiations of selective
pieces of the library's code. It's crude and tedious, you start with
high-level (end-user) templates to see which are worse, and work your way
through the template instantiations that they trigger until you find the
templates that seem to be the worse offenders, and then try to understand
way (e.g., find the complexity (O(N)..) of the number of template
instantiations.. it's really tedious and time-consuming, but it can pay off
sometimes. A similar trick is to comment out certain features (e.g.,
certain functions or parts of the container) and see how compilation times
change (it only needs to compile, it does not need to be able run, so you
can comment out a lot of stuff without having to replace them with dummy
code or stubs).
But recently, there has been some work on a template instantiation profiler
(time and memory) and a template instantiation debugger, which is called
Templight (an instrumentation of Clang). The problem is that it is still in
a beta stage. I'm actually the one who picked up the code from the original
authors (who seem to have abandoned it) and enhanced it as well as creating
an interactive (GDB-style) debugger for stepping through template
instantiations in the same way you can step through code execution on a
regular run-time debugger. However, this project has been on my back-burner
for a while because every time I hack away at the Clang code-base I get
very frustrated and discouraged, because it is such a huge mess and a
horrible code-base to work with. Anyway, in theory, you can use Templight
on profile the time and memory consumed by the compiler for each
instantiation templates, just like a run-time profiler, but for template
instantiations instead of function calls.
I've been trying to bump up the priority of that item ("finish templight
coding") on my to-do list, and hope to get to it some time in the next few
weeks. Part of being able to get to that to-do item is wrapping up my work
on those major additions to the BGL (which is another side-project that has
been lingering for far too long).
> Which containers do you use in BGL?
I use most of them. The main chunk of code is that I entirely
re-implemented the adjacency_list class with Boost.Container on the
back-end. As you probably know, the adj-list class provides options for
choosing the type of containers used under-the-hood, and so, I implemented
it for every sensible combination of vector, list, set, multiset,
unordered_set, unordered_multiset, and "pool" (which is a vector container,
but leaves "holes" in it upon removal of elements to avoid invalidating
indices / iterators to other elements). My implementation of adj-list is
much leaner than the existing one, but it takes quite a bit more time to
compile it (at least, so it seems, I haven't measured it), and so, I now
suspect it's coming from the Boost containers. But I cannot really test
against STL containers any more, because I rely on incomplete type
guarantees and n3644 iterators ;) neither of which are implemented in
standard STL releases.
But I agree that the vector container is probably not a problem, because I
use it elsewhere, and it compiles fast (this is just my subjective
impression, of course, it's hard to measure).
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk