Subject: [boost] Template Instantiation Profiler with Callgrind Output (KCacheGrind)
From: Mikael Persson (mikael.s.persson_at_[hidden])
Date: 2015-01-25 20:33:42

Hi boost developers and users,

I just wanted to announce a recent development in my little pet project,
Templight: a template instantiation profiling tool based on Clang. I think
many of you would be interested in experimenting with it, because many
Boost libraries are great examples of template-heavy code, and their
maintainers, developers and users might want to gather data on and
diagnose the real compilation costs (in time and memory) of various
components of Boost.

Templight is a Clang-based profiler that can act as a drop-in replacement
(e.g., in cmake) for the clang compiler, while additionally generating a
complete trace of the template instantiation history with the associated
time and (optionally) memory costs. This allows you not only to see how
deeply recursive certain template instantiations are, but also to identify
which instantiations the compiler spends the most time or memory on. It is
basically to meta-programming what a run-time profiler is to ordinary code.
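
To give a rough idea of the kind of code whose compilation such a trace
describes, here is a minimal sketch. It is not taken from Templight or
Boost; the name "Fib" is purely illustrative, and it assumes a C++11 (or
later) compiler. Each instantiation of Fib<N> forces the instantiation of
Fib<N-1> and Fib<N-2>, so the compiler ends up with a nested chain of
instantiations, which is the sort of history a template instantiation
profiler records.

```cpp
#include <cstdint>

// Hypothetical example (not from Templight or Boost): a naive compile-time
// Fibonacci. Instantiating Fib<N> forces the compiler to instantiate
// Fib<N-1> and Fib<N-2>, and so on down to the two base cases.
template <unsigned N>
struct Fib {
    static constexpr std::uint64_t value =
        Fib<N - 1>::value + Fib<N - 2>::value;
};

// Base cases that terminate the recursion.
template <>
struct Fib<0> {
    static constexpr std::uint64_t value = 0;
};

template <>
struct Fib<1> {
    static constexpr std::uint64_t value = 1;
};

// Requesting Fib<30> triggers a chain of nested instantiations about 30
// levels deep; all of that work happens at compile time.
static_assert(Fib<30>::value == 832040, "unexpected Fibonacci value");
```

Nothing here runs at run-time: the whole cost is paid inside the compiler.
In a profile of this translation unit one would expect to see Fib<30> near
the root, with Fib<29>, Fib<28>, and so on nested beneath it.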

The templight profiler itself has reached a fairly stable state. It works
at least under Linux and Windows (through "templight-cl.exe", a drop-in
replacement for "clang-cl.exe", which is itself compatible with MSVC's
"cl.exe"), and it probably works wherever clang works, so you are welcome
to try it. I have also used it successfully to build a few cmake-based
projects (by setting the CC and CXX environment variables to point to
templight). Templight is also currently used in the back-end of the
"metashell" project (an interactive shell interpreter and debugger for C++
template meta-programs). You can build the templight profiler against the
clang source (with a patch + some added code), as described in the GitHub
readme:

Most recently, I have created a separate set of tools to convert
templight trace files to other formats. The most awesome of these outputs
is the "callgrind" output, which produces a "meta" call-graph that is
similar to (and in the same format as) the call-graphs generated by
run-time profiling tools like callgrind, except that the profiling data
is, of course, related to compilation costs (time / memory). That
conversion tool is available here:

With that, you can open the converted templight trace files in the
visualization tool KCacheGrind, which is great (it works on both Unix-like
systems and Windows, thanks to Qt5). Here is a screenshot of that for one
of Boost.Container's example programs:

I really hope that some of you will give this a try, that it will be
useful to you, and that you can provide feedback and feature requests. I
couldn't think of any better target audience for this tool.



Sven Mikael Persson, M.Sc.(Tech.)
PhD Candidate and Vanier CGS Scholar,
Department of Mechanical Engineering,
McGill University,
