From: Simonson, Lucanus J (lucanus.j.simonson_at_[hidden])
Date: 2008-05-09 14:18:50


Boris Gubenko wrote:
>> Steven Watanabe wrote:
>>
>>> Ok. It would be great if compilers supported this directly.
>>>
>>
>> Just fyi: the cxx compiler has verbose template instantiation
>> mode. On Tru64, for example:
>>
Steven wrote:
>I only see function template instantiations, I'm more interested in
>class template instantiations because that's where the metaprogramming
>is done. Am I missing something?
>
>I'd also like to have all template instantiations, not just those
>that are triggered from inside other template instantiations.
>(Although this doesn't make a huge difference)

Here is the icc documentation for the -prof-gen flag. Apparently it
instruments the code for every basic block to enable profile-guided
optimization later on. This should include template instantiations and
basic blocks from inlined functions.

-----------------------------------------
prof-gen, Qprof-gen
Instruments a program for profiling.
IDE Equivalent
Windows: General > PGO Phase
Architectures
IA-32 architecture, Intel(r) 64 architecture, IA-64 architecture
Syntax
Linux and Mac OS X: -prof-gen
-prof-genx
Windows: /Qprof-gen
/Qprof-genx
Arguments
None
Default
OFF Programs are not instrumented for profiling.
Description
This option instruments a program for profiling to get the execution
count of each basic block. It also creates a new static profile
information file (.spi).
If -prof-genx or /Qprof-genx is specified, extra information (source
position) is gathered for code-coverage tools. If you do not use a
code-coverage tool, this option may slow parallel compile times.
If you are doing a parallel make, this option will not affect it.
These options are used in phase 1 of the Profile Guided Optimizer (PGO)
to instruct the compiler to produce instrumented code in your object
files in preparation for instrumented execution.
------------------------------------------

Later on you would use -prof-gen-sampling to, among other things, create
a map from object code to line numbers in the source code. This map
should be a superset of the data you are looking for, which is the
instantiation count for templates. For templates that don't end up with
any object code (meta-functions) I think you would find no
instantiations in the map, whereas with your warning-based approach you
might still find that the compiler evaluated the meta-function many
times. I guess it depends on what you are looking for.

------------------------------------------------------
prof-gen-sampling, Qprof-gen-sampling

Prepares application executables for hardware profiling (sampling) and
causes the compiler to generate source code mapping information.
IDE Equivalent

None
Architectures

IA-32 architecture
Syntax
Linux and Mac OS X: -prof-gen-sampling
Windows: /Qprof-gen-sampling
Arguments

None
Default
OFF Application executables are not prepared for hardware profiling
and the compiler does not generate source code mapping information.
Description

This option prepares application executables for hardware profiling
(sampling) and causes the compiler to generate source code mapping
information.

The application executables are prepared for hardware profiling by using
the profrun utility followed by a recompilation with option -prof-use
(Linux and Mac OS X) or /Qprof-use (Windows). This causes the compiler
to look for and use the hardware profiling information written by
profrun (by default, into a file called pgopti.hpi).

This option also causes the compiler to generate the information
necessary to map hardware profile sample data to specific source code
lines, so it can be used for optimization in a later compilation. The
compiler generates both a line number and a column number table in the
debug symbol table.

This process can be used, for example, to collect cache miss information
for use by option ssp on a later compilation.
Alternate Options

None
See Also

prof-use, Qprof-use compiler options

ssp, Qssp compiler options
----------------------------------------------

My own interest is that I would like to do performance profiling and
tuning of template-instantiated code with VTune. It looks like these
compiler options in icc are better suited to that than to gathering the
static information you are looking for.

Is there a way to write a meta-function that implements a counter?

The idea is that each time a template is instantiated by the compiler, a
meta-function counter (inserted by a script similar to your warning)
would be evaluated. Then you could collect the counts computed at
compile time and print them to a file at runtime along with the
associated type name information from the counter's template parameter.

template <typename T>
struct meta_counter {...};

template < something>
struct my_template {
        //inserted by script
        meta_counter<typeof_this_template>::increment_somehow;
        ...
};

I'm not sure how retrieving the count would work. Perhaps you could
create a global const set to the value and just use nm to retrieve
it.
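
One possible variation, offered only as a rough sketch and not as a
worked-out answer: rather than counting at compile time, each
instantiated specialization could register itself during static
initialization, and the list could be printed at runtime. The names
instantiation_registry, force_use, registrar and report_instantiations
are made up for illustration. The sketch assumes the compiler
instantiates the static member once it is used as a template argument,
and it records which specializations were instantiated, not how many
times the compiler evaluated each one, so pure meta-functions are only
covered if the script can insert the typedef into them as well.

```cpp
#include <iostream>
#include <set>
#include <string>
#include <typeinfo>

// Hypothetical registry of instantiated specializations. A function-local
// static sidesteps initialization-order problems between static objects.
inline std::set<std::string>& instantiation_registry() {
    static std::set<std::string> registry;
    return registry;
}

// Helper whose only job is to accept a reference as a template argument.
template <typename U, U> struct force_use {};

template <typename T>
struct meta_counter {
    // Records the (implementation-defined) type name when constructed.
    struct registrar {
        registrar() { instantiation_registry().insert(typeid(T).name()); }
    };
    static registrar instance;

    // Naming `instance` as a template argument uses it, which is assumed
    // to make the compiler instantiate its definition below.
    typedef force_use<registrar&, instance> forced;
};

template <typename T>
typename meta_counter<T>::registrar meta_counter<T>::instance;

template <typename T>
struct my_template {
    // The line a script would insert into each template of interest.
    typedef typename meta_counter<my_template>::forced counted;
    T value;
};

// Prints one line per registered specialization, then the total.
inline void report_instantiations(std::ostream& out) {
    const std::set<std::string>& reg = instantiation_registry();
    for (std::set<std::string>::const_iterator it = reg.begin();
         it != reg.end(); ++it) {
        out << *it << '\n';
    }
    out << reg.size() << " distinct instantiations\n";
}
```

Calling report_instantiations(std::cout) from main, in a program that
uses my_template<int> and my_template<double>, should list two entries.
The names come from typeid and are implementation-defined (mangled on
some compilers), so they may need to be run through a demangler.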

I have no idea how to fully implement this right now, but perhaps Steven
can run with the idea.

Thanks,
Luke

