
Boost Users :

From: Michael Marcin (mmarcin_at_[hidden])
Date: 2007-06-29 16:24:47


Zeljko Vrba wrote:
> On Fri, Jun 29, 2007 at 02:05:59AM -0500, Michael Marcin wrote:
>> There is a senior engineer that I work with who believes templates are
>> slow and prefers to write C or assembly routines.
>>
> What does he base his belief on? And did he provide *any* proof for his
> reasoning? (Well, if he's in a higher position than you, he might not
> be required to do so. People listen to him because he's in higher
> position, not because he has good arguments. Been there, experienced that.)
>

Apparently from looking at the assembly generated from template code in
the past, with old compilers and probably by bad programmers.

>> Write some interesting code and generate the assembly for it.
>> Analyze this assembly manually and save it off in source control.
>> When the test suite is run compile that code down to assembly again and
>> have the test suite do a simple byte comparison of the two files.
>>
> I don't understand this part. What do you want to compare? Macro vs.
> template version? This will certainly *not* yield identical object file
> (because it contains symbol names, etc. along with generated code).
>

Yes, this is a little confusing. Essentially the idea was to write
snippets both in C and with templates and compare the generated assembly
manually, by looking at it. Then, once I'm satisfied with the results, the
regenerate-and-compare tests would hopefully fail only when a meaningful
change was made to the library code, at which point I would have to
re-examine the files by hand again. A lot of work... especially when
multiple configurations come into play.
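
For what it's worth, the comparison step itself would be trivial; a rough
sketch of the kind of checker I have in mind (the file names and the
compare_asm name are just placeholders, not anything that exists yet)
could be:

#include <fstream>
#include <iostream>
#include <iterator>
#include <vector>

// Read a whole file into memory as raw bytes.
static std::vector<char> read_file( const char* path )
{
     std::ifstream in( path, std::ios::binary );
     return std::vector<char>( std::istreambuf_iterator<char>( in ),
                               std::istreambuf_iterator<char>() );
}

// Return 0 if the regenerated assembly matches the blessed copy
// kept in source control, non-zero otherwise.
int main( int argc, char* argv[] )
{
     if( argc != 3 )
     {
          std::cerr << "usage: compare_asm blessed.s regenerated.s\n";
          return 2;
     }
     if( read_file( argv[1] ) != read_file( argv[2] ) )
     {
          std::cerr << argv[2] << " differs from " << argv[1] << "\n";
          return 1;
     }
     return 0;
}

The hard part is still the initial hand inspection of each listing, not
the byte comparison afterwards.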

>> Write the templated and C versions of the algorithms.
>> Run the test suite to generate the assembly for each version.
>> Write a parser and a heuristic to analyze the generated code of each.
>> Grade and compare each.
>>
> Unfortunately, it's almost meaningless to analyze the run-time performance of a
> program (beyond algorithmic complexity) without the actual input. "Register
> usage" is a vague term, and the number of function calls does not have to
> play a role (infrequent code paths, large functions, cache effects, etc).
>

Whether it matters or not is another question, but you can look at the
generated code and determine whether the compiler is doing a good job.

For instance, say I have:

class my_type
{
public:
     int value() const { return m_value; }
private:
     int m_value;
};

bool operator==( const my_type& lhs, const my_type& rhs )
{
     return lhs.value() == rhs.value();
}

bool test_1( my_type a, my_type b )
{
     return a == b;
}

bool test_2( int a, int b )
{
     return a == b;
}

Now, if test_1 ends up calling a function for operator== or does any
pushes onto the stack, it's not optimal, and my_type and/or its operator==
need to be fiddled with.

It's this level of straightforward code I'm concerned with at the moment.
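
As a crude example of the kind of heuristic I mean, a first pass could
simply count the calls and stack pushes in each generated listing. A
rough sketch (assuming x86-style mnemonics; the strings would need
adjusting for the actual target):

#include <fstream>
#include <iostream>
#include <string>

// Crude grade for one assembly listing: count instructions that suggest
// the compiler failed to inline or keep values in registers.
int main( int argc, char* argv[] )
{
     if( argc != 2 )
     {
          std::cerr << "usage: grade_asm listing.s\n";
          return 2;
     }

     std::ifstream in( argv[1] );
     std::string line;
     int calls = 0, pushes = 0;
     while( std::getline( in, line ) )
     {
          if( line.find( "call" ) != std::string::npos )
               ++calls;
          if( line.find( "push" ) != std::string::npos )
               ++pushes;
     }

     std::cout << argv[1] << ": " << calls << " calls, "
               << pushes << " pushes\n";

     // For code like test_1 vs test_2 above, both counts should be
     // the same (ideally zero) if my_type has no abstraction cost.
     return 0;
}

Obviously that's only a starting point, but it would catch the case above
where test_1 picks up a call that test_2 doesn't.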

>> Does anyone have any input/ideas/suggestions?
>>
> How about traditional profiling? Write a test-suite that feeds the same
> input to C and C++ versions and compares their run-time? Compile once
> with optimizations, other time with profiling and compare run-times and
> hot-spots shown by the profiler.

As I said before, there is no reliable timing mechanism available, and the
process of compiling, installing, and running programs on this target
cannot be automated, AFAIK.

Thanks,

Michael Marcin

