From: Joerg Walter (jhr.walter_at_[hidden])
Date: 2002-06-25 14:03:46


----- Original Message -----
From: "Benedikt Weber" <weber_at_[hidden]>
Newsgroups: gmane.comp.lib.boost.devel
To: <boost_at_[hidden]>
Sent: Tuesday, June 25, 2002 1:23 PM
Subject: [boost] uBLAS performance problem with CodeWarrior 8

> I compiled uBLAS with Metrowerks CodeWarrior 8 on Windows 2000. Once I got
> the include paths right (avoiding exception.h and functional.h from MSL
> being included instead of the ones from uBLAS), all tests compiled without
> errors. (Well, the new ISO template parser found a lot of missing
> typename's in the testing code and some undefined matrices called "m3"
> [should be "m"] which are not actually instantiated.)

Would you please be so kind as to send us your changes? We don't have
access to any CodeWarrior version, and we are always interested in improving
the conformance of uBLAS ;-)

> Unfortunately, performance is unexpectedly low. Just taking the output from
> bench1 (scale 10, Pentium 3, 933 MHz), 100x100 double matrix multiply (last
> lines of output):
>
> CW8:
> prod (matrix, matrix)
> C array
> elapsed: 0.14 s, 406.674 Mflops
> c_matrix safe
> elapsed: 3.435 s, 16.5748 Mflops
> c_matrix fast
> elapsed: 3.405 s, 16.7208 Mflops
> matrix<unbounded_array> safe
> elapsed: 6.509 s, 8.74702 Mflops
> matrix<unbounded_array> fast
> elapsed: 6.58 s, 8.65264 Mflops
> matrix<std::vector> safe
> elapsed: 12.197 s, 4.6679 Mflops
> matrix<std::vector> fast
> elapsed: 12.168 s, 4.67902 Mflops

Please check whether the preprocessor symbol NDEBUG is defined. NDEBUG in
turn defines NUMERICS_USE_ET, which enables uBLAS release mode. uBLAS
distinguishes between debug mode (size conformance checks enabled, expression
templates disabled) and release mode (size conformance checks disabled,
expression templates enabled).
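
A minimal sketch of such a check, assuming only the rule stated above (NDEBUG implies NUMERICS_USE_ET). The helper names are hypothetical and are not part of uBLAS, and the real configuration header may do more than this:

```cpp
#include <cassert>
#include <cstddef>

// Assumption: mirrors the rule described above (NDEBUG implies
// NUMERICS_USE_ET). The actual uBLAS configuration header may differ.
#if defined(NDEBUG) && !defined(NUMERICS_USE_ET)
#  define NUMERICS_USE_ET
#endif

// Hypothetical helper: true if expression templates would be enabled.
inline bool expression_templates_enabled() {
#ifdef NUMERICS_USE_ET
    return true;
#else
    return false;
#endif
}

// Hypothetical helper: in this sketch, size conformance checks are active
// exactly when expression templates are not.
inline bool size_checks_enabled() {
    return !expression_templates_enabled();
}

// Hypothetical helper: names the mode this translation unit was built in.
inline const char* build_mode() {
    return expression_templates_enabled() ? "release" : "debug";
}

// Hypothetical size conformance check. assert() comes from <cassert> and
// expands to nothing when NDEBUG is defined.
inline std::size_t same_size(std::size_t a, std::size_t b) {
    assert(a == b);
    return a;
}
```

Printing build_mode() from a file compiled with the same settings as bench1 would show which mode is actually in effect for that target.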

> compare this with MSVC6:
> bench_3
> prod (matrix, matrix)
> C array
> elapsed: 0.311 s, 183.069 Mflops
> c_matrix safe
> elapsed: 0.19 s, 299.655 Mflops
> c_matrix fast
> elapsed: 0.17 s, 334.908 Mflops
> matrix<unbounded_array> safe
> elapsed: 0.171 s, 332.949 Mflops
> matrix<unbounded_array> fast
> elapsed: 0.17 s, 334.908 Mflops
> matrix<std::vector> safe
> elapsed: 0.21 s, 271.116 Mflops
> matrix<std::vector> fast
> elapsed: 0.171 s, 332.949 Mflops
>
> I tried several settings (inline depth of 8, no auto inline, target
> processor Pentium 3), all with basically the same performance. I have no
> idea where to look next. I don't have any experience with this version of
> the Metrowerks compiler since it just came out, and I did not do any timing
> with older versions either. I just wonder what the performance is with
> other compilers. It probably depends much on the ability to inline code.

Correct.
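
To illustrate why inlining matters, here is a rough sketch (tiny_matrix is a hypothetical class, not the uBLAS matrix): the accessor-based product makes three operator() calls per inner-loop step. It can only approach the raw-array loop if the compiler inlines all of them.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal row-major dense matrix; not the uBLAS matrix class.
class tiny_matrix {
public:
    tiny_matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

    // Element access: one small function call per element unless inlined.
    double& operator()(std::size_t i, std::size_t j) {
        return data_[i * cols_ + j];
    }
    double operator()(std::size_t i, std::size_t j) const {
        return data_[i * cols_ + j];
    }

private:
    std::size_t rows_, cols_;
    std::vector<double> data_;
};

// Product written against the accessor interface.
inline tiny_matrix prod_accessor(const tiny_matrix& a, const tiny_matrix& b) {
    tiny_matrix c(a.rows(), b.cols());
    for (std::size_t i = 0; i < a.rows(); ++i)
        for (std::size_t j = 0; j < b.cols(); ++j)
            for (std::size_t k = 0; k < a.cols(); ++k)
                c(i, j) += a(i, k) * b(k, j);
    return c;
}

// The same product on raw row-major arrays (n x m times m x p),
// comparable in spirit to the "C array" baseline in the benchmark.
inline void prod_raw(const double* a, const double* b, double* c,
                     std::size_t n, std::size_t m, std::size_t p) {
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < p; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < m; ++k)
                sum += a[i * m + k] * b[k * p + j];
            c[i * p + j] = sum;
        }
}
```

Both functions compute the same result, so any timing difference between them comes from how well the compiler removes the abstraction layer.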

Regards

Joerg

