Boost Users :

From: Eloi Gaudry (eloi.gaudry_at_[hidden])
Date: 2008-03-18 15:35:33


Hi there,

I guess this topic isn't very popular, since few people in the community
(want to) use this compiler, but I think the detailed numbers below
(from the ublas/bench2 tests) are quite interesting for those who are
willing to use AIX/Power5 platforms.

Here is a brief overview of the benches we ran (the output below comes
from the ublas/bench2 tests and includes the prod(matrix, vector) benches):
A/ were performed on an IBM p505 server (one dual-core p5+ processor
running at 1.9GHz) with either xlC (v9) or g++ (v4.2);
B/ were performed on a 64-bit Linux platform (one dual-core Core2
processor running at 2.4GHz) with g++ (v4.2).

Briefly, and especially for coordinate_matrix and compressed_matrix:
- the GCC 64-bit Linux platform outperforms the p5 server platform
running AIX, whether the AIX binaries are built with g++ or xlC;
- on AIX, the GCC binaries are much faster than the VisualAge (xlC) ones.

A/1) xlC using the following command line: "xlC_r -v -qphsinfo
-qalign=full -qstrict -qinline -qeh -qrtti -O2 -qxflag=erratadce -q64
-qarch=pwr5 -qtune=pwr5 -qmaxmem=-1 -qtemplateregistry -DFC_LINK
-DBOOST_UBLAS_ENABLE_PROXY_SHORTCUTS -DBOOST_LIB_DIAGNOSTIC
-DBOOST_DISABLE_THREADS -DBOOST_ALL_NO_LIB -DNDEBUG -q64"

bench_2
outer_prod
C array
elapsed: 0.009996 s, 515.19 Mflops
compressed_matrix, compressed_vector safe
elapsed: 1.22936 s, 4.18904 Mflops
compressed_matrix, compressed_vector fast
elapsed: 0.959484 s, 5.3673 Mflops
coordinate_matrix, coordinate_vector safe
elapsed: 1.34932 s, 3.81663 Mflops
coordinate_matrix, coordinate_vector fast
elapsed: 1.03945 s, 4.9544 Mflops
prod (matrix, vector)
C array
elapsed: 0 s, INF Mflops
compressed_matrix, compressed_vector safe
elapsed: 0.609609 s, 7.03981 Mflops
compressed_matrix, compressed_vector fast
elapsed: 0.419836 s, 10.2219 Mflops
coordinate_matrix, coordinate_vector safe
elapsed: 1.00945 s, 4.25137 Mflops
coordinate_matrix, coordinate_vector fast
elapsed: 0.799695 s, 5.36646 Mflops
matrix + matrix
C array
elapsed: 0 s, INF Mflops
compressed_matrix safe
elapsed: 1.96893 s, 2.61556 Mflops
compressed_matrix fast
elapsed: 1.64907 s, 3.12287 Mflops
coordinate_matrix safe
elapsed: 3.4183 s, 1.50655 Mflops
coordinate_matrix fast
elapsed: 3.06852 s, 1.67828 Mflops

A/2) g++ using the following command line: "g++ -pthread -maix64 -Wall
-Wno-inline -ftemplate-depth-100 -finline-functions -O1 -DAdd_
-DMALLOC_RET_VOID=1 -DUSE_STDARG=1 -DHAVE_STDARG_H=1 -DHAVE_UNISTD_H=1
-DHAVE_STRING_H=1 -DHAVE_STDLIB_H=1 -DUSE_STDARG -DNOMINMAX
-DBOOST_UBLAS_ENABLE_PROXY_SHORTCUTS -DBOOST_LIB_DIAGNOSTIC
-DBOOST_DISABLE_THREADS -DBOOST_ALL_NO_LIB -DNDEBUG -mcpu=power5"

bench_2
outer_prod
C array
elapsed: 0.009887 s, 520.87 Mflops
compressed_matrix, compressed_vector safe
elapsed: 0.58977 s, 8.73195 Mflops
compressed_matrix, compressed_vector fast
elapsed: 0.289889 s, 17.7649 Mflops
coordinate_matrix, coordinate_vector safe
elapsed: 0.509802 s, 10.1016 Mflops
coordinate_matrix, coordinate_vector fast
elapsed: 0.209922 s, 24.5322 Mflops
prod (matrix, vector)
C array
elapsed: 0.009996 s, 429.325 Mflops
compressed_matrix, compressed_vector safe
elapsed: 0.279718 s, 15.3424 Mflops
compressed_matrix, compressed_vector fast
elapsed: 0.129946 s, 33.0255 Mflops
coordinate_matrix, coordinate_vector safe
elapsed: 0.439832 s, 9.75721 Mflops
coordinate_matrix, coordinate_vector fast
elapsed: 0.279893 s, 15.3328 Mflops
matrix + matrix
C array
elapsed: 0.009998 s, 515.087 Mflops
compressed_matrix safe
elapsed: 1.22945 s, 4.18874 Mflops
compressed_matrix fast
elapsed: 0.909645 s, 5.66137 Mflops
coordinate_matrix safe
elapsed: 1.5694 s, 3.28142 Mflops
coordinate_matrix fast
elapsed: 1.26953 s, 4.0565 Mflops
 

B/ g++ using the following command line: "g++ -pthread -Wall -Wno-inline
-ftemplate-depth-100 -finline-functions -O1 -DAdd_ -DMALLOC_RET_VOID=1
-DUSE_STDARG=1 -DHAVE_STDARG_H=1 -DHAVE_UNISTD_H=1 -DHAVE_STRING_H=1
-DHAVE_STDLIB_H=1 -DUSE_STDARG -DNOMINMAX
-DBOOST_UBLAS_ENABLE_PROXY_SHORTCUTS -DBOOST_LIB_DIAGNOSTIC
-DBOOST_DISABLE_THREADS -DBOOST_ALL_NO_LIB -DNDEBUG"

bench_2
outer_prod
C array
elapsed: 0 s, inf Mflops
compressed_matrix, compressed_vector safe
elapsed: 0.19 s, 27.1044 Mflops
compressed_matrix, compressed_vector fast
elapsed: 0.15 s, 34.3323 Mflops
coordinate_matrix, coordinate_vector safe
elapsed: 0.17 s, 30.2932 Mflops
coordinate_matrix, coordinate_vector fast
elapsed: 0.12 s, 42.9153 Mflops
prod (matrix, vector)
C array
elapsed: 0.01 s, 429.153 Mflops
compressed_matrix, compressed_vector safe
elapsed: 0.09 s, 47.6837 Mflops
compressed_matrix, compressed_vector fast
elapsed: 0.07 s, 61.3076 Mflops
coordinate_matrix, coordinate_vector safe
elapsed: 0.17 s, 25.2443 Mflops
coordinate_matrix, coordinate_vector fast
elapsed: 0.14 s, 30.6538 Mflops
matrix + matrix
C array
elapsed: 0 s, inf Mflops
compressed_matrix safe
elapsed: 0.53 s, 9.71668 Mflops
compressed_matrix fast
elapsed: 0.48 s, 10.7288 Mflops
coordinate_matrix safe
elapsed: 0.71 s, 7.2533 Mflops
coordinate_matrix fast
elapsed: 0.65 s, 7.92283 Mflops

I'd appreciate any feedback on this topic, or any explanation for these
results.
Thanks,

Eloi

Eloi Gaudry wrote:
> I'd like to get your point of view on the following topic: runtime
> performance of binaries built against the IBM VisualAge c++ compilers.
>
> I've been recently investigating (with a colleague) various performance
> matters on a subset of platforms: Power 4 and Power 5 architecture
> running on different flavors of the AIX operating system, using binaries
> built with VisualAge C/C++ . We built and ran the boost::uBlas
> regression tests using either the VisualAge C/C++ v.9 or the GCC 4.0
> compilers. Using similar levels of optimization, we observed that the
> GCC binaries were running much faster. The same tests were performed on
> a Linux 64-bits machine using a standard Core2 processor and GCC-4.2 and
> performance was again much much higher.
>
> Are we missing something here, something that could allow us to get a
> decent level of performance with the IBM compilers ?
>
> Thanks,
> Eloi
>
>
>
>

-- 
Eloi Gaudry
Free Field Technologies
Axis Park Louvain-la-Neuve
Rue Emile Francqui, 1
B-1435 Mont-Saint Guibert
BELGIUM
Company Phone: +32 10 487 959
Company Fax:   +32 10 454 626
