From: Dale Smith (DSmith_at_[hidden])
Date: 2007-03-05 15:15:35
This may be an alignment problem. I think there are specific -f/m
options that may govern this. Try -O3 with
-mpreferred-stack-boundary=???. There may be others, such as those
described in this excerpt:
In the second optimization level, we saw that a number of alignment
optimizations were introduced that had the effect of increasing
performance but also increasing the size of the resulting image. Three
additional alignment optimizations specific to this architecture are
available. The -malign-int option allows types to be aligned on 32-bit
boundaries. If you're running on a 16-bit aligned target, -mno-align-int
can be used. The -malign-double option controls whether doubles, long
doubles and long longs are aligned on two-word boundaries (disabled with
-mno-align-double). Aligning doubles provides better performance on
Pentium architectures at the expense of additional memory.
Stacks also can be aligned by using the option
-mpreferred-stack-boundary. The developer specifies a power of two for
alignment. For example, if the developer specified
-mpreferred-stack-boundary=4, the stack would be aligned on a 16-byte
boundary (the default). On the Pentium and Pentium Pro targets, stack
doubles should be aligned on 8-byte boundaries, but the Pentium III
performs better with 16-byte alignment.
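To make the power-of-two rule concrete, here is a rough C++ sketch of how the flag value maps to a byte boundary and how one might check a given address against it. The helper names boundary_bytes and is_aligned are made up for illustration; they are not part of GCC, ublas, or lapack, and this only tests the alignment hypothesis rather than confirming it as the cause of the crash.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: byte boundary implied by
// -mpreferred-stack-boundary=n, i.e. 2 raised to the power n.
constexpr std::size_t boundary_bytes(unsigned n) {
    return static_cast<std::size_t>(1) << n;
}

// Hypothetical helper: true when address p is a multiple of `bytes`.
// `bytes` must be a nonzero power of two.
inline bool is_aligned(const void* p, std::size_t bytes) {
    assert(bytes != 0 && (bytes & (bytes - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (bytes - 1)) == 0;
}
```

For example, boundary_bytes(4) evaluates to 16, which matches the 16-byte default described above. Calling is_aligned(&x, 8) on a local double x in both the -O2 and the -O3 build is one cheap way to see whether stack alignment actually differs between them.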
You may be better off with -O2 and some additional -f/m options.
Experimentation is the key.
[mailto:ublas-bounces_at_[hidden]] On Behalf Of Preben Hagh Strunge
Sent: Monday, March 05, 2007 3:03 PM
To: ublas mailing list
Subject: [ublas] Processor optimization and ublas/lapack
Now that I am doing optimizations, I wonder if this behavior is normal:
When compiling with -O2 or debug flags, the program runs smoothly. There
are no problems at all and everything seems to be OK.
But when compiling with -O3, the program segfaults. Is there an error
somewhere in my program, or is this behavior normal?
Experienced programmers know that -O3 sometimes breaks something, but
I have never seen any explanation of what could happen!
Which optimization flags are recommended for use with ublas and lapack?
I have a Pentium M (1.6 GHz) laptop.