gcc compiler optimisation flags??

Posted: Fri Jun 27, 2014 4:01 pm
by redhawk
I've written a small program that does a lot of number crunching, and this pushes the CPU load to approximately 46% according to top.
So I tried experimenting with the compiler optimisation flags, i.e. -O1 and -O2, and this does appear to lower the CPU usage.
Then I tried something else (info I found online): -O2 -falign-functions=16 -falign-loops=16, and the CPU load is down to 18%.

Now I'm curious: are optimisation flags universally beneficial on all platforms, or are some options better for ARMv6 but not for, say, Intel??

Is there a magic formula for generating the fastest code execution on the ARMv6 processor??

Richard S.

Re: gcc compiler optimisation flags??

Posted: Fri Jun 27, 2014 4:37 pm
by joan
https://gcc.gnu.org/onlinedocs/gcc/Opti ... ze-Options

I don't know what the machine default is for those align options.

Re: gcc compiler optimisation flags??

Posted: Fri Jun 27, 2014 5:58 pm
by riklaunim
Safe flags are something like -march=native -O2 -pipe, if "native" is supported. Sometimes it may be better to use -Os instead of -O2 for a smaller binary. There's a bit more at http://wiki.gentoo.org/wiki/Safe_CFLAGS

Using exotic or very "extreme" flags may give an unstable binary, or may fail to compile code that compiles fine with the default flags.

Re: gcc compiler optimisation flags??

Posted: Mon Jun 30, 2014 8:47 am
by gordon@drogon.net
Experiment with -O2, -O3 and -Os flags. I'm not sure you'll get much better than that.

However - if your program is doing a lot of number crunching, why is it only running at 46% CPU? It really ought to be running at 100% CPU usage, regardless of the flags you use. 46% CPU means your program is only using half the potential CPU cycles available to it - what else is your Pi doing?
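One way to compare flag sets without depending on what else the Pi is doing is to time a fixed workload with clock(), which reports CPU time used by the process, rather than reading a percentage off top. The sketch below is only an illustration: mac_kernel and time_kernel are hypothetical names, and the multiply-accumulate loop is a stand-in for the real number crunching, not the original poster's code.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the real workload: scale each sample
   by `gain` and accumulate the result. */
float mac_kernel(const float *in, size_t n, float gain)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += in[i] * gain;
    return acc;
}

/* Run the kernel `reps` times and return the CPU seconds consumed.
   The gain changes on every repetition so the compiler cannot hoist
   the call out of the loop, and the total is written to *sink so the
   work cannot be optimised away. */
double time_kernel(const float *in, size_t n, int reps, float *sink)
{
    assert(reps > 0);

    float total = 0.0f;
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        total += mac_kernel(in, n, 0.5f + (float)r);
    clock_t end = clock();

    *sink = total;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Building the same source once per flag set (for example -O2, -O3 and -Os) and calling time_kernel with a large buffer and repetition count from main gives figures that can be compared directly. Several runs per build are advisable, since a single measurement can be noisy.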

-Gordon

Re: gcc compiler optimisation flags??

Posted: Mon Jun 30, 2014 11:27 am
by redhawk
My program is an audio DSP: it captures, processes and plays back audio, and then waits for the capture buffer to fill up again, which is why CPU usage never gets to 100%.
Internally there are several circular buffers, loop counters and callable functions, which in the end all add up to CPU load.
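For readers unfamiliar with the term, a circular buffer of the kind mentioned above might look roughly like the sketch below. This is not the program's actual code: ring_t, ring_push, ring_pop and the tiny RING_SIZE are all hypothetical, chosen only to show where the per-sample index arithmetic comes from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capacity, kept tiny for illustration. A power of two
   lets the index wrap with a cheap bit mask instead of a division. */
#define RING_SIZE 8u

typedef struct {
    int16_t data[RING_SIZE];
    size_t head;   /* next position to write */
    size_t tail;   /* next position to read */
    size_t count;  /* samples currently stored */
} ring_t;

void ring_init(ring_t *r)
{
    assert(r != NULL);
    r->head = r->tail = r->count = 0;
}

/* Store one sample; returns 1 on success, 0 if the buffer is full. */
int ring_push(ring_t *r, int16_t sample)
{
    if (r->count == RING_SIZE)
        return 0;
    r->data[r->head] = sample;
    r->head = (r->head + 1) & (RING_SIZE - 1);
    r->count++;
    return 1;
}

/* Fetch the oldest sample; returns 1 on success, 0 if empty. */
int ring_pop(ring_t *r, int16_t *out)
{
    if (r->count == 0)
        return 0;
    *out = r->data[r->tail];
    r->tail = (r->tail + 1) & (RING_SIZE - 1);
    r->count--;
    return 1;
}
```

Small functions like these are called once per sample, so they are exactly the sort of code whose cost depends on how well the compiler inlines and schedules them.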
Anything to reduce the overall CPU usage would be beneficial to my program, so I could add more DSP functions while avoiding playback skipping or affecting other running services.
By applying the compiler optimisation flags -falign-functions=16 -falign-loops=16 I've found the CPU load was considerably lower, even more so than with -O1 or -O2 alone.
I'm not quite sure why, though, since I'm not familiar with ARM code, but I suspect it is working double time on misaligned data addresses, a bit like the MOS6502 adding cycles when memory is accessed across page boundaries.
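It may be worth noting that, per the GCC documentation, -falign-functions and -falign-loops control where the start of functions and loops are placed in the code, not where data lives. A rough way to check whether the flag had any effect on a given function is to look at its entry address, as in the sketch below. The names process_block, align_offset and report_alignment are hypothetical, and converting a function pointer to an integer is implementation-defined, so this assumes GCC's usual behaviour.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for one of the DSP routines. */
void process_block(void)
{
}

/* Offset of a function's entry address from the previous
   `boundary`-byte boundary; 0 means the function starts exactly on
   the boundary. Note: on ARM builds that target Thumb, bit 0 of a
   function pointer flags the instruction set, so the value can read
   one higher than the real offset. */
unsigned align_offset(void (*fn)(void), unsigned boundary)
{
    assert(boundary != 0);
    return (unsigned)((uintptr_t)fn % boundary);
}

/* Print the offset of process_block within a 16-byte block. */
void report_alignment(void)
{
    printf("process_block offset within 16 bytes: %u\n",
           align_offset(process_block, 16));
}
```

Calling report_alignment from a build with and without -falign-functions=16 shows whether the function's placement actually changed. It says nothing on its own about why the load dropped, which would need profiling to confirm.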

Richard S.