jorisvergeer
Posts: 6
Joined: Wed Jul 18, 2012 3:33 pm

Compiler options

Wed Jul 18, 2012 3:49 pm

The Raspbian FAQ states that the following flags should be used:

Code: Select all

-march=armv6
-mfpu=vfp
-mfloat-abi=hard
Shouldn't they be these for the best optimization?

Code: Select all

-march=armv6zk
-mcpu=arm1176jzf-s
-mtune=arm1176jzf-s
-mfpu=vfp
-mfloat-abi=hard
I am no expert but this looks like it gives gcc some more specific information about the CPU.

Do those flags make a difference? Did I overlook something, or is further optimization possible?

I was curious why the FAQ states those generic flags while OpenElec.tv uses more specific flags when they are both optimized for the RPi.

Sources:
http://en.wikipedia.org/wiki/Raspberry_Pi (right bar with CPU)
http://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html
OpenElec.tv build system
http://www.raspbian.org/RaspbianFAQ

blackarchon
Posts: 20
Joined: Sat Jun 23, 2012 5:39 pm

Re: Compiler options

Wed Jul 18, 2012 7:40 pm

If I knew how to compile the kernel and all the other packages with these compiler switches, I would try it, just out of curiosity.

mpthompson
Posts: 620
Joined: Fri Feb 03, 2012 7:18 pm
Location: San Carlos, CA

Re: Compiler options

Wed Jul 18, 2012 10:30 pm

jorisvergeer wrote:Shouldn't they be these for the best optimization?

Code: Select all

-march=armv6zk
-mcpu=arm1176jzf-s
-mtune=arm1176jzf-s
-mfpu=vfp
-mfloat-abi=hard
I am no expert but this looks like it gives gcc some more specific information about the CPU.
The options for Raspbian were chosen very early in the project, when it wasn't clear we would be able to successfully rebuild the Debian armhf packages for the armv6 processor. For better or worse, the options stuck at the more generic armv6 setting. We discussed changing them, but the consensus between plugwash and myself is that it's better to lose some optimizations and keep the Raspbian code generic enough for other armv6 devices that may come out. The theory is that while narrowing the set of CPUs Raspbian supports is easy, broadening the CPU requirements later to target other armv6 CPUs would be virtually impossible if all code had been compiled specifically for the arm1176jzf-s.

If someone could make a good case (with proof) that significant performance gains could be made in specific packages by more narrowly specifying the arm1176jzf-s device, this decision could be revisited.

blackarchon
Posts: 20
Joined: Sat Jun 23, 2012 5:39 pm

Re: Compiler options

Thu Jul 19, 2012 8:19 am

mpthompson,

I'm not quite sure if you have posted this information before, but is there a guide/howto on how you started with raspbian? What were your first steps? If someone (I would definitely try it) could recompile all of the raspbian source packages with these more optimized gcc parameters, we would have an apples to apples comparison.

jorisvergeer
Posts: 6
Joined: Wed Jul 18, 2012 3:33 pm

Re: Compiler options

Thu Jul 19, 2012 3:32 pm

I did some research:

I compiled FFTW 3.2.2 with both the Raspbian compiler on the Raspberry Pi and my optimized toolchain.
It comes with a small benchmarking tool, which I statically linked with all dependencies (including libc and libm).

So I got one benchmarking tool built with the original compiler/libc/libm and one built with the optimized compiler and libraries.

I ran both executables 9 times using:

Code: Select all

time ./bench -s 320x640
Every run takes about 9 seconds.

I got the following average times:

Code: Select all

        Raspbian    Optimized
Real    8.9635 s    8.8972 s
User    8.8955 s    8.8244 s
Sys     0.0377 s    0.04333 s
So the battle between Raspbian's GCC 4.6.3 and my GCC 4.7.1 ends in a tie, with an insignificant speed improvement of about 0.8% (in my favor :P)
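For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the relative improvement from the averages in the table. The function name `improvement_percent` is made up for this illustration; only the timings come from the post.

```python
# Average timings from the table above, in seconds: (Raspbian build, optimized build).
timings = {
    "real": (8.9635, 8.8972),
    "user": (8.8955, 8.8244),
    "sys": (0.0377, 0.04333),
}


def improvement_percent(baseline: float, optimized: float) -> float:
    """Return how much faster `optimized` is than `baseline`, as a percentage.

    A negative result means the optimized build was slower.
    """
    return (baseline - optimized) / baseline * 100.0


for name, (raspbian, optimized) in timings.items():
    print(f"{name}: {improvement_percent(raspbian, optimized):+.2f}%")
```

The user time works out to roughly 0.8% and the wall-clock time to roughly 0.7%. The sys time is only a few hundredths of a second in both builds, so its percentage is mostly noise.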

Maybe not a really good benchmark, but I don't think a huge speed improvement can be made. I don't know which GCC version is used to build the binaries, but it still might be an idea to use the latest GCC toolchain with optimization options. In the best case it may improve speed by 1-2%. The binaries are compatible as long as the same eglibc version is used and the kernel version of the toolchain is not newer than the kernel in use (correct me if I'm wrong).

Btw. I used crosstool-ng to configure my toolchain. It seems to work fine for me.

blackarchon
Posts: 20
Joined: Sat Jun 23, 2012 5:39 pm

Re: Compiler options

Thu Jul 19, 2012 8:32 pm

The Raspbian kernel seems to be built with GCC 4.5.x, which is quite old. Maybe a build with 4.7.1 would bring some improvements.

mpthompson
Posts: 620
Joined: Fri Feb 03, 2012 7:18 pm
Location: San Carlos, CA

Re: Compiler options

Thu Jul 19, 2012 8:37 pm

jorisvergeer wrote:Which concludes that the battle between Raspbians GCC4.6.3 and my GCC4.7.1 end in a tie with an insignificant speed improvement of 0.8% (in my favor :P)
Thanks for looking deeper into this. Given the huge effort required to rebuild what is already built for Raspbian, I don't see us making any changes. Perhaps plugwash will see this thread and weigh in as well.

mpthompson
Posts: 620
Joined: Fri Feb 03, 2012 7:18 pm
Location: San Carlos, CA

Re: Compiler options

Thu Jul 19, 2012 8:44 pm

blackarchon wrote:I'm not quite sure if you have posted this information before, but is there a guide/howto on how you started with raspbian? What were your first steps? If someone (I would definitely try it) could recompile all of the raspbian source packages with these more optimized gcc parameters, we would have an apples to apples comparison.
There are links that pretty well cover the entire history of Raspbian on our About wiki page: http://www.raspbian.org/RaspbianAbout

An entire recompile would be a non-trivial task, but anyone is welcome to our source code to try it. It's all in the Raspbian repository.

madtom1999
Posts: 41
Joined: Mon Jul 29, 2013 4:37 pm

Re: Compiler options

Fri Nov 11, 2016 9:22 am

Is there a list anywhere of (Raspbian) compiler switches suitable for maximum performance, or of the possible optimisations for each version of the Pi? It would be really helpful for a lot of us for getting other packages working.

jahboater
Posts: 1871
Joined: Wed Feb 04, 2015 6:38 pm

Re: Compiler options

Fri Nov 11, 2016 11:35 am

madtom1999 wrote:Is there anywhere there is a list of (raspbian) compiler switches suitable for maximum performance or possible optimisations for each version of the pi? These would be really helpful for a lot of us for getting other packages working.
This is what I use:

Pi Zero (armv6)
-mcpu=arm1176jzf-s -mfpu=vfp

Pi2 (armv7)
-mcpu=cortex-a7 -mfpu=neon-vfpv4 -mneon-for-64bits
(add -mthumb for a much smaller executable).

Pi3 and Pi2 V1.2 (armv8)
-mcpu=cortex-a53 -mfpu=neon-fp-armv8 -mneon-for-64bits
(change to "-march=armv8-a+crc -mfpu=neon-fp-armv8 -mtune=cortex-a53" if you need the crc extensions).

Pi3 (armv8 in 64-bit mode)
-mcpu=cortex-a53

The neon-for-64bits flag "encourages" the compiler to make more use of NEON for 64-bit integer arithmetic. It will use NEON anyway for hard things like 64-bit shifts, and also if it happens to have the value already in a NEON register.

There are no fpu flags allowed for aarch64 because the presence of NEON is guaranteed (and obviously "neon-for-64bits" is pointless!). NEON in 64-bit mode is fully IEEE 754 compliant.

Armv6 only supports thumb1 (16-bit instructions only) which is too limited to be useful. Armv7 supports thumb2 which is both 16 and 32 bits and is complete (some ARM processors only accept thumb2). Armv8 partially deprecates thumb2, so I avoid it. Aarch64 instructions are always 32 bits.

There is no need for -march or -mtune.

A good optimization is to use a recent version of GCC, because much work has been done for ARM recently. The current version is 6.2.

Perhaps something like this should be in a sticky?
Last edited by jahboater on Wed Nov 16, 2016 6:45 pm, edited 3 times in total.

madtom1999
Posts: 41
Joined: Mon Jul 29, 2013 4:37 pm

Re: Compiler options

Sat Nov 12, 2016 8:54 am

Thanks for those. Is the pi0 the zero or an earlier pi?
Can we get something from the board itself to tell us what it is to match to the switches?
Last edited by madtom1999 on Sat Nov 12, 2016 9:08 am, edited 1 time in total.

rpdom
Posts: 11705
Joined: Sun May 06, 2012 5:17 am
Location: Essex, UK

Re: Compiler options

Sat Nov 12, 2016 9:05 am

madtom1999 wrote:Thanks for those. For clarity what does the zero 'use'?
jahboater wrote:Pi0 (armv6)
-mcpu=arm1176jzf-s -mfpu=vfp
or were you asking about something different?

madtom1999
Posts: 41
Joined: Mon Jul 29, 2013 4:37 pm

Re: Compiler options

Sat Nov 12, 2016 9:15 am

I updated my post as you answered it!
Thanks

jahboater
Posts: 1871
Joined: Wed Feb 04, 2015 6:38 pm

Re: Compiler options

Sat Nov 12, 2016 9:19 am

rpdom wrote:
madtom1999 wrote:Thanks for those. For clarity what does the zero 'use'?
jahboater wrote:Pi0 (armv6)
-mcpu=arm1176jzf-s -mfpu=vfp
or were you asking about something different?
Sorry, edited to "Pi Zero".
The Pi3 entry now also refers to the new Pi2 V1.2.

By the way, does anyone know a nicer way of detecting the Pi model in a makefile?

Code: Select all

machine = $(shell sh -c 'uname -m 2>/dev/null || echo unknown')

# Raspberry Pi B+, Zero, etc 
ifneq (,$(findstring armv6l,$(machine)))
  CPU = -mcpu=arm1176jzf-s
  FPU = -mfpu=vfp
endif

# Raspberry Pi 2 and 3
ifneq (,$(findstring armv7l,$(machine)))
  model = $(shell sh -c 'cat /sys/firmware/devicetree/base/model 2>/dev/null || echo unknown')
  # Match "Pi 3" rather than a bare "3", so a revision such as "Rev 1.3" cannot trigger it
  ifneq (,$(findstring Pi 3,$(model)))
    CPU = -mcpu=cortex-a53
    FPU = -mfpu=neon-fp-armv8
  else
    CPU = -mcpu=cortex-a7 -mthumb
    FPU = -mfpu=neon-vfpv4
  endif
  FPU += -mneon-for-64bits
endif

# ARM A64
ifneq (,$(findstring aarch64,$(machine)))
  CPU = -mcpu=cortex-a53
  PLATFORM += -mabi=lp64 -mcmodel=tiny
endif
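
The same detection can be sketched outside of make. Below is a rough Python version of the makefile logic above, offered as an illustration and not a tested build tool. The function names `read_model` and `gcc_flags` are hypothetical, and matching on the substring "Pi 3" is an assumption about how the device tree model string is worded. The flags themselves are the ones from the makefile; the extra PLATFORM options for aarch64 are left out.

```python
import platform
from pathlib import Path
from typing import List

# Device tree node that holds the board name on Raspberry Pi kernels.
MODEL_PATH = Path("/sys/firmware/devicetree/base/model")


def read_model(path: Path = MODEL_PATH) -> str:
    """Return the board model string, or "unknown" if it cannot be read."""
    try:
        # The device tree string is NUL-terminated, so strip the trailing byte.
        return path.read_text(errors="replace").rstrip("\x00\n")
    except OSError:
        return "unknown"


def gcc_flags(machine: str, model: str) -> List[str]:
    """Map `uname -m` output and the model string to the CPU/FPU flags above."""
    if machine.startswith("armv6"):
        # Raspberry Pi B+, Zero, etc.
        return ["-mcpu=arm1176jzf-s", "-mfpu=vfp"]
    if machine.startswith("armv7"):
        # Raspberry Pi 2 and 3 running a 32-bit system.
        if "Pi 3" in model:  # assumed wording of the model string
            return ["-mcpu=cortex-a53", "-mfpu=neon-fp-armv8", "-mneon-for-64bits"]
        return ["-mcpu=cortex-a7", "-mthumb", "-mfpu=neon-vfpv4", "-mneon-for-64bits"]
    if machine == "aarch64":
        # 64-bit mode: no FPU flags are needed.
        return ["-mcpu=cortex-a53"]
    # Not a machine type this sketch knows about.
    return []


if __name__ == "__main__":
    print(" ".join(gcc_flags(platform.machine(), read_model())))
```

Run on the board itself, the script prints a flag string that could be pasted into CFLAGS. On an unrecognised machine it prints an empty line.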

madtom1999
Posts: 41
Joined: Mon Jul 29, 2013 4:37 pm

Re: Compiler options

Sun Nov 13, 2016 7:03 pm

Thanks very much for that.

fsck
Posts: 26
Joined: Mon Feb 23, 2015 4:49 pm

Re: Compiler options

Wed Nov 16, 2016 2:02 am

jahboater wrote:
Pi3 and Pi2 V1.2 (armv8)
-mcpu=cortex-a53 -mfpu=neon-fp-armv8 -mneon-for-64bits

Pi3 (armv8 in 64-bit mode)
-mcpu=cortex-a53
For the Pi 3, instead of -mcpu=cortex-a53 you should use -march=armv8-a+crc -mfpu=neon-fp-armv8 -mtune=cortex-a53, otherwise the crc32 extensions of ARMv8 that the Pi 3 has won't get enabled (because it's an optional feature in ARMv8a).

I have also noticed that GCC6 requires you to remove -mcpu from the command line if you use -march and -mtune, otherwise you get an error message. AFAIK in earlier versions GCC would simply ignore the -mcpu flag in that scenario.
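
If the flags are assembled by a script, the conflict described above could be avoided with a small filter. This is only a sketch of that idea in Python; `drop_redundant_mcpu` is a made-up helper name, and it simply assumes that -mcpu should go whenever both -march and -mtune are present.

```python
from typing import List


def drop_redundant_mcpu(flags: List[str]) -> List[str]:
    """Remove any -mcpu=... flag when both -march=... and -mtune=... are given.

    The remaining flags keep their original order.
    """
    has_march = any(flag.startswith("-march=") for flag in flags)
    has_mtune = any(flag.startswith("-mtune=") for flag in flags)
    if has_march and has_mtune:
        return [flag for flag in flags if not flag.startswith("-mcpu=")]
    return list(flags)


pi3_flags = [
    "-mcpu=cortex-a53",
    "-march=armv8-a+crc",
    "-mtune=cortex-a53",
    "-mfpu=neon-fp-armv8",
]
print(" ".join(drop_redundant_mcpu(pi3_flags)))
```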

Oh, and in case anyone's wondering, the Pi 3 does NOT implement crypto FPU extensions. The BCM2837 is the only Cortex-A53 implementation I've heard of that doesn't have them. Rumor has it that's because ARM charges extra to license them.

Didn't know about -mneon-for-64bits, need to test it. From my previous testing with Linpack benchmarks, -Ofast -mvectorize-with-neon-quad gave me the best performance.

GCC does give a reason why it's not enabled by default, though:
Use of Advanced SIMD (Neon) for 64-bit scalar computations has been disabled by default. This was found to generate better code in only a small number of cases.
Last edited by fsck on Wed Nov 16, 2016 3:01 pm, edited 1 time in total.

jahboater
Posts: 1871
Joined: Wed Feb 04, 2015 6:38 pm

Re: Compiler options

Wed Nov 16, 2016 5:54 am

You need the '-' in -march=armv8a+crc

Code: Select all

-march=armv8-a+crc -mfpu=neon-fp-armv8 -mtune=cortex-a53
Works fine for gcc 6.2

Thanks

jahboater
Posts: 1871
Joined: Wed Feb 04, 2015 6:38 pm

Re: Compiler options

Wed Nov 16, 2016 6:20 pm

fsck wrote:Didn't know about -mneon-for-64bits, need to test it. From my previous testing with Linpack benchmarks, -Ofast -mvectorize-with-neon-quad gave me the best performance.
It's only of interest if you are doing any 64-bit integer arithmetic.
GCC does give a reason why it's not enabled by default, though:
Use of Advanced SIMD (Neon) for 64-bit scalar computations has been disabled by default. This was found to generate better code in only a small number of cases.
The gcc man page also says this:
-mneon-for-64bits
Enables using Neon to handle scalar 64-bits operations. This is disabled by default
since the cost of moving data from core registers to Neon is high.
I wonder if that's still true? Pure speculation, but NEON is now an integral part of the CPU instead of a co-processor, which may make transfers quicker, I don't know. Also, NEON is now faster and it's quad issue.

I like "-mneon-for-64bits" simply because it makes the code a good bit smaller, and it seems to make good choices about when to use Neon instead of ARM.

enderandrew
Posts: 3
Joined: Fri Nov 11, 2016 7:53 pm

Re: Compiler options

Wed Nov 16, 2016 6:49 pm

jahboater wrote:You need the '-' in -march=armv8a+crc

Code: Select all

-march=armv8-a+crc -mfpu=neon-fp-armv8 -mtune=cortex-a53
Works fine for gcc 6.2

Thanks
Has anyone done a total recompile of the OS with GCC 6.2 and optimized flags to see if it has any benefit? As it is stated above this is non-trivial, but presumably you can cross-compile on beefy hardware and then copy the compiled files to your Pi when you're done.

If there is benefit, I'd love to see someone either share an image of a recompiled Raspbian optimized for the Pi 3 and/or a repository where you could get optimized packages.

fsck
Posts: 26
Joined: Mon Feb 23, 2015 4:49 pm

Re: Compiler options

Mon Nov 21, 2016 8:23 am

enderandrew wrote:
jahboater wrote:You need the '-' in -march=armv8a+crc

Code: Select all

-march=armv8-a+crc -mfpu=neon-fp-armv8 -mtune=cortex-a53
Works fine for gcc 6.2

Thanks
Has anyone done a total recompile of the OS with GCC 6.2 and optimized flags to see if it has any benefit? As it is stated above this is non-trivial, but presumably you can cross-compile on beefy hardware and then copy the compiled files to your Pi when you're done.

If there is benefit, I'd love to see someone either share an image of a recompiled Raspbian optimized for the Pi 3 and/or a repository where you could get optimized packages.
I've recompiled most of OSMC with GCC 6.2 optimized for Pi3 and there were some gains in media performance. Same with RetroPie. Raspbian software packages are definitely sub-optimal on anything other than the Pi 1 and Zero. It's easy to recompile the kernel, but since Raspbian is Debian-based and uses its package system you'd have to recompile your packages manually. If you want a fully optimized OS, use something like Gentoo or Arch Linux because it makes it easy to install everything from source.

jahboater
Posts: 1871
Joined: Wed Feb 04, 2015 6:38 pm

Re: Compiler options

Mon Nov 21, 2016 10:45 am

fsck wrote:I've recompiled most of OSMC with GCC 6.2 optimized for Pi3 and there were some gains in media performance. Same with RetroPie. Raspbian software packages are definitely sub-optimal on anything other than the Pi 1 and Zero. It's easy to recompile the kernel, but since Raspbian is Debian-based and uses its package system you'd have to recompile your packages manually. If you want a fully optimized OS, use something like Gentoo or Arch Linux because it makes it easy to install everything from source.
I guess if a 64-bit Raspbian is ever released it will have to be built for the modern architecture.

enderandrew
Posts: 3
Joined: Fri Nov 11, 2016 7:53 pm

Re: Compiler options

Mon Dec 12, 2016 6:20 pm

jahboater wrote:
fsck wrote:I've recompiled most of OSMC with GCC 6.2 optimized for Pi3 and there were some gains in media performance. Same with RetroPie. Raspbian software packages are definitely sub-optimal on anything other than the Pi 1 and Zero. It's easy to recompile the kernel, but since Raspbian is Debian-based and uses its package system you'd have to recompile your packages manually. If you want a fully optimized OS, use something like Gentoo or Arch Linux because it makes it easy to install everything from source.
I guess if a 64-bit Raspbian is ever released it will have to be built for the modern architecture.
The 64-bit/32-bit divide would make it easier to clearly delineate a 32-bit version targeted at the lowest common denominator (the Pi Zero) from a 64-bit version for the Pi 3.
