ric96
Posts: 1253
Joined: Sun Mar 17, 2013 6:03 am
Location: NOIDA, India

64bit vs 32bit benchmark. Eben was right

Thu Jun 23, 2016 8:42 pm

I finally managed to benchmark the 64bit Fedora build against the 32bit Raspbian build.
Essentially, you could think of it as running a very powerful Cortex-A7 (ARMv7) vs a Cortex-A53 (ARMv8).

This test consisted of compiling ffmpeg with the -j4 flag and then noting down how long the compilation took to complete.
I don't believe much in synthetic benchmarks and wanted a test as close as possible to real-world use.

So here are the results:
Aarch64 (64bit): 36 min
Arm32 (32bit): 23 min
... wait WHAT!!!

Yup, the optimization of the Raspbian OS pays off fairly well.

Although there are benefits of 64bit for some very specific programs, I really don't think that it would be very beneficial in general.
As of now the RPi3 is fine with a 32bit OS, although a 64bit OS would be a good option for this kind of experimentation.

Conclusion: Cortex-A53 is a powerful 32bit processor.

For anyone who wants to test this 64bit OS: https://www.kraxel.org/blog/2016/04/fed ... i-updates/

thnx.
My apologies for shameless YouTube Plugs...
youtube.com/sahajsarup
twitter @sahajsarup
skype srics1996
Blog: http://www.geektillithertz.com/wordpress
Web: http://www.geektillithertz.com

bstrobl
Posts: 97
Joined: Wed Jun 04, 2014 8:31 pm
Location: Germany

Re: 64bit vs 32bit benchmark. Eben was right

Thu Jun 23, 2016 9:39 pm

There should not be such a massive discrepancy between bitness so I think comparing the same version of linux distro makes more sense (i.e. 32 bit Arch to 64 bit Arch).

Regardless, due to the 1 GiB RAM limit and overheating issues with the Pi 3, I think the Foundation will release a Pi 4 with more efficient Cortex-A35 or even Cortex-A32 cores next February. The Cortex-A32 cores would be 32 bit only but they would squeeze out another 10% performance by ditching 64 bit instructions.

32 bit is here to stay as the main issue right now is a lack of RAM, not CPU or GPU performance. Well that and the USB 2.0 bottleneck.

bullen
Posts: 283
Joined: Sun Apr 28, 2013 2:52 pm

Re: 64bit vs 32bit benchmark. Eben was right

Fri Jun 24, 2016 5:47 am

Yes, and since they are stuck at 1GB RAM with this architecture, next new Pi (not A+ or zero mods) will probably be a complete overhaul... if it's not USB3 I will be very sad, 32 or 64 is meh...
https://github.com/tinspin/rupy - A tiny Java async HTTP application server.

MarkHaysHarris777
Posts: 1820
Joined: Mon Mar 23, 2015 7:39 am
Location: Rochester, MN

Re: 64bit vs 32bit benchmark. Eben was right

Fri Jun 24, 2016 6:05 am

The issue with 64 bit is not performance per se NOT EVEN.

... the issue is entirely computational power (large numbers without big num libraries) and memory capacity. Just to let you folks know (if you're not aware by now already) I've been running a PineA64 as my primary desk PC for the past three weeks... and it blows the doors off the PI; no brag, just fact. And, it's not about performance in the 64 bit instruction set... it's about 2GB of memory !! ... and I'd like to see it at 4GB.

They're getting their act together at the Pine64 team... yes, the PI came in first in the survey, but the PineA64 came in 7th... and for the price-point it's ten times the board. The PI team needs to make Raspbian 64 bit now... and the PI 4B needs to bring more of the GPIO pins to the surface, and leverage 2-4GB of main memory (and honestly, I'd lean into 4GB if I were on the engineering team responsible for it).

If not, sooner or later (and probably sooner) the PineA64 board, or another, is going to bury the PI.

edit: PS... and who cares (seriously) if the board is a little more expensive ? ... and nobody cares if it's just a bit larger either... just saying.
marcus
:ugeek:

fanoush
Posts: 464
Joined: Mon Feb 27, 2012 2:37 pm

Re: 64bit vs 32bit benchmark. Eben was right

Fri Jun 24, 2016 6:28 am

ric96 wrote:benchmark the 64bit fedora build with the 32bit raspbian build.
apples and oranges
could you compare 64bit fedora with 32bit fedora instead?
And maybe also compare three setups: a 64bit Fedora kernel with 64bit userspace, a 64bit Fedora kernel with 32bit userspace, and a 32bit Fedora kernel with 32bit userspace?

mfa298
Posts: 1387
Joined: Tue Apr 22, 2014 11:18 am

Re: 64bit vs 32bit benchmark. Eben was right

Fri Jun 24, 2016 9:43 am

ric96 wrote:I finally managed to benchmark the 64bit Fedora build against the 32bit Raspbian build.
Essentially, you could think of it as running a very powerful Cortex-A7 (ARMv7) vs a Cortex-A53 (ARMv8).

This test consisted of compiling ffmpeg with the -j4 flag and then noting down how long the compilation took to complete.
I don't believe much in synthetic benchmarks and wanted a test as close as possible to real-world use.

So here are the results:
Aarch64 (64bit): 36 min
Arm32 (32bit): 23 min
... wait WHAT!!!

Yup, the optimization of the Raspbian OS pays off fairly well.
That's too many variables and not enough information to say anything useful.

What else was running on the system (what was the RAM usage from other processes like)?
What options did the ffmpeg ./configure script include on both?

It's quite possible that the fedora install had to use swap at times and/or had extra parts of ffmpeg included meaning that it had more work to do.

If you want to do a test like this you have to do a lot of work to ensure that everything else gives a fair comparison. Your test is like saying a bike is faster than a car based on friends getting to your house without specifying where they started from (if the bike only had to travel 1 mile compared to 100 miles for the car then it will get there quicker, but that doesn't mean it travels faster).
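
A comparison along these lines could be scripted rather than timed by hand. What follows is only a rough sketch in Python, not what was used for the numbers above: `time_command` and `summarize` are hypothetical helper names, and the command in the example is a placeholder to be swapped for the same build command (for instance `make -j4` after an identical `./configure` and a `make clean`) on both systems.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True makes a failed build raise instead of being timed as a success.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Report the median, which one slow outlier run distorts less than the mean."""
    return {
        "runs": len(durations),
        "median_s": statistics.median(durations),
        "min_s": min(durations),
        "max_s": max(durations),
    }


if __name__ == "__main__":
    # Placeholder workload (assumption): replace with the real build command.
    cmd = [sys.executable, "-c", "pass"]
    print(summarize(time_command(cmd)))
```

Repeating the run and looking at the spread between minimum and maximum gives at least a hint of whether swap or background load interfered. It does not replace recording the compiler version and configure options on each system.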

ejolson
Posts: 3423
Joined: Tue Mar 18, 2014 11:47 am

Re: 64bit vs 32bit benchmark. Eben was right

Fri Jun 24, 2016 10:56 am

ric96 wrote:So here are the results:
Aarch64 (64bit): 36 min
Arm32 (32bit): 23 min
... wait WHAT!!!
You have compared the time it takes different versions of the GNU compiler, targeting different machine architectures, to compile ffmpeg, likely building different codecs auto-configured based on what libraries are installed by default on each system. You may also want to check that the SD card speed is the same for both systems.

The fact that the more advanced compiler creating code for an architecture with more optimization opportunities took longer to create the binary is not surprising. More interesting would be the relative speeds of the resulting ffmpeg binaries when transcoding the same video source to the same output codec.

While slowly generating fast executables is much different from quickly generating slow executables, if you are concerned about speed of compilation, turn off all compiler optimization or use clang, again with optimization turned off.

bstrobl
Posts: 97
Joined: Wed Jun 04, 2014 8:31 pm
Location: Germany

Re: 64bit vs 32bit benchmark. Eben was right

Fri Jun 24, 2016 8:05 pm

MarkHaysHarris777 wrote:The issue with 64 bit is not performance per se NOT EVEN.

... the issue is entirely computational power (large numbers without big num libraries) and memory capacity. Just to let you folks know (if you're not aware by now already) I've been running a PineA64 as my primary desk PC for the past three weeks... and it blows the doors off the PI; no brag, just fact. And, it's not about performance in the 64 bit instruction set... it's about 2GB of memory !! ... and I'd like to see it at 4GB.

They're getting their act together at the Pine64 team... yes, the PI came in first in the survey, but the PineA64 came in 7th... and for the price-point it's ten times the board. The PI team needs to make Raspbian 64 bit now... and the PI 4B needs to bring more of the GPIO pins to the surface, and leverage 2-4GB of main memory (and honestly, I'd lean into 4GB if I were on the engineering team responsible for it).

If not, sooner or later (and probably sooner) the PineA64 board, or another, is going to bury the PI.

edit: PS... and who cares (seriously) if the board is a little more expensive ? ... and nobody cares if it's just a bit larger either... just saying.
If the Pine A64 works for you then that is great! The RPi foundation is not going to win a war with others over sheer performance hardware at the cheapest price however. I actually think that power consumption and stability as well as the excellent software support are more important and prefer the Foundation to stick with that. Once Desktop usage becomes fast enough the focus should be on tweaking every last bit in software and providing a solid target for software development.
That and constantly reducing electric usage for small projects can be very valuable for hobbyists and embedded system suppliers. Oh and the excellent backward compatibility and support that already exists :) .

Heater
Posts: 13111
Joined: Tue Jul 17, 2012 3:02 pm

Re: 64bit vs 32bit benchmark. Eben was right

Sat Jun 25, 2016 12:43 am

Meh, never mind the performance. I hate to think of all those 64 bits worth of transistors going to waste.

Besides, some software, like databases, needs 64 bits to work in. Even if there is far less real RAM available than 64 bits would call for.

java
Posts: 226
Joined: Mon Jul 21, 2014 9:41 am

Re: 64bit vs 32bit benchmark. Eben was right

Sat Jun 25, 2016 7:42 am

This topic reminds me of people putting an over-capacity turbocharger on a tiny engine and expecting miracles ...

Heater
Posts: 13111
Joined: Tue Jul 17, 2012 3:02 pm

Re: 64bit vs 32bit benchmark. Eben was right

Sat Jun 25, 2016 9:11 am

Yeah, except in this case we have a huge engine surrounded by a little car. A quad core 64 bit processor, but only 1GB RAM and terrible I/O throughput.

I do love a car analogy.

bitbank
Posts: 252
Joined: Sat Nov 07, 2015 8:01 am
Location: Sarasota, Florida

Re: 64bit vs 32bit benchmark. Eben was right

Sat Jun 25, 2016 10:04 am

Your conclusions (and those of some other posters) are not correct.

Here are the false assumptions:

1) 64-bit ARM systems need more than 1GB RAM to be useful/effective. The instruction encodings are still 32-bit and using the same size data sets means the memory requirements are basically the same as 32-bit systems.

2) The large address space is the biggest advantage of 64-bit systems. The large address space is helpful for some specific problems, but the biggest advantages are additional registers and improved instruction set.

Similar to the advantages of AMD64 vs x86, ARM64 versus ARM32 brings more powerful instructions (e.g. integer divide) and more registers (twice as many general purpose and SIMD). This allows compilers to generate better code since more variables can be kept in registers.

To see how this plays in the real world, I created a benchmark app which runs various functions on 32 and 64-bit ARM systems. I tested the code on a RPi3 and a Dragonboard410c (same CPU architecture). You can access the github repo here: https://github.com/bitbank2/gcc_perf/
The fastest code is none at all :)

Heater
Posts: 13111
Joined: Tue Jul 17, 2012 3:02 pm

Re: 64bit vs 32bit benchmark. Eben was right

Sat Jun 25, 2016 10:15 am

Off the top of my head:

MongoDB likes 64 bit addressing. I guess it can then map your entire database into a linear address space without any extra juggling. The data need not be in RAM; most of it will be on file somewhere.

Google's Go language likes 64 bit addressing. Something to do with the performance of its garbage collector. That is an issue that may or may not have been fixed since I looked at Go a couple of years back.

OpenSSL generates expired certificates on 32 bit systems if the expiry date you ask for is after 2038, when the 32 bit Unix time counter overflows. Again, that issue may have been fixed since it caught me out a couple of years back.
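
The 2038 limit itself is easy to show with a little arithmetic. This is only an illustrative Python sketch, assuming a signed 32 bit `time_t` as was usual on 32 bit Linux; it does not reproduce the OpenSSL behaviour, and `wrap_int32` and `to_utc` are made-up helper names.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit counter can hold


def wrap_int32(value):
    """Interpret an integer as a signed 32-bit value, wrapping like two's complement."""
    return (value + 2**31) % 2**32 - 2**31


def to_utc(seconds):
    """Convert seconds since the Unix epoch to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


if __name__ == "__main__":
    # Last second a signed 32-bit time_t can represent.
    print(to_utc(INT32_MAX))
    # One second later the counter wraps to a large negative number,
    # which lands in 1901 and so looks like a date long in the past.
    print(to_utc(wrap_int32(INT32_MAX + 1)))
```

A certificate whose expiry is computed past that point on such a system would end up with a date in the past, which matches the "already expired" symptom described above.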

bstrobl
Posts: 97
Joined: Wed Jun 04, 2014 8:31 pm
Location: Germany

Re: 64bit vs 32bit benchmark. Eben was right

Sat Jun 25, 2016 10:26 am

While there are definite advantages to a 64 bit system the problem is currently that, unless relative addressing is used, pretty much all pointers are going to be 64 bit and hence there will be more cache pressure and higher RAM use even with 32 bit instructions. This is problematic on a system with only 1 GB of working memory where things will still be counted in megabytes.

I do hope that AArch64 compiler improvements will negate this problem in the future but that may still take some time.
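
To put a rough number on the pointer overhead, here is a small Python sketch that estimates C struct sizes for 4 byte and 8 byte pointers. It assumes natural alignment (each member aligned to its own size, struct padded to its largest member), which is typical but not guaranteed by every ABI. `struct_size` and the two example layouts are hypothetical, chosen only for illustration.

```python
def struct_size(fields, pointer_size):
    """Estimate the size of a C-style struct under natural alignment.

    fields: list of member sizes in bytes, or the string "ptr" for a pointer.
    """
    offset = 0
    largest = 1
    for field in fields:
        size = pointer_size if field == "ptr" else field
        offset = (offset + size - 1) // size * size  # pad up to the member's alignment
        offset += size
        largest = max(largest, size)
    # Pad the tail so arrays of the struct keep every element aligned.
    return (offset + largest - 1) // largest * largest


# Hypothetical binary-tree node: int32_t key; node *left; node *right;
TREE_NODE = [4, "ptr", "ptr"]
# Hypothetical pointer-free record: four int32_t fields.
PLAIN_DATA = [4, 4, 4, 4]

if __name__ == "__main__":
    for name, layout in (("tree node", TREE_NODE), ("plain data", PLAIN_DATA)):
        s32 = struct_size(layout, pointer_size=4)
        s64 = struct_size(layout, pointer_size=8)
        print(f"{name}: {s32} bytes (32 bit) vs {s64} bytes (64 bit)")
```

Under these assumptions the pointer-heavy node doubles in size while the pointer-free record does not change, so how much cache and RAM pressure 64 bit adds depends heavily on the data structures a program uses.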

Slackware
Posts: 131
Joined: Mon Jan 18, 2016 3:45 pm

Re: 64bit vs 32bit benchmark. Eben was right

Sat Jun 25, 2016 10:19 pm

Would be nice to just let the 64 bits run TWO instructions side by side making an 8 core.
Of course we would have to call it the Raspberry Spider......

davidcoton
Posts: 4033
Joined: Mon Sep 01, 2014 2:37 pm
Location: Cambridge, UK

Re: 64bit vs 32bit benchmark. Eben was right

Sat Jun 25, 2016 10:59 pm

Slackware wrote:Would be nice to just let the 64 bits run TWO instructions side by side making an 8 core.
Of course we would have to call it the Raspberry Spider......
That would solve any problems browsing the Web...
Signature retired

asandford
Posts: 1997
Joined: Mon Dec 31, 2012 12:54 pm
Location: Waterlooville

Re: 64bit vs 32bit benchmark. Eben was right

Sun Jun 26, 2016 12:14 am

MarkHaysHarris777 wrote: The PI team needs to make Raspbian 64 bit now... and the PI 4B needs to bring more of the GPIO pins to the surface, and leverage 2-4GB of main memory (and honestly, I'd lean into 4GB if I were on the engineering team responsible for it).

If not, sooner or later (and probably sooner) the PineA64 board, or another, is going to bury the PI.

edit: PS... and who cares (seriously) if the board is a little more expensive ? ... and nobody cares if it's just a bit larger either... just saying.
The RAM limit is due to the GPU, so unless a VC4+ arrives with >1GB support, it ain't gonna happen (the SoC is a GPU with a CPU bolted on as a feature).

mfa298
Posts: 1387
Joined: Tue Apr 22, 2014 11:18 am

Re: 64bit vs 32bit benchmark. Eben was right

Sun Jun 26, 2016 9:26 am

Slackware wrote:Would be nice to just let the 64 bits run TWO instructions side by side making an 8 core.
Of course we would have to call it the Raspberry Spider......
Doesn't really work like that unfortunately.
(Hopefully obvious to most but some might think it could become an 8 core cpu like that)

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 23382
Joined: Sat Jul 30, 2011 7:41 pm

Re: 64bit vs 32bit benchmark. Eben was right

Sun Jun 26, 2016 11:38 am

1. If you are running a database that is big enough to work better with 64bit, why are you running it on a Pi?
2. Although pointers will be 64bit, most data they point to isn't, and the number of pointers in code is a lot less than overall data. So the memory hit is fairly insignificant.
3. If you want multiple instructions to run at the same time, you already have the NEON SIMD instructions.
4. 1GB of RAM is a LOT. Much more than most embedded systems, and they already achieve a lot. If you are writing code that needs more, look to your algorithms rather than a machine with more RAM; not much code actually needs that much memory.
5. A lot of cars nowadays have small engines with turbochargers (Fiat 500 etc).
Principal Software Engineer at Raspberry Pi (Trading) Ltd.
Contrary to popular belief, humorous signatures are allowed. Here's an example...
"My grief counseller just died, luckily, he was so good, I didn't care."

dom
Raspberry Pi Engineer & Forum Moderator
Posts: 5318
Joined: Wed Aug 17, 2011 7:41 pm
Location: Cambridge

Re: 64bit vs 32bit benchmark. Eben was right

Sun Jun 26, 2016 11:57 am

jamesh wrote: 2. Although pointers will be 64bit, most data they point to isn't, and the number of pointers in code is a lot less than overall data. So the memory hit is fairly insignificant.
It may be more than that. From here. While the increased memory usage may not be critical it may make your instruction and data caches less effective.

Binary                 ARMv7 size (bytes)   ARMv8 size (bytes)   Ratio
libcrypto.so           1,052,920            1,673,400            1.59x
toolbox Android 5.1    150,836              255,280              1.69x

bstrobl
Posts: 97
Joined: Wed Jun 04, 2014 8:31 pm
Location: Germany

Re: 64bit vs 32bit benchmark. Eben was right

Sun Jun 26, 2016 12:15 pm

dom wrote:
jamesh wrote: 2. Although pointers will be 64bit, most data they point to isn't, and the number of pointers in code is a lot less than overall data. So the memory hit is fairly insignificant.
It may be more than that. From here. While the increased memory usage may not be critical it may make your instruction and data caches less effective.

Binary                 ARMv7 size (bytes)   ARMv8 size (bytes)   Ratio
libcrypto.so           1,052,920            1,673,400            1.59x
toolbox Android 5.1    150,836              255,280              1.69x
Was about to post the same link. I have a VPS with only 128 MB RAM and the difference in wired usage between a clean 32 and 64 bit install is already at 10+ MB which is rather massive for such a tiny machine.

I hope to see a Raspbian 64 bit build in the future with exact same settings in order to make a direct comparison.

ejolson
Posts: 3423
Joined: Tue Mar 18, 2014 11:47 am

Re: 64bit vs 32bit benchmark. Eben was right

Sun Jun 26, 2016 3:45 pm

jamesh wrote:1. If you are running a database that is big enough to work better with 64bit, why are you running it on a Pi?
2. Although pointers will be 64bit, most data they point to isn't, and the number of pointers in code is a lot less than overall data. So the memory hit is fairly insignificant.
While RAM may only be 1GB, the SD card is usually larger than 4GB. While the ftello64 and fseeko64 library functions can be used to randomly access files larger than 2GB, many programs use ftell and fseek, whose long offsets are only 32 bits wide on 32-bit systems. This may be for legacy and cross-platform reasons or because the programs were developed exclusively on and for 64-bit systems. Note that very few people run 32-bit kernels on modern Intel-compatible machines these days. The extra effort to make the code work properly on 32-bit machines is about as fruitful as coding compatibility for big-endian architectures.
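
The 2GB figure follows directly from the width of the offset type. As a quick illustration, and only that, this Python sketch checks whether a file offset fits in a signed 32 bit or 64 bit integer using the standard-size `struct` formats. `fits_in_offset` is a made-up helper and nothing here touches a real file.

```python
import struct

INT32_MAX = 2**31 - 1  # largest offset a signed 32-bit long can hold


def fits_in_offset(offset, bits):
    """Return True if `offset` fits in a signed integer of the given width."""
    fmt = {32: "<l", 64: "<q"}[bits]  # standard sizes: 4 and 8 bytes
    try:
        struct.pack(fmt, offset)
        return True
    except struct.error:
        return False


if __name__ == "__main__":
    three_gib = 3 * 1024**3  # an offset inside a hypothetical 4GB file
    print("32 bit offset can hold it:", fits_in_offset(three_gib, 32))
    print("64 bit offset can hold it:", fits_in_offset(three_gib, 64))
```

Anything past 2**31 - 1 bytes simply cannot be expressed in the narrower type, which is why a program built with 32 bit offsets cannot seek to the far end of a large file on the SD card.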

jahboater
Posts: 4605
Joined: Wed Feb 04, 2015 6:38 pm

Re: 64bit vs 32bit benchmark. Eben was right

Sun Jun 26, 2016 4:51 pm

I believe if you set _FILE_OFFSET_BITS to 64, all file handling will be 64 bits wide even on 32 bit platforms; it is the default on 64 bits.
http://www.gnu.org/software/libc/manual ... acros.html

In a quick and unscientific comparison - a program on aarch64 was around 10% larger executable size and 10% faster compared to 32 bit ARM.

Heater
Posts: 13111
Joined: Tue Jul 17, 2012 3:02 pm

Re: 64bit vs 32bit benchmark. Eben was right

Sun Jun 26, 2016 5:46 pm

Jamesh,
jamesh wrote: If you are running a database that is big enough to work better with 64bit, why are you running it on a Pi?
Why not? MongoDB, for example, is limited to 2GB of data on 32 bit machines. That's because they keep their code smaller, simpler, cleaner and hopefully less bug prone by using memory mapped files. 2GB is not a big database. Is it? Besides, surely it's not the size of the DB that may be an issue, SD cards are huge nowadays, but rather the expected read/update rate on a machine with limited file system bandwidth.
jamesh wrote: Although pointers will be 64bit, most data they point to isn't, and the number of pointers in code is a lot less than overall data. So the memory hit is fairly insignificant.
Not necessarily. A huge tree or other structure may well have more pointers than data. JavaScript uses 64 bits for its numbers exclusively.
jamesh wrote: 1GB of RAM is a LOT. Much more than most embedded systems, and they already achieve a lot.
It's certainly enough for my expected uses of a Pi. Should I ever really need more there are other machines available.

On balance, I'd love to see a 64 bit Raspbian for when it's needed.

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 23382
Joined: Sat Jul 30, 2011 7:41 pm

Re: 64bit vs 32bit benchmark. Eben was right

Sun Jun 26, 2016 7:04 pm

dom wrote:
jamesh wrote: 2. Although pointers will be 64bit, most data they point to isn't, and the number of pointers in code is a lot less than overall data. So the memory hit is fairly insignificant.
It may be more than that. From here. While the increased memory usage may not be critical it may make your instruction and data caches less effective.

Binary                 ARMv7 size (bytes)   ARMv8 size (bytes)   Ratio
libcrypto.so           1,052,920            1,673,400            1.59x
toolbox Android 5.1    150,836              255,280              1.69x
Interesting - thanks Dom. I would not have expected such a large difference. I wonder why it is so big - it would imply a LOT of pointers/data items going to 64bit! A lot of int being used instead of int32_t? I guess native length is faster, even when you only use 8, 16 or 32 bits of it.
