ejolson
Posts: 3580
Joined: Tue Mar 18, 2014 11:47 am

Re: Why moving to 64bit?

Sat Aug 17, 2019 5:51 am

Heater wrote:
Fri Aug 16, 2019 5:07 pm
It's just that when you start to expect such high performance from the very cheap and humble Pi I wonder if that was ever its intended use case.
The thing about a cheap general-purpose single-board computer marketed towards schools, makers and hobbyists is that the intended use case is very open-ended. Pi computers are being used for all sorts of things not envisioned by the original engineers. That's good, because the versatility is a relief from the boring single function of a media player, game console, camera or microwave oven. The creativity of users thinking up new things to do with the Pi is what has led to its success in the first place.

For reference, Cortex-A53 cores that include hardware-accelerated cryptographic extensions perform AES more than 5 times faster than the same processor without those extensions. More information is in this thread. Unfortunately, the person who ran the tests was banned for being rude (and arguing about barrel connectors). Never mind that. I would expect the difference between an A72 core without the cryptographic extensions (as in the Pi 4) and one that includes those extensions might be less.

Has anyone run the corresponding OpenSSL benchmarks on the Pi 4 for comparison?

jerrm
Posts: 194
Joined: Wed May 02, 2018 7:35 pm

Re: Why moving to 64bit?

Sat Aug 17, 2019 5:08 pm

Did some https download tests.

They appear to confirm jdonald's AES_ASM issues for 64 bit.

The Debian armhf userland shows substantial improvement over Raspbian here.

Code:

Baseline - no encryption 
(all three userlands are at wire speed within .05 secs of each other):
======================================================================
[email protected]:~# time curl -o /dev/null http://192.168.22.109/ram/testfile.tst
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2001M  100 2001M    0     0   112M      0  0:00:17  0:00:17 --:--:--  112M

real    0m17.868s
user    0m5.992s
sys     0m11.837s

=========
Raspbian:
=========
[email protected]:~# time curl -k -o /dev/null https://192.168.22.109/ram/testfile.tst
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2001M  100 2001M    0     0  27.5M      0  0:01:12  0:01:12 --:--:-- 27.5M

real    1m12.769s
user    1m8.127s
sys     0m4.629s


=============
Debian armhf:
=============
(pi32)[email protected]:~#  time curl -k -o /dev/null https://192.168.22.109/ram/testfile.tst
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2001M  100 2001M    0     0  44.4M      0  0:00:45  0:00:45 --:--:-- 44.2M

real    0m45.129s
user    0m40.520s
sys     0m4.600s


===============
Debian aarch64:
===============
(pi64)[email protected]:~# time curl -k -o /dev/null https://192.168.22.109/ram/testfile.tst
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2001M  100 2001M    0     0  20.8M      0  0:01:36  0:01:36 --:--:-- 20.8M

real    1m36.068s
user    1m32.016s
sys     0m4.041s
All tests used Sakaki's 7/28 image. I didn't bother swapping cards for this one.

ejolson
Posts: 3580
Joined: Tue Mar 18, 2014 11:47 am

Re: Why moving to 64bit?

Sat Aug 17, 2019 5:22 pm

jerrm wrote:
Sat Aug 17, 2019 5:08 pm
Did some https download tests.

They appear to confirm jdonald's AES_ASM issues for 64 bit.

The Debian armhf userland shows substantial improvement over Raspbian here.

Thanks for the test. It would appear open-source developers have assumed anyone interested in the performance of AES encryption on 64-bit will purchase an ARM chip with hardware cryptographic extensions. As a result, there is no assembler optimised code that works on the Pi in 64-bit mode.

Writing a 64-bit ARM assembly implementation of AES that doesn't use the optional ARMv8 cryptographic extensions would appear low-hanging fruit for any programmer who wants to make a contribution to open source.

As such code is important for the performance of AES when running on the Pi in 64-bit mode, maybe the foundation could motivate the writing of such code by issuing a bounty: Improve the speed of AES encryption when running on the Raspberry Pi in 64-bit mode by a factor of two.

A prize of a Pi 4B with 8GB RAM should be more than sufficient.

pica200
Posts: 138
Joined: Tue Aug 06, 2019 10:27 am

Re: Why moving to 64bit?

Sat Aug 17, 2019 5:34 pm

See a few posts above. This code already exists but is not used currently. Seems like no one noticed because there are barely any SoCs without acceleration :roll:

ejolson
Posts: 3580
Joined: Tue Mar 18, 2014 11:47 am

Re: Why moving to 64bit?

Sat Aug 17, 2019 5:36 pm

pica200 wrote:
Sat Aug 17, 2019 5:34 pm
See a few posts above. This code already exists but is not used currently. Seems like no one noticed because there are barely any SoCs without acceleration :roll:
Maybe the code is not enabled because it doesn't work.

Woohoo! If I get it to work, do you think I could collect the bounty?

pica200
Posts: 138
Joined: Tue Aug 06, 2019 10:27 am

Re: Why moving to 64bit?

Sat Aug 17, 2019 5:40 pm

Or if you feel adventurous solder in a 64 Gbit one ;)

jdonald
Posts: 413
Joined: Fri Nov 03, 2017 4:36 pm

Re: Why moving to 64bit?

Sat Aug 17, 2019 6:28 pm

jerrm, much appreciated for the more thorough benchmarking.

I have not been successful getting arm64 openssl to compile with -DAES_ASM=1. On the bright side, in the investigation I came across other settings to play with via the OPENSSL_armcap environment variable. These are flags that are normally auto-detected but can be forced at runtime:

Code:

# define ARMV7_NEON      (1<<0)
# define ARMV7_TICK      (1<<1)
# define ARMV8_AES       (1<<2)
# define ARMV8_SHA1      (1<<3)
# define ARMV8_SHA256    (1<<4)
# define ARMV8_PMULL     (1<<5)
# define ARMV8_SHA512    (1<<6)
ARMV7_NEON is turned on by default and necessary for good 32-bit AES performance. Somehow it hurts 64-bit AES performance with the GCM ciphers though, and turning it off results in a consistent 57% speedup in the 256-bit case.

Code:

(pi64)[email protected]:~ $ OPENSSL_armcap=0 openssl speed -evp aes-256-gcm # disable ARMV7_NEON
...
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-gcm      32677.64k    35964.69k    36898.56k    37191.34k    37464.75k    37497.51k
(pi64)[email protected]:~ $ openssl speed -evp aes-256-gcm # vs default 64-bit
...
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-gcm      21241.37k    22912.49k    23474.86k    23605.25k    23516.50k    23529.40k
I don't see the same effect on CBC ciphers (edit: actually the effect is also there with -evp aes-256-cbc, oddly enough). The point is that there may be plenty of low-hanging fruit.
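
The speedup quoted above can be checked directly from the two result rows. Here is a minimal Python sketch, assuming only that each row has the layout shown in the listing (cipher name followed by rates with a trailing "k"). The helper names parse_speed_row and speedups are made up for illustration and are not part of OpenSSL.

```python
# Block sizes in the order openssl speed printed them in the listing above.
BLOCK_SIZES = [16, 64, 256, 1024, 8192, 16384]

# Result rows copied from the two runs above.
NEON_OFF = "aes-256-gcm      32677.64k    35964.69k    36898.56k    37191.34k    37464.75k    37497.51k"
DEFAULT = "aes-256-gcm      21241.37k    22912.49k    23474.86k    23605.25k    23516.50k    23529.40k"


def parse_speed_row(row):
    """Split one result row into (cipher name, list of rates in 1000s of bytes/s)."""
    name, *cols = row.split()
    return name, [float(col.rstrip("k")) for col in cols]


def speedups(fast_row, slow_row):
    """Relative gain of fast_row over slow_row for each block size (0.5 means 50% faster)."""
    _, fast = parse_speed_row(fast_row)
    _, slow = parse_speed_row(slow_row)
    return [f / s - 1.0 for f, s in zip(fast, slow)]


if __name__ == "__main__":
    for size, gain in zip(BLOCK_SIZES, speedups(NEON_OFF, DEFAULT)):
        print(f"{size:>6} bytes: {gain:+.1%}")
```

The same two helpers work for any pair of rows in this thread, as long as both rows list the same block sizes.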

Note that forcing on a flag such as ARMV8_AES will immediately print "Illegal instruction" if testing an AES cipher, just for more confirmation that the Pi 4 lacks this extension.
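
Since the capabilities are plain bit flags, a value for OPENSSL_armcap can be worked out by OR-ing the bits from the list above. This is a small Python sketch with hypothetical helper names (armcap_value, armcap_names). Only OPENSSL_armcap=0 was actually tried in this thread, so any other value it produces is an untested assumption.

```python
# Bit positions copied from the #define list quoted above.
ARMCAP_FLAGS = {
    "ARMV7_NEON": 1 << 0,
    "ARMV7_TICK": 1 << 1,
    "ARMV8_AES": 1 << 2,
    "ARMV8_SHA1": 1 << 3,
    "ARMV8_SHA256": 1 << 4,
    "ARMV8_PMULL": 1 << 5,
    "ARMV8_SHA512": 1 << 6,
}


def armcap_value(*names):
    """Combine the named flags into one integer mask; no names gives 0."""
    value = 0
    for name in names:
        value |= ARMCAP_FLAGS[name]
    return value


def armcap_names(value):
    """List the flag names that are set in an integer mask."""
    return [name for name, bit in ARMCAP_FLAGS.items() if value & bit]


if __name__ == "__main__":
    # Everything off, as in the OPENSSL_armcap=0 run above.
    print(f"OPENSSL_armcap={armcap_value()}")
    # NEON plus the AES extension; expected to fail with "Illegal instruction" on a Pi 4.
    mask = armcap_value("ARMV7_NEON", "ARMV8_AES")
    print(f"OPENSSL_armcap={mask} -> {armcap_names(mask)}")
```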

ejolson: would be great if you could take a look at the OpenSSL code and try to get -DAES_ASM=1 working with arm64.

ejolson
Posts: 3580
Joined: Tue Mar 18, 2014 11:47 am

Re: Why moving to 64bit?

Sat Aug 17, 2019 7:43 pm

jdonald wrote:
Sat Aug 17, 2019 6:28 pm
ARMV7_NEON is turned on by default and necessary for good 32-bit AES performance. Somehow it hurts 64-bit AES performance with the GCM ciphers though, and turning it off results in a consistent 57% speedup in the 256-bit case.
In case nobody has posted the output already, here is the benchmark for openssl on Raspbian running on a stock Pi 4B.

Code:

$ openssl speed -evp aes-256-gcm
Doing aes-256-gcm for 3s on 16 size blocks: 5178850 aes-256-gcm's in 2.99s
Doing aes-256-gcm for 3s on 64 size blocks: 1435097 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 256 size blocks: 374117 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 1024 size blocks: 94350 aes-256-gcm's in 2.99s
Doing aes-256-gcm for 3s on 8192 size blocks: 11839 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 16384 size blocks: 5927 aes-256-gcm's in 3.00s
OpenSSL 1.1.1c  28 May 2019
built on: Thu May 30 15:27:48 2019 UTC
options:bn(64,32) rc4(char) des(long) aes(partial) blowfish(ptr) 
compiler: gcc -fPIC -pthread -Wa,--noexecstack -Wall -Wa,--noexecstack -g -O2 -fdebug-prefix-map=/build/openssl-hL5TK7/openssl-1.1.1c=. -fstack-protector-strong -Wformat -Werror=format-security -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DPOLY1305_ASM -DNDEBUG -Wdate-time -D_FORTIFY_SOURCE=2
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-gcm      27712.91k    30615.40k    31924.65k    32312.51k    32328.36k    32369.32k
Therefore, by turning off NEON it appears you are already beating the 32-bit executable in Raspbian.
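
As a side note on reading this output: the summary row appears to be nothing more than block size times operation count divided by elapsed time, expressed in 1000s of bytes per second. Here is a short Python sketch of that arithmetic. The regular expression assumes the "Doing ... size blocks" line format shown above, and throughput_k is a hypothetical helper name.

```python
import re

# Matches lines such as:
#   Doing aes-256-gcm for 3s on 16 size blocks: 5178850 aes-256-gcm's in 2.99s
DOING_LINE = re.compile(
    r"Doing (\S+) for \d+s on (\d+) size blocks: (\d+) \S+ in ([\d.]+)s"
)


def throughput_k(line):
    """Return the rate in 1000s of bytes per second for one 'Doing ...' line."""
    match = DOING_LINE.match(line)
    if match is None:
        raise ValueError(f"unrecognised line: {line!r}")
    block_size = int(match.group(2))
    count = int(match.group(3))
    elapsed = float(match.group(4))
    return block_size * count / elapsed / 1000.0


if __name__ == "__main__":
    first = "Doing aes-256-gcm for 3s on 16 size blocks: 5178850 aes-256-gcm's in 2.99s"
    last = "Doing aes-256-gcm for 3s on 16384 size blocks: 5927 aes-256-gcm's in 3.00s"
    print(f"{throughput_k(first):.2f}k")  # compare with the 16 bytes column
    print(f"{throughput_k(last):.2f}k")   # compare with the 16384 bytes column
```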

For comparison I obtained some out-of-the-box results for different ARM-based single-board computers. Note that the speed with 16384 bytes has been omitted in some cases for consistency.

Code:

ARM1176JZF-S 32-bit 700MHz noAES (Pi B+ Raspbian):
aes-256-gcm       5283.77k     6570.91k     6961.58k     7102.24k     7099.73k
Cortex-A53 32-bit 1.4GHz noAES (Pi 3B+ Raspbian):
aes-256-gcm      19046.13k    22753.49k    24055.64k    24405.67k    24510.46k
Cortex-A72 32-bit 1.5GHz noAES (Pi 4B Raspbian):
aes-256-gcm      27712.91k    30615.40k    31924.65k    32312.51k    32328.36k
Cortex-A53 64-bit 1.4GHz AES (NanoPi T3 Ubuntu):
aes-256-gcm      85789.11k   216272.58k   363160.92k   449810.09k   480291.50k
Cortex-A57 64-bit 2GHz AES (Jetson TX2 Ubuntu):
aes-256-gcm     182532.45k   451431.66k   628317.10k   733904.90k   767257.26k
Denver-2 64-bit 1.8GHz AES (Jetson TX2 Ubuntu):
aes-256-gcm     225900.97k   518809.43k   820897.28k   982568.28k  1050839.72k
The AES encryption algorithm is subject to various timing-related side-channel attacks. Therefore, it is possible that the current 64-bit ARM assembler in openssl works but was deemed unsafe and that is why it is currently disabled. Still, it would be interesting to know how much faster the hand-optimized assembler code runs.

pica200
Posts: 138
Joined: Tue Aug 06, 2019 10:27 am

Re: Why moving to 64bit?

Sat Aug 17, 2019 8:43 pm

That's quite a huge difference with acceleration and it's not even an A72. If the next SoC revision doesn't have that I'm gonna flip a table (╯°□°)╯︵ ┻━┻ :D

jdonald
Posts: 413
Joined: Fri Nov 03, 2017 4:36 pm

Re: Why moving to 64bit?

Sat Aug 17, 2019 8:59 pm

ejolson wrote:
Sat Aug 17, 2019 7:43 pm
Therefore, by turning off NEON it appears you are already beating the 32-bit executable in Raspbian.
Interesting. ARMv6 lacks NEON altogether so that evens the playing field here. You're basically pointing out that in this case the 64-bit compiled C code beats out the 32-bit hand-tuned assembly.

For reference, here are the Debian armhf (ARMv7) Pi 4 numbers, default ARMV7_NEON on:

Code:

(pi32)[email protected]:~ $ openssl speed -evp aes-256-gcm
...
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-gcm      41994.80k    48666.28k    51166.98k    54628.01k    55672.83k    55705.60k
so 28~48% faster than the tweaked 64-bit run.

Another test case is Debian armhf OpenSSL with ARMV7_NEON forced off (OPENSSL_armcap=0). It's only 3.5% faster than the ARMv6 code.
The AES encryption algorithm is subject to various timing-related side-channel attacks. Therefore, it is possible that the current 64-bit ARM assembler in openssl works but was deemed unsafe and that is why it is currently disabled.
Is compiled code necessarily more resistant to timing attacks? Or is using the ARMv8 AES extension? I see now that the top of the wiki page says it improves resistance to side-channel attacks.

The theory that OpenSSL developers haven't bothered because most other ARMv8 chips have AES extensions sounded quite plausible as well.
Last edited by jdonald on Sat Aug 17, 2019 10:11 pm, edited 2 times in total.

pica200
Posts: 138
Joined: Tue Aug 06, 2019 10:27 am

Re: Why moving to 64bit?

Sat Aug 17, 2019 9:27 pm

Another thing you can try is -ftree-vectorize for the C AES code but unsure if it will do anything. -ffast-math probably does nothing since that's for floating point mainly. -mneon-for-64bits may have an effect.

jerrm
Posts: 194
Joined: Wed May 02, 2018 7:35 pm

Re: Why moving to 64bit?

Sat Aug 17, 2019 10:14 pm

Setting OPENSSL_armcap=0 helps the Debian aarch64 https times as well:

Code:

(pi64)[email protected]:~# time  OPENSSL_armcap=0 curl -k -o /dev/null https://192.168.22.109/ram/testfile.tst
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2001M  100 2001M    0     0  31.5M      0  0:01:03  0:01:03 --:--:-- 31.5M

real    1m3.905s
user    0m58.660s
sys     0m4.898s
50% or so better than Deb aarch64 default, and about 14% better than Raspbian userlands. Still pales compared to Debian armhf userland.
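
These percentages follow from the "real" times reported by time in the curl runs. A rough Python sketch of the comparison, assuming the XmY.YYYs format that time prints above; real_seconds and faster_by are hypothetical helper names.

```python
import re

# "real" wall-clock times copied from the https download runs in this thread.
REAL = {
    "Raspbian": "1m12.769s",
    "Debian armhf": "0m45.129s",
    "Debian aarch64": "1m36.068s",
    "Debian aarch64, OPENSSL_armcap=0": "1m3.905s",
}


def real_seconds(stamp):
    """Convert a time(1) style stamp such as '1m3.905s' into seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time stamp: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def faster_by(slow, fast):
    """How much faster the 'fast' run finished than the 'slow' run (0.5 means 50%)."""
    return real_seconds(REAL[slow]) / real_seconds(REAL[fast]) - 1.0


if __name__ == "__main__":
    tweaked = "Debian aarch64, OPENSSL_armcap=0"
    print(f"vs Debian aarch64 default: {faster_by('Debian aarch64', tweaked):+.0%}")
    print(f"vs Raspbian:               {faster_by('Raspbian', tweaked):+.0%}")
    print(f"Debian armhf vs tweaked:   {faster_by(tweaked, 'Debian armhf'):+.0%}")
```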
Last edited by jerrm on Sun Aug 18, 2019 7:34 pm, edited 1 time in total.

ejolson
Posts: 3580
Joined: Tue Mar 18, 2014 11:47 am

Re: Why moving to 64bit?

Sun Aug 18, 2019 7:26 pm

Heater wrote:
Fri Aug 16, 2019 5:07 pm
I pretty much agree with everyone's comments about security.
At first glance it might appear that fast AES encryption on a 64-bit platform is only useful for delivering secure web pages, file sharing and setting up virtual-private networks.

I was just reading here about the garbage collection algorithms used in the Go programming language. Currently a unique hash is computed using AES to avoid a write barrier that would otherwise slow performance. This works well because many modern CPUs have AES instructions in hardware. Note that in this usage case, there is no need for the AES computations to be resistant against side-channel attacks because they are not actually being used for security.

This surprising use of AES leads to the question whether Go programs in general suffer an observable performance penalty on the Pi running in 64-bit mode due to lack of sufficiently-optimised AES assembly routines. If so, it could be even more important to offer a Pi 4B with 8GB RAM as a bounty to fix such things.

Heater
Posts: 13360
Joined: Tue Jul 17, 2012 3:02 pm

Re: Why moving to 64bit?

Sun Aug 18, 2019 9:09 pm

ejolson,

Encryption hardware in garbage collection - that is one of the weirdest concepts I have heard for a long time.

Do they really mean no hash collisions possible? Or is it just that there might be one collision in all the runs of all the Go programs ever in the age of the universe?

It's been a few years since I looked at Go. At the time I stopped looking when I noticed how stuttery Go was at handling some XML data streams we had going. So much so that I ended up using node.js to do the job, about the same speed but much easier to write.

I read around a bit and I found some article about Go's GC. Seemed it did not like 32 bit memory spaces much at the time. There was a bug report about it. Guess what we were using...

Looks like they have made a lot of progress since then. Time to go and give Go another go!
Memory in C++ is a leaky abstraction.

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 23688
Joined: Sat Jul 30, 2011 7:41 pm

Re: Why moving to 64bit?

Sun Aug 18, 2019 9:26 pm

ejolson wrote:
Sun Aug 18, 2019 7:26 pm
Heater wrote:
Fri Aug 16, 2019 5:07 pm
I pretty much agree with everyone's comments about security.
At first glance it might appear that fast AES encryption on a 64-bit platform is only useful for delivering secure web pages, file sharing and setting up virtual-private networks.

I was just reading here about the garbage collection algorithms used in the Go programming language. Currently a unique hash is computed using AES to avoid a write barrier that would otherwise slow performance. This works well because many modern CPUs have AES instructions in hardware. Note that in this usage case, there is no need for the AES computations to be resistant against side-channel attacks because they are not actually being used for security.

This surprising use of AES leads to the question whether Go programs in general suffer an observable performance penalty on the Pi running in 64-bit mode due to lack of sufficiently-optimised AES assembly routines. If so, it could be even more important to offer a Pi 4B with 8GB RAM as a bounty to fix such things.
What 8GB Pi4? There isn't one.
Principal Software Engineer at Raspberry Pi (Trading) Ltd.
Contrary to popular belief, humorous signatures are allowed. Here's an example...
"My grief counseller just died, luckily, he was so good, I didn't care."

ejolson
Posts: 3580
Joined: Tue Mar 18, 2014 11:47 am

Re: Why moving to 64bit?

Sun Aug 18, 2019 10:31 pm

jamesh wrote:
Sun Aug 18, 2019 9:26 pm
What 8GB Pi4? There isn't one.
I thought at least one was made in order to pass the compliance testing, which lists 8GB as being certified. Are you sure there isn't an 8GB model sitting on your desk?

takyon
Posts: 36
Joined: Wed Jul 24, 2019 6:05 am

Re: Why moving to 64bit?

Sun Aug 18, 2019 10:34 pm

ejolson wrote:
Sun Aug 18, 2019 10:31 pm
jamesh wrote:
Sun Aug 18, 2019 9:26 pm
What 8GB Pi4? There isn't one.
I thought at least one was made in order to pass the compliance testing, which lists 8GB as being certified. Are you sure there isn't an 8GB model sitting on your desk?
They say that was a misprint in the user manual, and there are other reasons an 8GB version is unlikely.

I wouldn't be surprised if there isn't an 8GB model until RasPi 6.

Still, it's a good meme. The 8GBRasPi4B+ 👍

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 23688
Joined: Sat Jul 30, 2011 7:41 pm

Re: Why moving to 64bit?

Sun Aug 18, 2019 10:38 pm

ejolson wrote:
Sun Aug 18, 2019 10:31 pm
jamesh wrote:
Sun Aug 18, 2019 9:26 pm
What 8GB Pi4? There isn't one.
I thought at least one was made in order to pass the compliance testing, which lists 8GB as being certified. Are you sure there isn't an 8GB model sitting on your desk?
No 8GB Pi4 on my desk. Never seen one.
Principal Software Engineer at Raspberry Pi (Trading) Ltd.
Contrary to popular belief, humorous signatures are allowed. Here's an example...
"My grief counseller just died, luckily, he was so good, I didn't care."

ejolson
Posts: 3580
Joined: Tue Mar 18, 2014 11:47 am

Re: Why moving to 64bit?

Sun Aug 18, 2019 10:38 pm

takyon wrote:
Sun Aug 18, 2019 10:34 pm
ejolson wrote:
Sun Aug 18, 2019 10:31 pm
jamesh wrote:
Sun Aug 18, 2019 9:26 pm
What 8GB Pi4? There isn't one.
I thought at least one was made in order to pass the compliance testing, which lists 8GB as being certified. Are you sure there isn't an 8GB model sitting on your desk?
They say that was a misprint in the user manual, and there are other reasons an 8GB version is unlikely.

I wouldn't be surprised if there isn't an 8GB model until RasPi 6.

Still, it's a good meme. The 8GBRasPi4B+ 👍
I think it makes for an even better bounty prize if it doesn't generally exist.

Andyroo
Posts: 4502
Joined: Sat Jun 16, 2018 12:49 am
Location: Lincs U.K.

Re: Why moving to 64bit?

Sun Aug 18, 2019 10:50 pm

jamesh wrote:
Sun Aug 18, 2019 10:38 pm
...
No 8GB Pi4 on my desk. Never seen one.
It’s locked in his drawer folks and he has only a keyboard / mouse and screen visible 8-)
Need Pi spray - these things are breeding in my house...

jdonald
Posts: 413
Joined: Fri Nov 03, 2017 4:36 pm

Re: Why moving to 64bit?

Mon Aug 19, 2019 3:03 am

Here are some decompression benchmarks. While programs like gzip or lzma don't seem to have a standard built-in test like openssl, I get pretty consistent results with time unlzma -c filename.lzma > /dev/null.

Raspbian (ARMv6):

Code:

[email protected]:~ $ time unlzma -c ~/70_mb_image.lzma -c > /dev/null

real	0m11.360s
user	0m11.276s
sys	0m0.084s

Debian armhf chroot (ARMv7 maximum compatibility):

Code:

(pi32)[email protected]:~ $ time unlzma -c ~/70_mb_image.lzma -c > /dev/null

real	0m11.115s
user	0m11.037s
sys	0m0.076s

Debian armhf chroot Cortex-A72 tuned (ARMv7 -march=armv8-a+crc+simd -mtune=cortex-a72 -mfpu=neon-fp-armv8):

Code:

(pi32)[email protected]:~/xz-utils-5.2.4/debian/normal-build $ time LD_LIBRARY_PATH=src/liblzma/.libs unlzma -c ~/70_mb_image.lzma -c > /dev/null

real	0m10.451s
user	0m10.401s
sys	0m0.048s

Debian arm64 chroot:

Code:

(pi64)[email protected]:~ $ time unlzma -c ~/70_mb_image.lzma -c > /dev/null

real	0m9.451s
user	0m9.330s
sys	0m0.120s

Debian arm64 chroot Cortex-A72 tuned (ARMv8 -march=armv8-a+crc -mtune=cortex-a72):

Code:

(pi64)[email protected]:~/xz-utils-5.2.4/debian/normal-build $ time LD_LIBRARY_PATH=src/liblzma/.libs unlzma -c ~pi/70_mb_image.lzma -c > /dev/null

real	0m9.348s
user	0m9.232s
sys	0m0.116s
So for LZMA decompression, going to ARMv7 alone improves performance by only 2%, Cortex-A72 tuning (still 32-bit) bumps that up to 9%, 64-bit out-of-the-box gets 19%, 64-bit tuned for the Pi 4 reaches 21%. (All speedup percentages relative to the ARMv6 baseline.) It doesn't seem to matter whether the test archive is 70 MB or a gigabyte.

This is in line with jerrm's earlier result where going from the Raspbian userland to 64-bit Debian took a zbackup restore from 103 minutes down to 91 minutes (13% speedup).

ejolson
Posts: 3580
Joined: Tue Mar 18, 2014 11:47 am

Re: Why moving to 64bit?

Thu Aug 22, 2019 1:25 am

Here are the openssl results for a Pi 4B running this 64-bit Gentoo image.

Code:

$ openssl speed -evp aes-256-gcm
Doing aes-256-gcm for 3s on 16 size blocks: 6665953 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 64 size blocks: 1789838 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 256 size blocks: 456301 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 1024 size blocks: 114937 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 8192 size blocks: 14443 aes-256-gcm's in 3.00s
OpenSSL 1.0.2s  28 May 2019
built on: reproducible build, date unspecified
options:bn(64,64) rc4(ptr,char) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(ptr) 
compiler: aarch64-unknown-linux-gnu-gcc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -Wall -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -march=armv8-a+crc -mtune=cortex-a72 -O2 -pipe -fno-strict-aliasing -Wa,--noexecstack
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-gcm      35551.75k    38183.21k    38937.69k    39231.83k    39439.02k
I've added this result to the previous table and obtained

Code:

ARM1176JZF-S 32-bit 700MHz noAES (Pi B+ Raspbian):
aes-256-gcm       5283.77k     6570.91k     6961.58k     7102.24k     7099.73k
Cortex-A53 32-bit 1.4GHz noAES (Pi 3B+ Raspbian):
aes-256-gcm      19046.13k    22753.49k    24055.64k    24405.67k    24510.46k
Cortex-A72 32-bit 1.5GHz noAES (Pi 4B Raspbian):
aes-256-gcm      27712.91k    30615.40k    31924.65k    32312.51k    32328.36k
Cortex-A72 64-bit 1.5GHz noAES (Pi 4B Gentoo):
aes-256-gcm      35551.75k    38183.21k    38937.69k    39231.83k    39439.02k
Cortex-A53 64-bit 1.4GHz AES (NanoPi T3 Ubuntu):
aes-256-gcm      85789.11k   216272.58k   363160.92k   449810.09k   480291.50k
Cortex-A57 64-bit 2GHz AES (Jetson TX2 Ubuntu):
aes-256-gcm     182532.45k   451431.66k   628317.10k   733904.90k   767257.26k
Denver-2 64-bit 1.8GHz AES (Jetson TX2 Ubuntu):
aes-256-gcm     225900.97k   518809.43k   820897.28k   982568.28k  1050839.72k

jdonald
Posts: 413
Joined: Fri Nov 03, 2017 4:36 pm

Re: Why moving to 64bit?

Thu Aug 22, 2019 4:50 am

This result running 5% faster than Debian arm64 could be due to any of the following differences:
* -march=armv8-a+crc -mtune=cortex-a72
* gcc 9.1 (vs gcc 8.3)
* openssl 1.0.2s (vs 1.1.1c)

On Gentoo I no longer see any performance difference with OPENSSL_armcap=0. It appears the bug with ARMV7_NEON mysteriously causing a slowdown is not present in this version.

Of course it's still rather crippled by lacking a -DAES_ASM implementation, much slower than the Debian armhf configuration (gcc 8.3 and no -mtune).

Cob
Posts: 11
Joined: Tue Mar 05, 2013 2:03 am

Re: Why moving to 64bit?

Thu Aug 22, 2019 8:50 am

Manjaro 19.08 is also now available for the Rpi4 in 64bit, both in desktop and minimal flavours.

https://forum.manjaro.org/t/manjaro-arm ... ased/99031


Being based on Arch Linux, it's quite a lot easier to work with than Gentoo.

pica200
Posts: 138
Joined: Tue Aug 06, 2019 10:27 am

Re: Why moving to 64bit?

Thu Aug 22, 2019 10:19 am

This is what I have been using for the past week. Believe it or not, Firefox was already working better without a GPU driver than ESR on Raspbian. On top of that, now that the GPU driver has landed in the kernel (you need to rename the fbturbo config), Firefox even works with acceleration force-enabled, while doing the same on Raspbian with ESR crashed all tabs.

The disadvantage right now is that the GPU driver doesn't give nearly as smooth graphics on 64 bits as on 32 bits, because RPiF/T doesn't give a damn about the former. There is noticeable lag.
