plugwash
Forum Moderator
Posts: 3614
Joined: Wed Dec 28, 2011 11:45 pm

Re: neon discussion, FAO shiftplusone

Thu Mar 31, 2016 4:56 pm

So I cooked up a small test program: it basically does a 1048576-point FFT 128 times so I can time the result.
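A minimal sketch of this kind of timing harness, for illustration only. The actual test program used fftw3 and is not shown in the thread; the Python below swaps in a small recursive radix-2 FFT so it runs without any library, and the names `fft` and `time_fft` are hypothetical. The defaults are scaled down because a pure-Python FFT at 1048576 points would take far too long, so the absolute timings say nothing about fftw3 or NEON.

```python
import cmath
import time


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    half = n // 2
    out = [0j] * n
    for k in range(half):
        # Twiddle factor times the odd-indexed half of the transform.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out


def time_fft(points=1024, repeats=8):
    """Return wall-clock seconds for `repeats` transforms of `points` samples."""
    if points < 1 or points & (points - 1):
        raise ValueError("points must be a power of two")
    data = [complex(i % 7, 0.0) for i in range(points)]  # arbitrary test signal
    start = time.perf_counter()
    for _ in range(repeats):
        fft(data)
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"{time_fft():.3f} s for 8 x 1024-point FFTs")
```

Running the same script on two boards and comparing the printed times gives the same kind of single-threaded comparison, just with a much slower transform.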

Since fftw3 uses in-program detection rather than ld.so hwcaps, I needed to test on both the Pi 1 and the Pi 2. What shocked me was that even without NEON enabled, and even given that this was a single-threaded test, the Pi 2 was about 3 times faster than the Pi 1.
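To illustrate what "in-program detection" means here: the library decides at run time whether the CPU has NEON, rather than the dynamic linker picking a NEON build of the library. The sketch below is not fftw3's detection code. It is a hypothetical Python helper (`has_neon`) that looks for `neon` in the `Features` line that 32-bit ARM Linux kernels expose in /proc/cpuinfo, and the two sample strings are illustrative, not captured from the boards in this thread.

```python
def has_neon(cpuinfo_text):
    """Return True if any 'Features' line in the given cpuinfo text lists 'neon'."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            if "neon" in value.split():
                return True
    return False


# Illustrative samples only (assumed typical output, not taken from this thread).
PI1_LIKE = (
    "model name : ARMv6-compatible processor rev 7 (v6l)\n"
    "Features : half thumb fastmult vfp edsp java tls\n"
)
PI2_LIKE = (
    "model name : ARMv7 Processor rev 5 (v7l)\n"
    "Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt\n"
)

if __name__ == "__main__":
    print(has_neon(PI1_LIKE), has_neon(PI2_LIKE))
    try:
        with open("/proc/cpuinfo") as f:
            print("this machine:", has_neon(f.read()))
    except OSError:
        print("no /proc/cpuinfo here")
```

Because the check happens inside the program, one binary serves both boards, which is why the test had to be run on each board rather than once per library flavour.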

plugwash
Forum Moderator
Posts: 3614
Joined: Wed Dec 28, 2011 11:45 pm

Re: neon discussion, FAO shiftplusone

Sun Apr 03, 2016 3:06 pm

volk also seems to have had NEON disabled; I'm not sure why I missed it in the list at the start of this thread.

plugwash
Forum Moderator
Posts: 3614
Joined: Wed Dec 28, 2011 11:45 pm

Re: neon discussion, FAO shiftplusone

Sun Apr 03, 2016 3:10 pm

OK, it looks like the only real user of volk is gnuradio, so I'm not sure whether it's worth further effort.

plugwash
Forum Moderator
Posts: 3614
Joined: Wed Dec 28, 2011 11:45 pm

Re: neon discussion, FAO shiftplusone

Sun Apr 03, 2016 3:13 pm

Further investigation shows that volk is stretch only.

Claggy
Posts: 16
Joined: Sun Jan 26, 2014 3:15 pm

Re: neon discussion, FAO shiftplusone

Tue Apr 05, 2016 10:30 pm

plugwash wrote: So I cooked up a small test program: it basically does a 1048576-point FFT 128 times so I can time the result.

Since fftw3 uses in-program detection rather than ld.so hwcaps, I needed to test on both the Pi 1 and the Pi 2. What shocked me was that even without NEON enabled, and even given that this was a single-threaded test, the Pi 2 was about 3 times faster than the Pi 1.
I saw fftw 3.3.4 had been updated, and have rebuilt the setiathome v8 app at rev 3433. I am at present running a bench on my Pi 2 with the Lunatics Seti v8 test workunits.
When I get a spare moment, I'll do it on my Pi 3 and Pi 1B, although the Pi 1B still suffers from the 'Internal error: Oops - undefined instruction: 0 [#2] PREEMPT ARM' problem.

Claggy

Claggy
Posts: 16
Joined: Sun Jan 26, 2014 3:15 pm

Re: neon discussion, FAO shiftplusone

Wed Apr 06, 2016 11:02 pm

The bench has finally completed. There's up to a 7% increase in performance from using the NEON-detecting fftw 3.3.4 (and fast math), depending on the telescope angle range. The r3236 app didn't use fast math (I compiled it myself), the stock 8.02 did, as I understand it (it was compiled by someone else), and the r3433 one did:

KWSN-Linux-MBbench v2.1.08
Running on raspberrypi at Tue 05 Apr 2016 21:18:58 UTC
----------------------------------------------------------------
Starting benchmark run...
----------------------------------------------------------------
Listing wu-file(s) in /testWUs :
PG0009_v8.wu
PG0444_v8.wu
PG1327_v7.wu
PG1327_v8.wu

Listing executable(s) in /APPS :
setiathome_8.02_arm-unknown-linux-gnueabihf
setiathome-8.0r3433.armv7l-unknown-linux-gnueabihf

Listing executable in /REF_APPS :
setiathome-8.0r3236.armv7l-unknown-linux-gnueabihf
----------------------------------------------------------------
Current WU: PG0009_v8.wu

----------------------------------------------------------------
Skipping default app setiathome-8.0r3236.armv7l-unknown-linux-gnueabihf, displaying saved result(s)
Elapsed Time: ....................... 8076 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome_8.02_arm-unknown-linux-gnueabihf -st -verb -nog
./setiathome_8.02_arm-unknown-linux-gnueabihf -st -verb -nog 8045.87 sec 7977.05 sec 54.32 sec
Elapsed Time : ...................... 8046 seconds
Speed compared to default : ......... 100 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.98%

----------------------------------------------------------------
Running app with command : .......... setiathome-8.0r3433.armv7l-unknown-linux-gnueabihf -st -verb -nog
./setiathome-8.0r3433.armv7l-unknown-linux-gnueabihf -st -verb -nog 7883.00 sec 7809.93 sec 55.79 sec
Elapsed Time : ...................... 7883 seconds
Speed compared to default : ......... 102 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.98%

----------------------------------------------------------------
Done with PG0009_v8.wu

====================================================================
Current WU: PG0444_v8.wu

----------------------------------------------------------------
Skipping default app setiathome-8.0r3236.armv7l-unknown-linux-gnueabihf, displaying saved result(s)
Elapsed Time: ....................... 8874 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome_8.02_arm-unknown-linux-gnueabihf -st -verb -nog
./setiathome_8.02_arm-unknown-linux-gnueabihf -st -verb -nog 8469.55 sec 8393.75 sec 54.89 sec
Elapsed Time : ...................... 8469 seconds
Speed compared to default : ......... 104 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.97%

----------------------------------------------------------------
Running app with command : .......... setiathome-8.0r3433.armv7l-unknown-linux-gnueabihf -st -verb -nog
./setiathome-8.0r3433.armv7l-unknown-linux-gnueabihf -st -verb -nog 8238.09 sec 8159.51 sec 53.89 sec
Elapsed Time : ...................... 8238 seconds
Speed compared to default : ......... 107 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.98%

----------------------------------------------------------------
Done with PG0444_v8.wu

====================================================================
Current WU: PG1327_v7.wu

----------------------------------------------------------------
Running default app with command :... setiathome-8.0r3236.armv7l-unknown-linux-gnueabihf -st -verb -nog
./setiathome-8.0r3236.armv7l-unknown-linux-gnueabihf -st -verb -nog 10602.89 sec 10453.22 sec 129.63 sec
Elapsed Time: ....................... 10603 seconds

----------------------------------------------------------------
Running app with command : .......... setiathome_8.02_arm-unknown-linux-gnueabihf -st -verb -nog
./setiathome_8.02_arm-unknown-linux-gnueabihf -st -verb -nog 10473.90 sec 10320.13 sec 129.88 sec
Elapsed Time : ...................... 10474 seconds
Speed compared to default : ......... 101 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.95%

----------------------------------------------------------------
Running app with command : .......... setiathome-8.0r3433.armv7l-unknown-linux-gnueabihf -st -verb -nog
./setiathome-8.0r3433.armv7l-unknown-linux-gnueabihf -st -verb -nog 9905.77 sec 9753.34 sec 127.35 sec
Elapsed Time : ...................... 9906 seconds
Speed compared to default : ......... 107 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.95%

----------------------------------------------------------------
Done with PG1327_v7.wu

====================================================================
Current WU: PG1327_v8.wu

----------------------------------------------------------------
Skipping default app setiathome-8.0r3236.armv7l-unknown-linux-gnueabihf, displaying saved result(s)
Elapsed Time: ....................... 10414 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome_8.02_arm-unknown-linux-gnueabihf -st -verb -nog
./setiathome_8.02_arm-unknown-linux-gnueabihf -st -verb -nog 10410.35 sec 10251.57 sec 129.66 sec
Elapsed Time : ...................... 10411 seconds
Speed compared to default : ......... 100 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.95%

----------------------------------------------------------------
Running app with command : .......... setiathome-8.0r3433.armv7l-unknown-linux-gnueabihf -st -verb -nog
./setiathome-8.0r3433.armv7l-unknown-linux-gnueabihf -st -verb -nog 10076.89 sec 9925.13 sec 129.40 sec
Elapsed Time : ...................... 10077 seconds
Speed compared to default : ......... 103 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.95%

----------------------------------------------------------------
Done with PG1327_v8.wu

====================================================================
Hosts CPU data ...
model name : ARMv7 Processor rev 5 (v7l)

Done with Benchmark run! Removing temporary files!
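The "Speed compared to default" figures can be reproduced from the elapsed times in the log. A small sketch, assuming the benchmark script divides the reference (r3236) time by the tested app's time and truncates to a whole percent. That formula is inferred from the numbers above, not taken from the script, and `speed_percent` is a hypothetical name.

```python
def speed_percent(reference_seconds, app_seconds):
    """Speed relative to the reference app, truncated to a whole percent."""
    return reference_seconds * 100 // app_seconds


# (work unit, r3236 reference, stock 8.02, r3433) elapsed seconds from the log
RUNS = [
    ("PG0009_v8", 8076, 8046, 7883),
    ("PG0444_v8", 8874, 8469, 8238),
    ("PG1327_v7", 10603, 10474, 9906),
    ("PG1327_v8", 10414, 10411, 10077),
]

if __name__ == "__main__":
    for name, ref, stock, r3433 in RUNS:
        print(
            f"{name}: stock {speed_percent(ref, stock)} %, "
            f"r3433 {speed_percent(ref, r3433)} %"
        )
```

With truncation, every percentage matches the log, including the 104 % for the stock app on PG0444_v8, which rounding to nearest would turn into 105 %. The best r3433 results (107 % on PG0444_v8 and PG1327_v7) are where the "up to 7%" comes from.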

Thanks. :)

Claggy

plugwash
Forum Moderator
Posts: 3614
Joined: Wed Dec 28, 2011 11:45 pm

Re: neon discussion, FAO shiftplusone

Thu Apr 07, 2016 5:57 pm

libvpx went smoothly; it should hit the repo in the next update run.

Rascas
Posts: 681
Joined: Tue Mar 11, 2014 6:18 pm
Location: Porto, Portugal

Re: neon discussion, FAO shiftplusone

Wed Apr 13, 2016 8:48 am

Can you do the same for libjpeg-turbo?

cjan
Posts: 843
Joined: Sun May 06, 2012 12:00 am

Re: neon discussion, FAO shiftplusone

Thu Apr 14, 2016 10:11 pm

plugwash wrote: libvpx went smoothly; it should hit the repo in the next update run.
After updating twice, I still haven't received a libvpx update?

plugwash
Forum Moderator
Posts: 3614
Joined: Wed Dec 28, 2011 11:45 pm

Re: neon discussion, FAO shiftplusone

Fri Apr 15, 2016 12:47 am

Hmm, what version do you have?

cjan
Posts: 843
Joined: Sun May 06, 2012 12:00 am

Re: neon discussion, FAO shiftplusone

Fri Apr 15, 2016 12:54 am

plugwash wrote: Hmm, what version do you have?
libvpx-dev_1.3.0-3+rvt

plugwash
Forum Moderator
Posts: 3614
Joined: Wed Dec 28, 2011 11:45 pm

Re: neon discussion, FAO shiftplusone

Fri Apr 15, 2016 1:06 am

That's the updated version.

plugwash
Forum Moderator
Posts: 3614
Joined: Wed Dec 28, 2011 11:45 pm

Re: neon discussion, FAO shiftplusone

Thu Apr 21, 2016 2:16 pm

Rascas wrote: Can you do the same for libjpeg-turbo?
We never disabled NEON in libjpeg-turbo. Of course that doesn't mean it's working, but if it's not working, it's not because we disabled it.

plugwash
Forum Moderator
Posts: 3614
Joined: Wed Dec 28, 2011 11:45 pm

Re: neon discussion, FAO shiftplusone

Thu May 12, 2016 8:15 pm

Investigating stretch's ffmpeg, it seems to already be using internal NEON detection (no separate NEON flavour) and to be performing better than the NEON-enabled version of wheezy's libav. I conclude that no action is needed there.

plugwash
Forum Moderator
Posts: 3614
Joined: Wed Dec 28, 2011 11:45 pm

Re: neon discussion, FAO shiftplusone

Thu May 12, 2016 9:15 pm

Just forward-ported the x264 changes to stretch. It is using ld.so hwcaps, so I don't think there is any need to run tests, and I will be uploading straight away.
