-
- Posts: 385
- Joined: Fri Apr 12, 2013 9:27 am
- Location: Essex, UK
- Contact: Website
Raspberry Pi Benchmarks
I have converted my Classic Benchmarks to run on the Linux based Raspberry Pi. These are Whetstone, Dhrystone, Linpack and Livermore Loops. The benchmark source code was also compiled and run on a PC running Linux to confirm compatibility. Executable files and source code can be downloaded from the following link. The archive can be unzipped on the Pi and the programs run from a Terminal command line.
http://www.roylongbottom.org.uk/Raspber ... hmarks.zip
For details and example results on ARM and Intel processors see:
http://www.roylongbottom.org.uk/Raspber ... hmarks.htm
Speeds from the Pi's ARMv6 (ARM11) CPU are not as fast as those from ARM Cortex-A9 on recent Android tablets but, as we know, the Pi is usable as a proper desktop computer running Linux.
The Livermore Loops benchmark was used to accept the first supercomputer. So the main bragging rights are:
In 1978, the Cray 1 supercomputer cost $7 Million, weighed 10,500 pounds and had a 115 kilowatt power supply. It was, by far, the fastest computer in the world. The Raspberry Pi costs around $70 (CPU board, case, power supply, SD card), weighs a few ounces, uses a 5 watt power supply and is more than 4.5 times faster than the Cray 1.
My bragging rights are that I developed and ran benchmarks, including Whetstones, on Serial 1 Cray 1.
Roy
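For anyone who wants to play with the Cray 1 comparison, here is a minimal sketch of the arithmetic using only the figures quoted above. The variable names are illustrative, and the power figures are power supply ratings rather than measured consumption, so the per-watt ratio is only a rough indication.

```python
# Figures quoted in the post; variable names are illustrative only.
cray_cost_usd = 7_000_000
pi_cost_usd = 70
cray_power_w = 115_000   # 115 kilowatt power supply rating
pi_power_w = 5           # 5 watt power supply rating
pi_speedup = 4.5         # "more than 4.5 times faster", so a lower bound

cost_ratio = cray_cost_usd / pi_cost_usd
power_ratio = cray_power_w / pi_power_w

# Performance per dollar and per (rated) watt, relative to the Cray 1.
perf_per_dollar_gain = pi_speedup * cost_ratio
perf_per_watt_gain = pi_speedup * power_ratio

print(f"Cost ratio:            {cost_ratio:,.0f} to 1")
print(f"Power rating ratio:    {power_ratio:,.0f} to 1")
print(f"Speed per dollar gain: at least {perf_per_dollar_gain:,.0f} times")
print(f"Speed per watt gain:   at least {perf_per_watt_gain:,.0f} times")
```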
Re: Raspberry Pi Benchmarks
Excellent and detailed information. I wonder how other similarly priced boards do in the same tests, like the BeagleBone Black, cheap TP-Link routers, pcDuinos, etc.
My website: www.ried.cl
Re: Raspberry Pi Benchmarks
Classic Benchmarks Stability Test
A long time ago, the Livermore Loops benchmark produced wrong numeric results on an overclocked CPU. So, a reliability mode was included, with an execution parameter to control running time and additional checks for the correct result of calculations. I ran a 6 minute test at 700 MHz and 1000 MHz, measuring temperatures from a different Terminal. Detailed results are in:
http://www.roylongbottom.org.uk/Raspber ... hmarks.htm
Room temperature was 22.6°C. At 700 MHz the temperature increased from 48.7 to 53.0°C and, at 1000 MHz, by more, from 50.3 to 60.5°C.
Roy
Re: Raspberry Pi Benchmarks
My memory speed benchmark has now been converted to run on the Raspberry Pi with the program and source code included in:
http://www.roylongbottom.org.uk/Raspber ... hmarks.zip
The MemSpeed benchmark measures data reading speeds in MegaBytes per second, carrying out calculations on arrays of cache and RAM data, using double and single precision floating point and integer numbers. Calculations are of the form x = x + s * y (x = x + s + y for integers), x = x + y and x = y. Results are below at normal and overclocked speeds. The overheads of repeatedly running the tests cause variations in speed at the smaller data sizes, but the average overclocked speed gain using L1 cache is 1.41 times, compared with 1.43 times for CPU MHz. Average RAM speed gains are 1.53 times, similar to expectations. A surprise is the L2 cache based data, where the average gain is 1.72 times and some speeds appear to be faster than using L1 cache. For comparison with Intel processors and other ARM based systems see:
http://www.roylongbottom.org.uk/Raspber ... hmarks.htm
Roy
Results in MegaBytes Per Second
Raspberry Pi CPU 700 MHz, Core 400 MHz, SDRAM 400 MHz
Memory x[m]=x[m]+s*y[m] Int+ x[m]=x[m]+y[m] x[m]=y[m]
KBytes Dble Sngl Int32 Dble Sngl Int32 Dble Sngl Int32
Used MB/S MB/S MB/S MB/S MB/S MB/S MB/S MB/S MB/S
8 538 640 930 602 731 1094 1230 465 465 L1
16 568 602 787 602 731 1023 1000 426 507
32 292 256 310 276 262 330 1066 426 547 L2
64 276 238 276 262 238 292 341 269 284
128 189 170 193 182 170 200 222 196 204
256 140 129 142 136 129 144 138 119 124 RAM
512 138 127 138 134 127 144 131 111 119
1024 136 127 138 134 127 144 124 111 119
2048 136 127 138 132 128 144 128 111 121
4096 136 128 138 134 126 144 128 111 119
8192 138 127 138 136 127 144 126 111 119
Max MFLOPS 71 160 38 91
MAX MIPS 725 645 423 320 320
Raspberry Pi CPU 1000 MHz, Core 500 MHz, SDRAM 600 MHz, 6 volts
8 602 640 1185 930 1163 1662 1422 511 761 L1
16 787 930 1292 853 1023 1523 1777 537 761
32 487 426 487 465 426 568 1939 820 1142 L2
64 465 393 465 426 393 511 592 457 508
128 330 310 341 320 301 365 341 301 341
256 208 200 213 204 200 217 196 170 189 RAM
512 204 200 213 200 200 213 196 176 182
1024 213 200 208 200 200 217 196 170 182
2048 204 196 213 204 200 217 196 170 182
4096 204 200 213 200 200 217 196 170 182
8192 204 200 213 200 200 218 204 169 182
Max MFLOPS 98 232 58 145
MAX MIPS 1007 980 667 563 785
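The Max MFLOPS rows appear to follow from the fastest MB/s readings. The sketch below is an inference from the numbers in the tables, not taken from the benchmark source: it assumes both arrays (x and y) are counted as data read, with two floating point operations per element for x[m]=x[m]+s*y[m] and one for x[m]=x[m]+y[m]. The function name and parameters are hypothetical.

```python
def mflops(mb_per_sec, bytes_per_value, flops_per_element, arrays_read=2):
    """Convert a read speed in MB/s into MFLOPS for an array kernel.

    Assumption: every element processed means reading one value from each
    of `arrays_read` arrays, and costs `flops_per_element` operations.
    """
    bytes_per_element = bytes_per_value * arrays_read
    return mb_per_sec / bytes_per_element * flops_per_element


# Fastest readings from the 700 MHz and 1000 MHz tables above:
# (label, MB/s, bytes per value, flops per element, Max MFLOPS in table)
cases = [
    ("700 MHz  Dble x+s*y", 568, 8, 2, 71),
    ("700 MHz  Sngl x+s*y", 640, 4, 2, 160),
    ("700 MHz  Dble x+y", 602, 8, 1, 38),
    ("700 MHz  Sngl x+y", 731, 4, 1, 91),
    ("1000 MHz Dble x+s*y", 787, 8, 2, 98),
    ("1000 MHz Sngl x+s*y", 930, 4, 2, 232),
    ("1000 MHz Dble x+y", 930, 8, 1, 58),
    ("1000 MHz Sngl x+y", 1163, 4, 1, 145),
]

for label, speed, size, flops, reported in cases:
    estimate = mflops(speed, size, flops)
    print(f"{label:22s} estimated {estimate:7.1f}  reported {reported}")
```

Under those assumptions every estimate lands within one MFLOPS of the reported figure, which suggests the two rows are consistent with each other.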
Re: Raspberry Pi Benchmarks
Latest conversion is BusSpeed with the benchmark and source code included in:
http://www.roylongbottom.org.uk/Raspber ... hmarks.zip
This benchmark is designed to identify reading data in bursts over buses, with measurements covering caches and RAM. The program starts by reading a word (4 bytes) with an address increment of 32 words (128 bytes) before reading another word. The increment is reduced by half on successive tests, until all data is read.
The following shows results on the Pi with default speed settings. Variations indicate that data is read in bursts from caches and RAM. For the latter, 8 word or 32 byte bursts are suggested, that is, 8 transfers of 4 bytes. An estimate of maximum speed is 8 x 34 = 272 MB/second, a long way from the specification. It is the same story on Android/ARM based devices, but throughput can increase linearly with the number of CPU cores being used. Intel based PCs do not suffer to anywhere near the same extent (but at what cost?). For further details and comparisons see:
http://www.roylongbottom.org.uk/Raspber ... hmarks.htm
Roy
Raspberry Pi CPU 700 MHz, Core 400 MHz, SDRAM 400 MHz
Maximum speed 400 x 2 (DDR) x 4 Width = 3.2 GB/sec
BusSpeed 32 Bit V1.1 Wed May 22 15:28:01 2013
Reading Speed 4 Byte Words in MBytes/Second
 Memory   Inc32   Inc16    Inc8    Inc4    Inc2    Read
 KBytes   Words   Words   Words   Words   Words     All
     16     290     304     568     984    1125    1142  L1
     32     133     116     131     133     225     465  L2
     64     116      98     116     109     192     409
    128      60      54      62      68     126     273
    256      34      34      34      43      88     192  RAM
    512      34      34      34      45      91     200
   1024      34      31      34      45      91     181
   4096      32      33      33      45      87     183
  16384      32      32      34      44      83     186
  65536      34      32      34      44      88     186
End of test Wed May 22 15:28:13 2013
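The burst reasoning in the post can be sketched as a small model. It assumes, as suggested above, that RAM delivers 8 word (32 byte) bursts, so a read at an increment of 8 words or more uses just one word of each burst. The function names are hypothetical and the model ignores everything else (latency, cache effects); it only reproduces the 8 x 34 = 272 MB/second estimate and compares it with the 3.2 GB/sec figure from the results header.

```python
BURST_WORDS = 8  # assumption from the post: 8 words (32 bytes) per burst


def words_used_per_burst(increment_words, burst_words=BURST_WORDS):
    """How many words of each burst the benchmark actually reads."""
    return max(1, burst_words // increment_words)


def implied_bus_mb_per_sec(measured_mb_per_sec, increment_words):
    """Scale the measured speed up by the unused part of each burst."""
    used = words_used_per_burst(increment_words)
    return measured_mb_per_sec * BURST_WORDS / used


# 65536 KB (RAM) row of the table: increment in words -> measured MB/s
ram_row = {32: 34, 16: 32, 8: 34, 4: 44, 2: 88, 1: 186}

for increment, speed in ram_row.items():
    implied = implied_bus_mb_per_sec(speed, increment)
    print(f"Inc{increment:<3d} measured {speed:4d} MB/s, implied {implied:6.1f} MB/s")

# Theoretical peak from the results header: 400 MHz x 2 (DDR) x 4 bytes wide
peak_mb_per_sec = 400 * 2 * 4
print(f"Theoretical peak {peak_mb_per_sec} MB/s")
```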
Re: Raspberry Pi Benchmarks
The newest conversion is a Java version of the Whetstone Benchmark. For details of both of these see latest:
http://www.roylongbottom.org.uk/Raspber ... hmarks.htm
This benchmark is heavily dependent on floating point calculations. As many will know, the Java compiler produces hardware independent .class files that need a Java RunTime Environment (JRE) to convert them into code that runs on specific hardware. Programs can be compiled to run off-line and, with small changes, to run on-line as Applets via an HTML page. This Whetstone Benchmark is provided in both formats.
Initial research suggested installing JRE 6 (icedtea-plugin). This ran the benchmarks, but slowly. That also applied when JRE 7 was installed. Further searching indicated that JRE 8 was needed for hard float and this was much faster. Finally, the following provided ways of making JRE 6 and 7 faster (using JamVM):
http://www.raspberrypi.org/phpBB3/viewt ... 81&t=45132
Following are results of the 8 different tests and overall MWIPS rating, including using maximum overclocking. There is not much difference in overall MWIPS on using JamVM and the original is faster on the COS/EXP tests, but much slower on the other tests. Similarly, JRE 8 averages about four times faster than JRE 7 with JamVM, if the COS/EXP results are excluded. The moral is don’t compare performance using a single number.
Roy
Version     JRE  MWIPS  ------MFLOPS-------   -------------MOPS---------------
                            1      2      3    COS    EXP  FIXPT     IF  EQUAL
Off-line
Original      6   18.3    4.4    6.4    3.3   0.99   0.30    7.9    2.9    2.5
Original      7   18.7    4.3    6.2    3.5   0.98   0.30    7.9    2.9    2.6
JamVM         6   23.4    9.4   10.0    8.9   0.69   0.23   17.8    8.1    5.4
JamVM         7   25.7   12.3   11.7    9.7   0.74   0.24   23.6   10.9    6.2
Original      8   47.8   49.4   47.8   26.7   1.19   0.36   93.3   27.8   40.0
1000 MHz      8   75.1   71.4   69.2   39.8   2.10   0.53  134.9   40.3   57.8
On-line
JamVM         6   25.3    9.9   10.3    7.7   0.63   0.25   17.9    8.5    5.1
1000 MHz      6   39.0   14.1   13.7   11.2   1.30   0.40   25.4   12.2    7.2
            MHz
C Version   700  270.5   97.8  100.8   85.7   5.90   2.70  425.3  698.6  499.0
V7-A9 Java 1000  286.5   51.8   85.2   63.6  13.10   5.30  176.1   68.6   35.0
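The "about four times faster" figure can be checked from the off-line rows of the table. This is a minimal sketch using the JamVM JRE 7 and Original JRE 8 results with the COS and EXP columns left out, as described in the post; the list names are illustrative.

```python
from statistics import mean

# Columns kept: MFLOPS 1, 2, 3 and MOPS FIXPT, IF, EQUAL (COS/EXP excluded).
tests = ["MFLOPS1", "MFLOPS2", "MFLOPS3", "FIXPT", "IF", "EQUAL"]
jamvm_jre7 = [12.3, 11.7, 9.7, 23.6, 10.9, 6.2]
jre8 = [49.4, 47.8, 26.7, 93.3, 27.8, 40.0]

ratios = [new / old for old, new in zip(jamvm_jre7, jre8)]
average_gain = mean(ratios)

# The single-number MWIPS rating tells a very different story.
mwips_gain = 47.8 / 25.7

for name, ratio in zip(tests, ratios):
    print(f"{name:8s} {ratio:5.2f} times")
print(f"Average  {average_gain:5.2f} times")
print(f"MWIPS    {mwips_gain:5.2f} times")
```

The per-test gains average close to four, while the overall MWIPS rating improves by less than two times, which is the point about not comparing performance using a single number.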
Re: Raspberry Pi Benchmarks
Cacao JVM is very fast compared to the default one. Can you compare that jvm too? (there are like 3 available jvms but cacao is the fastest in my tests, almost 2x more speed)
My website: www.ried.cl
Re: Raspberry Pi Benchmarks
eried wrote: Cacao JVM is very fast compared to the default one. Can you compare that jvm too? (there are like 3 available jvms but cacao is the fastest in my tests, almost 2x more speed)
How do I install/enable Cacao - I tried a suggested command earlier but it was not recognised.
Re: Raspberry Pi Benchmarks
RoyLongbottom wrote: How do I install/enable Cacao - I tried a suggested command earlier but it was not recognised.
Check apt-cache search jvm | grep cacao
I tested an image recognizer:
pi@raspberrypi ~/Desktop/Mono/MatriculasTest2 $ time java -jar javaanpr.jar -recognize -i test_001.jpg
PP587A0
real 0m31.387s
user 0m30.670s
sys 0m0.290s
With jamvm:
pi@raspberrypi ~/Desktop/Mono/MatriculasTest2 $ time java -jamvm -jar javaanpr.jar -recognize -i test_001.jpg
PP587A0
real 0m18.316s
user 0m17.750s
sys 0m0.220s
With cacao vm:
pi@raspberrypi ~/Desktop/Mono/MatriculasTest2 $ time java -cacao -jar javaanpr.jar -recognize -i test_001.jpg
PP587A0
real 0m14.687s
user 0m11.700s
sys 0m0.750s
With server vm:
pi@raspberrypi ~/Desktop/Mono/MatriculasTest2 $ time java -server -jar javaanpr.jar -recognize -i test_001.jpg
PP587A0
real 0m28.616s
user 0m27.910s
sys 0m0.290s
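To turn the timings above into speedups, here is a small sketch that parses the "real" values printed by time and compares each VM with the default one. The function name and the dictionary are illustrative; only the four timings come from the runs above.

```python
import re


def parse_real_seconds(text):
    """Parse a bash `time` value such as '0m31.387s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised time format: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# "real" times copied from the four runs above.
real_times = {
    "default": "0m31.387s",
    "jamvm": "0m18.316s",
    "cacao": "0m14.687s",
    "server": "0m28.616s",
}

baseline = parse_real_seconds(real_times["default"])
speedups = {
    name: baseline / parse_real_seconds(value)
    for name, value in real_times.items()
}

for name, speedup in speedups.items():
    print(f"{name:8s} {speedup:4.2f} times the default VM speed")
```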

My website: www.ried.cl
Re: Raspberry Pi Benchmarks
RoyLongbottom wrote: Latest conversion is BusSpeed with the benchmark and source code included in: http://www.roylongbottom.org.uk/Raspber ... hmarks.zip
This is probably down to the L2 cache not having a stride-detecting prefetcher. Any benchmark that traverses memory in order would benefit from one.
Note that we have optimised the various memxxx() functions for the ARMv6 architecture, which would be worth experimenting with.
Principal Software Engineer at Raspberry Pi (Trading) Ltd.
Contrary to popular belief, humorous signatures are allowed.
I've been saying "Mucho" to my Spanish friend a lot more lately. It means a lot to him.
Re: Raspberry Pi Benchmarks
eried wrote: Cacao JVM is very fast compared to the default one. Can you compare that jvm too? (there are like 3 available jvms but cacao is the fastest in my tests, almost 2x more speed)
Herewith results. Yes, Cacao is much faster than JamVM, except for the COS/EXP tests, but not as fast as JRE 8.
Version     JRE  MWIPS  ------MFLOPS-------   -------------MOPS---------------
                            1      2      3    COS    EXP  FIXPT     IF  EQUAL
Original    6&7   18.3    4.4    6.4    3.3   0.99   0.30    7.9    2.9    2.5
JamVM         6   23.4    9.4   10.0    8.9   0.69   0.23   17.8    8.1    5.4
Cacao         6   32.7   25.5   36.7   25.7   0.76   0.24   55.2   28.9   25.8
Original      8   47.8   49.4   47.8   26.7   1.19   0.36   93.3   27.8   40.0
Re: Raspberry Pi Benchmarks
RoyLongbottom wrote: Herewith results. Yes, Cacao is much faster than JamVM, except for the COS/EXP tests, but not as fast as JRE 8.
Excellent! JRE 8 was installed from the official instructions, right? Is the 'oracle' one official?
My website: www.ried.cl
Re: Raspberry Pi Benchmarks
> Excellent! JRE8 was installed from official instructions right? is the 'oracle' official?<
JRE 8 was installed using commands provided in the following:
http://raspberrypi.stackexchange.com/qu ... spberry-pi
I was a little concerned about the License Agreement which seemed to imply that results should not be published.
Re: Raspberry Pi Benchmarks
JavaDraw is the second Java benchmark arranged to run on the Raspberry Pi. It comes in two flavours to run off-line. The first, JavaDrawPC.class (use command java JavaDrawPC), was produced via Linux Ubuntu and JDK 6. The second, JavaDrawPi.class, was compiled on the Pi using JDK 7 (command javac JavaDrawPi.java). The JDK 6 version can be run (so far) via Windows, Linux and Raspbian using any JRE but the Pi compilation, at least, will not run using Ubuntu with JRE 6. The benchmarks and source code are in:
http://www.roylongbottom.org.uk/Raspber ... hmarks.zip
The benchmark draws simple objects, in numbers ranging from small to rather excessive, to measure drawing performance. The source code that does the drawing is only 125 lines and might be of interest for others to play with.
There are five tests. The first loads two PNG files and moves them about. This is repeated as slow startup time produces misleading performance. The second test adds two moving multi-coloured circles, again from PNG files. The next tests add 200 random small circles (blobs), then 320 long lines, with the last test filling the screen with 4000 random blobs.
Results are below via various JREs, where, at least, the Pi can be as fast as a Nexus 7 tablet. Note that JRE 8 is particularly slow.
http://www.roylongbottom.org.uk/Raspber ... hmarks.htm
Roy
Speed in Frames Per Second (FPS)
PNG PNG +Sweep +200 +320 +4000
Bitmaps Bitmaps Gradient Small Long Small
JRE 1 2 Circles Circles Lines Circles
Pi 700 MHz 6 3.6 12.0 11.9 9.5 5.5 1.8
Pi 700 MHz 6 Cac 0.2 13.6 14.7 12.4 7.2 2.8
Pi 700 MHz 7 2.4 11.7 11.6 9.5 5.6 1.9
Pi 700 MHz 8 0.4 2.7 2.6 1.9 0.8 0.4
Pi 1000 MHz 6 10.1 19.5 19.3 15.9 9.4 3.1
Pi 1000 MHz 6 Cac 8.3 23.1 21.9 18.2 11.0 4.2
Pi 1000 MHz 7 11.1 19.2 18.7 16.2 9.5 3.1
Pi 1000 MHz 8 2.0 4.3 4.2 3.0 1.3 0.6
Nexus 7 1300 MHz 20.4 16.5 14.5 11.3 3.8
Atom 1666 MHz 57.3 83.2 80.1 74.8 53.6 24.5
Core 2 2400 MHz 271.5 360.6 227.7 237.6 205.2 142.5
Cac = Cacao VM
Re: Raspberry Pi Benchmarks
I have now converted my Android OpenGL benchmark to run on RPi. As usual, the benchmarks and source code are in:
http://www.roylongbottom.org.uk/Raspber ... hmarks.zip
The benchmark has four tests - wireframe, shaded, more objects shaded, textured. Four loadings are run with 900, 9000, 18000 and 36000 triangles drawn. The run command allows different window sizes to be used. For results using these and further details of the benchmark, see:
http://www.roylongbottom.org.uk/Raspber ... hmarks.htm
Also provided are some Android Java based results, where the RPi appears to be as fast as the Nexus 7 tablet. An example of RPi results at default MHz follows. According to many posts, it seems that the RPi does not implement VSYNC, which would limit speed to 60 FPS. However, further tests using an absolute minimum of triangles could not better 100 FPS, so that looks like the minimum overhead of using eglSwapBuffers. Without the latter (and without any picture displayed) 370 FPS was produced (see results).
Raspberry Pi OpenGL ES Benchmark 1.1, Thu Jun 13 16:04:35 2013
           --------- Frames Per Second --------
Triangles  WireFrame   Shaded  Shaded+  Textured
     900+     100.22    99.97    86.96     78.91
    9000+      40.04    39.85    29.94     23.28
   18000+      20.27    20.24    17.32     12.82
   36000+      10.19    10.18     9.38      6.88
Screen Pixels 1280 Wide 720 High
End Time Thu Jun 13 16:07:16 2013
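A quick way to read the WireFrame column is to convert it to frame times and triangles per second. This sketch treats the "900+" style labels as plain triangle counts, which is an assumption, and the function names are illustrative. The three heavier loadings come out at a similar triangle rate, while the lightest one sits at about 10 milliseconds per frame, in line with the 100 FPS limit mentioned above.

```python
def frame_time_ms(fps):
    """Milliseconds spent on each frame at a given frame rate."""
    return 1000.0 / fps


def triangles_per_second(triangles, fps):
    """Triangle throughput, assuming `triangles` are drawn every frame."""
    return triangles * fps


# WireFrame column of the results above: triangles -> frames per second
wireframe = {900: 100.22, 9000: 40.04, 18000: 20.27, 36000: 10.19}

for triangles, fps in wireframe.items():
    rate = triangles_per_second(triangles, fps)
    print(f"{triangles:6d} triangles: {frame_time_ms(fps):6.2f} ms/frame, "
          f"{rate:9.0f} triangles/second")
```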
Re: Raspberry Pi Benchmarks
As reported in the following post, I have produced a simple RPi temperature recording program for use with my benchmarks for testing system stability/reliability.
http://www.raspberrypi.org/phpBB3/viewt ... 31&t=47469
I have added run time parameters to my OpenGL benchmark to execute the most exhaustive tests for extended periods and have run this, along with my Livermore Loops Stability tests to see what happens on overclocking. Note that OpenGL by itself generated only about 50% CPU utilisation, providing scope for multitasking. On the day (cooler than yesterday), maximum temperature running the OpenGL program, for 15 minutes at 700 MHz, was 67°C, increasing to 69.7°C by also running the Livermore Loops. With maximum 1 GHz overclocking and OpenGL, 69.1°C was reached.
With OpenGL and Livermore Loops running on the overclocked RPi, it bombed out after 75 seconds, with an illegal instruction and recorded temperature of 72.9°C. Restarting OpenGL repeated an earlier experience by freezing the display. Details can be found in:
http://www.roylongbottom.org.uk/Raspber ... hmarks.htm
and the temperature recording program, along with benchmarks, in the usual
http://www.roylongbottom.org.uk/Raspber ... hmarks.zip
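For readers who just want to watch the temperature while a benchmark runs, here is a minimal sketch of a recorder. It is not the program in the download: it assumes the CPU temperature is exposed in millidegrees Celsius through the Linux sysfs file /sys/class/thermal/thermal_zone0/temp, and the function names, sample count and interval are all hypothetical.

```python
import time
from pathlib import Path

# Assumption: standard Linux thermal zone file, value in millidegrees C.
THERMAL_FILE = Path("/sys/class/thermal/thermal_zone0/temp")


def parse_millidegrees(raw):
    """Convert a sysfs reading such as '53000\\n' into degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_cpu_temp(path=THERMAL_FILE):
    """Return the current temperature, or None if it cannot be read."""
    try:
        return parse_millidegrees(path.read_text())
    except (OSError, ValueError):
        return None


def record(samples, interval_s, reader=read_cpu_temp, sleep=time.sleep):
    """Take `samples` readings, waiting `interval_s` seconds between them."""
    readings = []
    for index in range(samples):
        readings.append(reader())
        if index + 1 < samples:
            sleep(interval_s)
    return readings


if __name__ == "__main__":
    # Run from a second Terminal while the benchmark is executing.
    for temperature in record(samples=3, interval_s=1.0):
        print("n/a" if temperature is None else f"{temperature:.1f} C")
```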
Re: Raspberry Pi Benchmarks
I have now converted my DriveSpeed benchmark to run on the Raspberry Pi. It can measure performance of the main drive plus any USB attached SD cards, USB sticks and disk drives. There are four test procedures:
Test 1 - Write and read three 8 and 16 MB files; Results given in MBytes/second
Test 2 - Write 8 MB, read can be cached in RAM; Results given in MBytes/second
Test 3 - Random write and read 1 KB from 4 to 16 MB; Results are Average time in milliseconds
Test 4 - Write and read 200 files 4 KB to 16 KB; Results in MB/sec, msecs/file and delete seconds.
Example results for the main SD drive are below. A link to the source and execution files, results on external devices, along with details of how to set up the directory paths, are in:
http://www.roylongbottom.org.uk/Raspber ... hmarks.htm
#####################################################
DriveSpeed RasPi 1.0 Tue Jun 25 14:01:02 2013
Current Directory Path: /home/pi/benchmarks/drivespeed
Total MB 14685, Free MB 12737, Used MB 1948
MBytes/Second
MB Write1 Write2 Write3 Read1 Read2 Read3
8 9.10 9.15 9.14 22.68 22.73 22.77
16 9.94 10.26 10.25 22.75 22.78 22.80
Cached
8 43.35 37.41 58.10 157.41 156.62 157.34
Random Read Write
From MB 4 8 16 4 8 16
msecs 0.019 0.019 0.043 9.07 15.32 9.45
200 Files Write Read Delete
File KB 4 8 16 4 8 16 secs
MB/sec 0.09 0.11 0.16 4.31 7.49 11.40
ms/file 46.73 72.37 105.14 0.95 1.09 1.44 0.039
End of test Tue Jun 25 14:02:17 2013
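The MB/sec and ms/file rows of the 200 files test are two views of the same measurement. The sketch below shows the conversion under an assumption inferred from the numbers, not from the source code: file sizes in KB of 1024 bytes and speeds in MB of 1,000,000 bytes. The function name is hypothetical.

```python
def mb_per_sec(file_kb, ms_per_file):
    """Convert a per-file time into a transfer speed.

    Assumption: KB means 1024 bytes and MB means 1,000,000 bytes.
    """
    bytes_per_file = file_kb * 1024
    seconds_per_file = ms_per_file / 1000.0
    return bytes_per_file / 1_000_000 / seconds_per_file


# (file KB, ms/file, MB/sec reported) taken from the table above
write_rows = [(4, 46.73, 0.09), (8, 72.37, 0.11), (16, 105.14, 0.16)]
read_rows = [(4, 0.95, 4.31), (8, 1.09, 7.49), (16, 1.44, 11.40)]

for kind, rows in (("Write", write_rows), ("Read", read_rows)):
    for file_kb, ms, reported in rows:
        estimate = mb_per_sec(file_kb, ms)
        print(f"{kind:5s} {file_kb:2d} KB: estimated {estimate:6.2f} MB/sec, "
              f"reported {reported:5.2f}")
```

Small differences on the read side are to be expected, since the ms/file figures are only given to two decimal places.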
Re: Raspberry Pi Benchmarks
LAN and WiFi Benchmark Programs
The latest benchmark is a variation of the last one (DriveSpeed) designed to measure performance on passing data over Local Area Networks (LANs) and WiFi. There are three programs to do the job, LanSpeed that runs on the Raspberry Pi/ARM CPU, LanSpdx86Lin for Intel/Linux PCs and LanSpdx86Win for Intel/Windows systems. They are all included in
http://www.roylongbottom.org.uk/Raspber ... hmarks.zip
In this case, Linux PCs, Windows PCs and the RasPi are connected via a Router, also providing WiFi access, and via a Windows Workgroup set up. For each system, different procedures or commands are needed to set up the links, log on and enable access. Details of these and various results are in:
http://www.roylongbottom.org.uk/Raspber ... hmarks.htm
Besides demonstrating likely maximum data transfer speeds, the benchmark programs show that there can be significant performance differences, depending on which system is controlling the transfer. Following shows RPi to Windows, Windows to RPi and RPi to Linux, Linux to RPi MBytes per second, writing and reading 16 MB files.
Source  Dest                          MBytes/Second
CPU     CPU/drive   MB  Write1  Write2  Write3   Read1   Read2   Read3
Rpi     Ph Win      16    7.29    8.13    6.78   11.53   11.60   11.58
Ph Win  Rpi         16   11.29   11.18   10.70    4.22    2.70    1.97
Rpi     C2 Lin      16    7.79    7.52    7.84   11.62   11.61   11.66
C2 Lin  Rpi         16    6.53    6.36    6.23    5.58    5.49    6.01
Then average times for writing and reading small files of 4 KB, 8 KB and 16 KB:
                        milliseconds per file
200 Files                  Write                    Read          Delete
File KB                4       8      16       4       8      16    secs
Rpi     Ph Win      6.63    6.83    7.68    3.88    5.07    5.98    0.28
Ph Win  Rpi        14.15   14.21   15.76   10.32   10.52   11.47    1.79
Rpi     C2 Lin      5.74    6.83    8.96    4.87    5.97    6.74    0.60
C2 Lin  Rpi         9.87   10.55   11.73    7.13    7.52    8.44    1.30
Other results provided are for random access, local and distant WiFi, and Gigabit LAN.
Re: Raspberry Pi Benchmarks
The latest additions are MultiThreading Benchmarks, essentially the same as my Android programs described, along with results, in:
http://www.roylongbottom.org.uk/android ... hmarks.htm
All run the benchmarks using 1, 2, 4 and 8 threads. Those that use caches and RAM have data sizes around 12.8 KB, 128 KB and 12.8 MB. Further details and results can be found in
http://www.roylongbottom.org.uk/Raspber ... hmarks.htm
These will be of more use if a Raspberry Pi appears with more than one CPU core. The benchmarks and source codes are available in:
http://www.roylongbottom.org.uk/Raspber ... hmarks.zip
The latter also includes identical code compiled for Intel compatible processors running under Linux.
MP-MFLOPS - measures floating point speed on data from caches and RAM. The first calculations are as used in MemSpeed. Others use more calculations on each data word. Each thread carries out the same calculations but accesses different segments of the data. The result, on cache based calculations, is often performance proportional to the number of cores used.
MP-Whetstone - Multiple threads each run the eight test functions at the same time, but with some dedicated variables. Measured speed is based on the last thread to finish, with Mutex functions used to avoid update conflicts by allowing only one thread at a time to access common data. Again, performance is generally proportional to the number of cores used. There can be some significant differences from the single CPU Whetstone benchmark results on particular tests, due to a different compiler being used.
MP-Dhrystone - This runs multiple copies of the whole program at the same time. Dedicated data arrays are used for each thread but there are numerous other variables that are shared. The latter reduces performance gains via multiple threads and, in some cases, these can be slower than using a single thread.
MP-BusSpeed - This runs integer read only tests using caches and RAM, each thread accessing the same data sequentially. To start with, data is read with large address increments to demonstrate burst data transfers. Performance gains, using L1 cache, can be proportional to the number of cores, but not quite so using L2. The program is designed to produce maximum throughput over buses and demonstrates the fastest RAM speeds using multiple cores.
MP-RandMem - The benchmark has cache and RAM read only and read/write tests using sequential and random access, each thread accessing the same data but starting at different points. It uses the Mutex functions as in Whetstone above, sometimes leading to no performance gains using multiple threads. Random access is also demonstrated as being relatively slow where burst data transfers are involved.
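The general pattern described above (each thread doing the same calculations on its own segment of the data, a Mutex protecting shared values, and timing that ends when the last thread finishes) can be sketched as follows. This is a minimal Python illustration, not the benchmark code: the function name and the calculation are made up for the example. Standard CPython normally runs only one thread at a time for this kind of work, so the sketch shows the structure rather than a real speed gain.

```python
import threading
import time


def run_threads(data, num_threads):
    """Process data with num_threads workers; return (total, elapsed seconds)."""
    shared = {"total": 0.0}
    lock = threading.Lock()           # the "Mutex" guarding shared data
    seg = len(data) // num_threads

    def worker(index):
        lo = index * seg
        hi = len(data) if index == num_threads - 1 else lo + seg
        partial = 0.0
        for x in data[lo:hi]:         # each thread has its own data segment
            partial += x * 0.5 + 1.0  # same calculation in every thread
        with lock:                    # one thread at a time updates the total
            shared["total"] += partial

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()                      # timing ends when the last thread finishes
    return shared["total"], time.perf_counter() - start


if __name__ == "__main__":
    data = [float(i) for i in range(100_000)]
    for n in (1, 2, 4, 8):
        total, secs = run_threads(data, n)
        print(f"{n} threads: total={total:.1f} time={secs * 1000:.2f} ms")
```

Keeping a private `partial` in each thread and taking the lock once at the end limits contention. Updating shared variables inside the inner loop is the kind of sharing that can remove the gain from extra threads, as noted for MP-Dhrystone and MP-RandMem.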
Re: Raspberry Pi Benchmarks
Raspberry Pi Stress Tests
I am currently producing a series of stress tests for the Raspberry Pi, based on my reliability/burn-in programs for Windows and Linux. Links can be found in:
http://www.roylongbottom.org.uk/#anchorReliability
For reliability testing purposes, these programs have run time parameters that determine running time and, sometimes, which particular hardware to use. Most also include performance measurements, reported at regular intervals, to identify speed reductions due to causes such as overheating or system interference. The test programs can be run via command lines in shell scripts, which allows multiple programs to be run at the same time, each in its own Terminal window. The commands sometimes include an option for different log file names, for cleaner results when more than one copy of the same program is run. A temperature recording application can be included in the mix. All test programs check numeric answers or data transfers for correct or consistent values, and report in the log files if any are incorrect.
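The approach described above (run for a chosen time, report speed at regular intervals, and check that answers stay consistent) can be sketched in a few lines of Python. This is only an illustration of the idea, not one of the stress test programs: the function names, the workload and the parameter names are assumptions made for the example.

```python
import time


def one_pass(n=20_000):
    """Fixed floating point workload; the result should never change."""
    x = 0.0
    for i in range(1, n + 1):
        x += (i * 1.000001) / (i + 0.5)
    return x


def stress_test(run_seconds=2.0, report_every=0.5, log=print):
    """Repeat one_pass for run_seconds, checking answers and logging speed."""
    expected = one_pass()              # reference answer from the first pass
    errors = 0
    passes = 0
    start = interval_start = time.perf_counter()
    interval_passes = 0
    while time.perf_counter() - start < run_seconds:
        if one_pass() != expected:     # a different answer suggests a fault
            errors += 1
            log(f"ERROR: wrong result on pass {passes + 1}")
        passes += 1
        interval_passes += 1
        now = time.perf_counter()
        if now - interval_start >= report_every:
            rate = interval_passes / (now - interval_start)
            log(f"{now - start:6.1f} s: {rate:8.1f} passes/s, errors {errors}")
            interval_start, interval_passes = now, 0
    return passes, errors


if __name__ == "__main__":
    print(stress_test())
```

A falling passes per second figure between reports would point to the sort of slowdown mentioned above, while a non-zero error count would point to a hardware fault.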
The first example is included in the following benchmark report. This is for a benchmark running in a reliability testing mode, alongside my OpenGL benchmark and temperature measuring program. In this case, the RPi crashed due to overheating when overclocked.
http://www.roylongbottom.org.uk/Raspber ... m#anchor29
There will be three new tests. The first is for floating point instructions, where a run time parameter selects data volume and FPU loading, to test CPU, L1 cache, L2 cache or RAM. The second does the same sort of thing except using integers. The third is a drive test that can be used to test the SD card, USB connected drives or LAN based storage. All can be run for extended periods, if desired. The main use will be to provide peace of mind in showing that particular hardware might not really be faulty. I will start a new topic for these tests.
Re: Raspberry Pi Benchmarks
Odd, the CPU isn't getting hot enough to drop its frequency - I think that happens at 80degC, which is where throttling comes in. So I'd say the fault wasn't caused by overheating (of the SoC). Dom will maybe have more information - I'll prod him to take a look.
Have you tried just the one device? And is your GPU firmware up to date? You may have exposed an issue elsewhere that may have been fixed.
Principal Software Engineer at Raspberry Pi (Trading) Ltd.
Contrary to popular belief, humorous signatures are allowed.
I've been saying "Mucho" to my Spanish friend a lot more lately. It means a lot to him.
Re: Raspberry Pi Benchmarks
Hi,
I'm looking for a test suite that can give some feel for the performance improvement gained by overclocking.
Can you suggest a suite or a single utility that can run a set of tests?
Re: Raspberry Pi Benchmarks
mcgyver83 wrote:
Hi,
I'm looking for a test suite that can give some feel for the performance improvement gained by overclocking.
Can you suggest a suite or a single utility that can run a set of tests?

My benchmarks are described in the following, with normal and overclocked results shown.
http://www.roylongbottom.org.uk/Raspber ... hmarks.htm
Benchmarks and source codes are in the following. Downloaded benchmarks need their execution permission set before they will run. You can change source codes if you want - all FREE.
http://www.roylongbottom.org.uk/Raspber ... hmarks.zip
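The permission setting mentioned above is normally done with `chmod +x` on the file in a Terminal. A minimal Python equivalent is sketched below. The function name is made up for the example, and no file names from the zip file are assumed.

```python
import os
import stat


def make_executable(path):
    """Add execute permission for user, group and others (like chmod a+x)."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

It would be called once for each unzipped benchmark program, passing the path of the file.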
Re: Raspberry Pi Benchmarks
jamesh wrote:
Odd, the CPU isn't getting hot enough to drop its frequency - I think that happens at 80degC, which is where throttling comes in. So I'd say the fault wasn't caused by overheating (of the SoC). Dom will maybe have more information - I'll prod him to take a look.
Have you tried just the one device? And is your GPU firmware up to date? You may have exposed an issue elsewhere that may have been fixed.

How accurate is the temperature monitor? Although not done scientifically, on a recent test, I poked my temperature probe wire through the hole onto the chip, and the measurements were similar to those from the programmed recorder. So is the latter measuring case temperature, and does the 80degC spec reflect this or the core temperature, which will be higher? And which core, CPU or GPU?
I only have one RPi and have not upgraded the GPU firmware. Perhaps you could provide a link that tells me how to do it.
Re: Raspberry Pi Benchmarks
RoyLongbottom wrote:
How accurate is the temperature monitor? Although not done scientifically, on a recent test, I poked my temperature probe wire through the hole onto the chip and measurements were similar to the programmed recorder. So is the latter measuring case temperature and does the 80degC spec reflect this or core temperature that will be higher? Then which core CPU or GPU?
I only have one RPi and have not upgraded GPU firmware. Perhaps you could provide a link that tells me how to do it.

sudo rpi-update
gets the latest released firmware. This is bleeding edge, so it may have rough edges, but it is usually fine. Lots of people use it. If you don't have rpi-update, get it here: https://github.com/Hexxeh/rpi-update
The SoC itself is mostly GPU - that takes up the majority of the die area. The temp monitor is on the GPU and is pretty accurate IIRC. Not sure exactly how it works though. It is, I believe, what the dynamic clocking uses to ensure the chip doesn't get too hot.
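For anyone wanting to log this sensor from their own program, the firmware reading is available on the Pi through the `vcgencmd measure_temp` command, which prints a line such as `temp=47.2'C`. The Python sketch below parses that output. The function names are made up for the example, and the 80 degC limit is only the tentative figure mentioned earlier in this thread, not a confirmed specification.

```python
import re
import subprocess


def parse_temp(text):
    """Extract degrees C from text such as "temp=47.2'C"."""
    match = re.search(r"temp=([0-9]+(?:\.[0-9]+)?)", text)
    if match is None:
        raise ValueError(f"unexpected output: {text!r}")
    return float(match.group(1))


def read_soc_temp():
    """Ask the firmware for the SoC temperature (Raspberry Pi only)."""
    out = subprocess.run(
        ["vcgencmd", "measure_temp"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_temp(out)


def near_throttle(temp_c, limit_c=80.0, margin_c=5.0):
    """True if temp_c is within margin_c of the assumed throttling limit."""
    return temp_c >= limit_c - margin_c
```

Calling `read_soc_temp()` every few seconds while a benchmark runs gives a temperature trace to compare against an external probe.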