### Re: A Pi Pie Chart

Posted: **Mon May 13, 2019 4:59 am**

by **scruss**

I picked up a used Firefly-RK3288 from a friend. It has a 4-core Rockchip 3288 ARM Cortex-A17 clocked at (IIRC) 1.3 GHz. It's got 2 GB RAM, and with gcc 7 under Ubuntu 18.04, it's no slouch:

*Firefly-RK3288* - pichart-rk3288-openmp.png

Results:

```
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=4 Sec=0.804635 Mops=1161.18
Merge Sort N=16777216 Workers=8 Sec=0.723482 Mops=556.549
Fourier Transform N=4194304 Workers=8 Sec=1.4824 Mflops=311.234
Lorenz 96 N=32768 K=16384 Workers=4 Sec=0.894368 Mflops=3601.68
The Firefly-RK3288 OpenMP has Raspberry Pi ratio=25.9055
Making pie charts...done.
real 2m52.516s
user 5m32.200s
sys 0m1.040s
```

### Re: A Pi Pie Chart

Posted: **Mon May 13, 2019 7:36 am**

by **jahboater**

Here are the numbers for the new Odroid-N2

```
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=6 Sec=0.478389 Mops=1953.07
Merge Sort N=16777216 Workers=12 Sec=0.57364 Mops=701.927
Fourier Transform N=4194304 Workers=12 Sec=0.847415 Mflops=544.448
Lorenz 96 N=32768 K=16384 Workers=12 Sec=0.465711 Mflops=6916.79
The Odroid-N2 has Raspberry Pi ratio=42.3263
Making pie charts...done.
```

This is a hexa-core Amlogic S922X SoC: four Cortex-A73 cores and two Cortex-A53 cores, clocked at 1.8GHz and 1.896GHz respectively, with 4GB of RAM.

GCC 9.1, Ubuntu 18.04

Not sure what the best compiler options are for big.LITTLE. The instruction sets are obviously compatible but the tuning is very different (mostly the scheduling, as the A73 is out-of-order and the A53 is not).

I used "-mcpu=cortex-a73 -mtune=cortex-a73" because there are more of those cores.

### Re: A Pi Pie Chart

Posted: **Mon May 13, 2019 8:16 am**

by **bensimmo**

Go for the fast core; it should switch everything over to run on that if speed is needed.

It's the same with the 8-core beasts: they really either run on the 4 slow, efficient cores or switch to the 4 fast, somewhat more power-hungry ones, IIRC.

### Re: A Pi Pie Chart

Posted: **Mon May 13, 2019 8:55 am**

by **jahboater**

Thanks, makes sense.

### Re: A Pi Pie Chart

Posted: **Mon May 13, 2019 9:58 am**

by **bensimmo**

jahboater wrote: ↑Mon May 13, 2019 8:55 am

Thanks, makes sense.

But don't quote me, how it does it and if it can use all 6 (or 8) I do not know.

That's purely what I have read and understood how it behaves.

### Re: A Pi Pie Chart

Posted: **Mon May 13, 2019 10:11 am**

by **jahboater**

bensimmo wrote: ↑Mon May 13, 2019 9:58 am

jahboater wrote: ↑Mon May 13, 2019 8:55 am

Thanks, makes sense.

But don't quote me, how it does it and if it can use all 6 (or 8) I do not know.

That's purely what I have read and understood how it behaves.

For big jobs it uses all the cores, even the two little ones.

I built GCC 9.1 using "make -j 6" and all six cores were active most of the time.

Also, in the Pi Pie Chart run above, it's using 6 or 12 worker threads.

Oddly, the big cores are clocked at 1800MHz and the little cores are clocked at 1896MHz.

### Re: A Pi Pie Chart

Posted: **Mon May 13, 2019 3:18 pm**

by **scruss**

jahboater wrote: ↑Mon May 13, 2019 7:36 am

Here are the numbers for the new Odroid-N2 …

Neat results from a not-too-expensive board.

While people might complain about the Raspberry Pi's µSD card, updating this Firefly board was a pain:

- take it out of its (very nice) metal case so I could access the reset and recovery keys more easily
- attach an OTG USB cable to flash the eMMC
- attach a TTL serial cable so I could see the boot messages
- do a Vulcan death grip involving the reset and recovery keys and the power connector to get it into recovery mode
- download seemingly random binary x86 Linux only tools to access the flash
- download a large image file from the vendor's Google Drive
- spend about 10 minutes watching the OS image trickle over the OTG cable via the flash tool
- watch the system reboot in the serial window, but find it doesn't take keyboard entry to shut it down
- pull power plug, put it back in its case, get it to where there's a keyboard and monitor and plug it in

I really don't miss using U-Boot and serial connections like on the old SheevaPlug. And Firefly are supposed to have some of the *better* support communities, too …

### Re: A Pi Pie Chart

Posted: **Mon May 13, 2019 3:37 pm**

by **ejolson**

bensimmo wrote: ↑Mon May 13, 2019 9:58 am

jahboater wrote: ↑Mon May 13, 2019 8:55 am

Thanks, makes sense.

But don't quote me, how it does it and if it can use all 6 (or 8) I do not know.

That's purely what I have read and understood how it behaves.

I didn't know the N2 was shipping already. You can use numactl or taskset to exclude certain cores from participating in the Pi Pie Chart benchmark.

While the code itself does some automated tuning related to how many OpenMP workers are used for the calculation, for machines with four simultaneous hardware multithreads per core or multiple sockets I've been able to achieve higher performance by restricting on what part of the hardware the program runs. In one case the parallel FFT part of the test ran faster by telling the scheduler to allocate all memory and threads on only one socket.

As you can see I've updated the code to output an additional number indicating relative speed compared to the original 700MHz ARMv6-based Raspberry Pi. For easier comparison the single-threaded version now graphs the single-threaded results for the reference machines. Less visibly the code has been updated to make it compatible with a wider variety of compilers.

It would be interesting to see what the performance is when using numactl or taskset to schedule only the big cores on systems with such an architecture.

### Re: A Pi Pie Chart

Posted: **Mon May 13, 2019 6:00 pm**

by **jahboater**

Here is a more complete set of results.

According to the comments in boot.ini CPU's 0 and 1 are the little A53's, and CPU's 2,3,4,5 are the big A73's.

So using taskset .....

```
odroid@odroid:~/pichart-30$ taskset -c 2,3,4,5 ./pichart-openmp "N2"
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=4 Sec=0.640666 Mops=1458.37
Merge Sort N=16777216 Workers=8 Sec=1.242 Mops=324.198
Fourier Transform N=4194304 Workers=4 Sec=1.86739 Mflops=247.069
Lorenz 96 N=32768 K=16384 Workers=4 Sec=1.8518 Mflops=1739.51
My Computer has Raspberry Pi ratio=18.8527
Making pie charts...done.
odroid@odroid:~/pichart-30$ taskset -c 0,1 ./pichart-openmp "N2"
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=4 Sec=1.91583 Mops=487.689
Merge Sort N=16777216 Workers=2 Sec=3.89942 Mops=103.26
Fourier Transform N=4194304 Workers=2 Sec=8.61634 Mflops=53.5463
Lorenz 96 N=32768 K=16384 Workers=4 Sec=12.9881 Mflops=248.013
My Computer has Raspberry Pi ratio=4.51557
Making pie charts...done.
odroid@odroid:~/pichart-30$ taskset -c 0-5 ./pichart-openmp "N2"
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=6 Sec=0.478178 Mops=1953.93
Merge Sort N=16777216 Workers=12 Sec=0.574004 Mops=701.482
Fourier Transform N=4194304 Workers=12 Sec=0.863874 Mflops=534.075
Lorenz 96 N=32768 K=16384 Workers=12 Sec=0.785385 Mflops=4101.46
My Computer has Raspberry Pi ratio=36.9623
Making pie charts...done.
odroid@odroid:~/pichart-30$ ./pichart-serial "N2"
pichart -- Raspberry Pi Performance Serial version 30
Prime Sieve P=14630843 Workers=2 Sec=2.55921 Mops=365.084
Merge Sort N=16777216 Workers=2 Sec=3.26208 Mops=123.434
Fourier Transform N=4194304 Workers=2 Sec=6.02412 Mflops=76.5876
Lorenz 96 N=32768 K=16384 Workers=2 Sec=3.08463 Mflops=1044.28
My Computer has Raspberry Pi ratio=6.88009
Making pie charts...done.
```

### Re: A Pi Pie Chart

Posted: **Tue May 14, 2019 6:33 am**

by **ejolson**

jahboater wrote: ↑Mon May 13, 2019 6:00 pm

Here is a more complete set of results. …

Those results are quite strange. It seems unreasonable that adding two little cores would double the performance.

Do you imagine the big cores are throttling when they're all running? Alternatively, it could happen that the identification of which cores are big and which little is mixed up. I'll also recheck my code to make sure the parallel part scales reasonably between 4 and 6 cores. Would it be possible to test each core separately with taskset and the serial code to verify which are which?
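The per-core check suggested above could be scripted. What follows is only a minimal sketch and not part of the pichart distribution: the helper names `taskset_command`, `per_core_commands` and `run_all` are made up for illustration, and it assumes `taskset` is installed and that `./pichart-serial` sits in the current directory as in the runs above.

```python
import subprocess
from typing import List, Sequence


def taskset_command(cores: Sequence[int], program: Sequence[str]) -> List[str]:
    """Build the argv that runs `program` restricted to the given CPU numbers."""
    cpu_list = ",".join(str(c) for c in cores)
    return ["taskset", "-c", cpu_list, *program]


def per_core_commands(ncores: int, program: Sequence[str]) -> List[List[str]]:
    """One command per core, so that each core can be timed on its own."""
    return [taskset_command([core], program) for core in range(ncores)]


def run_all(ncores: int, program: Sequence[str]) -> None:
    """Run the benchmark once per core and print only the Pi ratio line."""
    for cmd in per_core_commands(ncores, program):
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        for line in result.stdout.splitlines():
            if "ratio=" in line:
                # cmd[:3] is "taskset -c <core>", which labels the result.
                print(" ".join(cmd[:3]), "->", line)
```

Calling `run_all(6, ["./pichart-serial", "N2"])` on the N2 would then give one Pi ratio per core, and the two little cores should stand out as the slowest pair.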

### Re: A Pi Pie Chart

Posted: **Tue May 14, 2019 8:24 am**

by **ejolson**

ejolson wrote: ↑Tue May 14, 2019 6:33 am

I'll also recheck my code to make sure the parallel part scales reasonably between 4 and 6 cores.

Here is the output for a set of runs on an 8-core ARM Cortex A53 system running in 64-bit mode:

```
$ taskset -c 0 ./pichart-openmp
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=1 Sec=5.22214 Mops=178.917
Merge Sort N=16777216 Workers=1 Sec=4.54134 Mops=88.664
Fourier Transform N=4194304 Workers=1 Sec=5.44227 Mflops=84.776
Lorenz 96 N=32768 K=16384 Workers=1 Sec=6.06767 Mflops=530.883
My Computer has Raspberry Pi ratio=4.58997
Making pie charts...done.
$ taskset -c 0,1 ./pichart-openmp
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=2 Sec=2.61991 Mops=356.626
Merge Sort N=16777216 Workers=4 Sec=2.29244 Mops=175.644
Fourier Transform N=4194304 Workers=2 Sec=3.1076 Mflops=148.466
Lorenz 96 N=32768 K=16384 Workers=2 Sec=3.13086 Mflops=1028.86
My Computer has Raspberry Pi ratio=8.78213
Making pie charts...done.
$ taskset -c 0,1,2 ./pichart-openmp
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=3 Sec=1.75374 Mops=532.762
Merge Sort N=16777216 Workers=6 Sec=1.55268 Mops=259.328
Fourier Transform N=4194304 Workers=6 Sec=2.3963 Mflops=192.536
Lorenz 96 N=32768 K=16384 Workers=3 Sec=2.13824 Mflops=1506.49
My Computer has Raspberry Pi ratio=12.5634
Making pie charts...done.
$ taskset -c 0,1,2,3 ./pichart-openmp
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=4 Sec=1.31822 Mops=708.779
Merge Sort N=16777216 Workers=8 Sec=1.18948 Mops=338.511
Fourier Transform N=4194304 Workers=4 Sec=1.93407 Mflops=238.55
Lorenz 96 N=32768 K=16384 Workers=4 Sec=1.60808 Mflops=2003.16
My Computer has Raspberry Pi ratio=16.3394
Making pie charts...done.
$ taskset -c 0,1,2,3,4 ./pichart-openmp
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=5 Sec=1.05731 Mops=883.685
Merge Sort N=16777216 Workers=10 Sec=0.967107 Mops=416.348
Fourier Transform N=4194304 Workers=10 Sec=1.70527 Mflops=270.557
Lorenz 96 N=32768 K=16384 Workers=5 Sec=1.31937 Mflops=2441.48
My Computer has Raspberry Pi ratio=19.7156
Making pie charts...done.
$ taskset -c 0,1,2,3,4,5 ./pichart-openmp
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=6 Sec=0.882527 Mops=1058.7
Merge Sort N=16777216 Workers=12 Sec=0.797551 Mops=504.862
Fourier Transform N=4194304 Workers=12 Sec=1.68901 Mflops=273.162
Lorenz 96 N=32768 K=16384 Workers=6 Sec=1.08304 Mflops=2974.23
My Computer has Raspberry Pi ratio=22.7944
Making pie charts...done.
$ taskset -c 0,1,2,3,4,5,6 ./pichart-openmp
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=7 Sec=0.757242 Mops=1233.86
Merge Sort N=16777216 Workers=14 Sec=0.719012 Mops=560.009
Fourier Transform N=4194304 Workers=7 Sec=1.6323 Mflops=282.652
Lorenz 96 N=32768 K=16384 Workers=7 Sec=0.921426 Mflops=3495.91
My Computer has Raspberry Pi ratio=25.5247
Making pie charts...done.
$ taskset -c 0,1,2,3,4,5,6,7 ./pichart-openmp
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=8 Sec=0.663588 Mops=1407.99
Merge Sort N=16777216 Workers=16 Sec=0.645496 Mops=623.789
Fourier Transform N=4194304 Workers=8 Sec=1.5337 Mflops=300.824
Lorenz 96 N=32768 K=16384 Workers=8 Sec=0.807844 Mflops=3987.44
My Computer has Raspberry Pi ratio=28.4481
Making pie charts...done.
```

Presented graphically this looks like

The scaling appears fairly uniform without anything surprising, which is expected because all cores are identical. This suggests the code itself is working fine and that there is something strange going on with the N2 hardware. I suspect you have misidentified which cores are the little ones; however, some sort of throttling could also be involved.
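To put a number on "fairly uniform", the Pi ratios from the eight runs above can be turned into a speedup and a parallel efficiency relative to the single-core run. This is just a quick sketch; the function name `efficiency` is my own, and using the Pi ratio as the measure of speed is a simplification, since it is a geometric mean over four different tests.

```python
# Pi ratios copied from the eight taskset runs above (cores used -> ratio).
pi_ratio = {1: 4.58997, 2: 8.78213, 3: 12.5634, 4: 16.3394,
            5: 19.7156, 6: 22.7944, 7: 25.5247, 8: 28.4481}


def efficiency(ratios):
    """Parallel efficiency: measured speedup divided by the number of cores."""
    base = ratios[1]
    return {n: r / (n * base) for n, r in ratios.items()}


for n, eff in efficiency(pi_ratio).items():
    speedup = pi_ratio[n] / pi_ratio[1]
    print(f"{n} cores: speedup {speedup:.2f}x, efficiency {eff:.0%}")
```

With identical cores the efficiency should fall off smoothly as cores are added, with no sudden jumps, which is the pattern to compare the N2 numbers against.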

### Re: A Pi Pie Chart

Posted: **Tue May 14, 2019 8:04 pm**

by **jahboater**

ejolson wrote: ↑Tue May 14, 2019 8:24 am

I suspect you have misidentified which cores are the little ones

This is quite likely. All I went on was this comment in boot.ini (the equivalent of config.txt).

```
# max cpu-cores
# Note:
# CPU's 0 and 1 are the A53 (small cores)
# CPU's 2 to 5 are the A73 (big cores)
# Lowering this value disables only the bigger cores (the last cores).
# setenv maxcpus "4"
# setenv maxcpus "5"
setenv maxcpus "6"
```

They run at different speeds: A73's are clocked at 1800MHz, the A53's are clocked at 1896MHz.

ejolson wrote: ↑Tue May 14, 2019 8:24 am

however, some sort of throttling could also be involved.

The N2 is claimed not to throttle.

It is 12nm and has a really massive factory-fitted heat sink.

Here are the six runs with increasing CPU counts (count: Pi Ratio)

```
1: 3.06463
2: 4.50112
3: 17.1603
4: 25.8384
5: 31.6286
6: 41.9504
```

```
odroid@odroid:~/pichart-30$ taskset -c 0 ./pichart-openmp "N2"
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=2 Sec=3.83186 Mops=243.832
Merge Sort N=16777216 Workers=2 Sec=5.2353 Mops=76.9112
Fourier Transform N=4194304 Workers=2 Sec=13.3536 Mflops=34.5506
Lorenz 96 N=32768 K=16384 Workers=2 Sec=14.7099 Mflops=218.983
My Computer has Raspberry Pi ratio=3.06463
Making pie charts...done.
odroid@odroid:~/pichart-30$ taskset -c 0,1 ./pichart-openmp "N2"
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=2 Sec=1.92459 Mops=485.469
Merge Sort N=16777216 Workers=2 Sec=4.19914 Mops=95.8895
Fourier Transform N=4194304 Workers=2 Sec=9.09953 Mflops=50.703
Lorenz 96 N=32768 K=16384 Workers=1 Sec=11.5153 Mflops=279.735
My Computer has Raspberry Pi ratio=4.50112
Making pie charts...done.
odroid@odroid:~/pichart-30$ taskset -c 0,1,2 ./pichart-openmp "N2"
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=6 Sec=1.2032 Mops=776.534
Merge Sort N=16777216 Workers=6 Sec=1.29098 Mops=311.897
Fourier Transform N=4194304 Workers=6 Sec=1.51315 Mflops=304.91
Lorenz 96 N=32768 K=16384 Workers=6 Sec=1.70545 Mflops=1888.78
My Computer has Raspberry Pi ratio=17.1603
Making pie charts...done.
odroid@odroid:~/pichart-30$ taskset -c 0,1,2,3 ./pichart-openmp "N2"
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=8 Sec=0.796118 Mops=1173.61
Merge Sort N=16777216 Workers=4 Sec=0.966202 Mops=416.738
Fourier Transform N=4194304 Workers=8 Sec=1.11916 Mflops=412.248
Lorenz 96 N=32768 K=16384 Workers=8 Sec=0.905878 Mflops=3555.91
My Computer has Raspberry Pi ratio=25.8384
Making pie charts...done.
odroid@odroid:~/pichart-30$ taskset -c 0,1,2,3,4 ./pichart-openmp "N2"
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=10 Sec=0.594597 Mops=1571.36
Merge Sort N=16777216 Workers=10 Sec=0.776335 Mops=518.659
Fourier Transform N=4194304 Workers=10 Sec=0.997077 Mflops=462.726
Lorenz 96 N=32768 K=16384 Workers=10 Sec=0.754667 Mflops=4268.41
My Computer has Raspberry Pi ratio=31.6286
Making pie charts...done.
odroid@odroid:~/pichart-30$ taskset -c 0,1,2,3,4,5 ./pichart-openmp "N2"
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=6 Sec=0.478357 Mops=1953.2
Merge Sort N=16777216 Workers=12 Sec=0.579982 Mops=694.251
Fourier Transform N=4194304 Workers=12 Sec=0.852847 Mflops=540.98
Lorenz 96 N=32768 K=16384 Workers=12 Sec=0.474342 Mflops=6790.93
My Computer has Raspberry Pi ratio=41.9504
Making pie charts...done.
```

The number of workers doesn't seem to correlate with the number of CPU cores?

### Re: A Pi Pie Chart

Posted: **Wed May 15, 2019 12:00 am**

by **ejolson**

jahboater wrote: ↑Tue May 14, 2019 8:04 pm

Here are the six runs with increasing CPU counts (count: Pi Ratio)

```
1: 3.06463
2: 4.50112
3: 17.1603
4: 25.8384
5: 31.6286
6: 41.9504
```

That output seems to confirm that 0 and 1 are the little cores. It also seems to indicate some sort of throttling. For example, the performance when switching from one to two cores should about double, whereas you have only a 1.47 factor increase. I'd also expect a single little core to have a pi ratio closer to 5 not 3. That's a sign of something not being right.
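For anyone wanting to check the factors, here is a small sketch that computes the gain from each additional core using the six Pi ratios quoted above. The function name `step_factors` is my own invention.

```python
# Pi ratios from the six N2 runs quoted above (cores used -> ratio).
n2_ratio = {1: 3.06463, 2: 4.50112, 3: 17.1603,
            4: 25.8384, 5: 31.6286, 6: 41.9504}


def step_factors(ratios):
    """Factor by which the Pi ratio grows each time one more core is added."""
    counts = sorted(ratios)
    return {b: ratios[b] / ratios[a] for a, b in zip(counts, counts[1:])}


for cores, factor in step_factors(n2_ratio).items():
    print(f"{cores - 1} -> {cores} cores: x{factor:.2f}")
```

The step from one to two cores comes out at about 1.47, while the step from two to three cores, where the first big core joins in, is far larger than any of the others.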

It is amazing how much the heatsink on the N2 resembles a sandwich toaster. I wonder if the one mentioned here is similar. Could there be a gap between the SoC mounted on the bottom of the circuit board and the heatsink? I think the system includes instrumentation that allows you to monitor the heat and current speed of the CPU cores, because graphs of such things appear on the N2 vendor's webpage. Maybe a shim or some thermal paste would improve things.

Alternatively, maybe the scheduler is acting weird because processor affinity is set to the little cores which then become compute bound. Have you checked if changing the performance setting of the Linux scheduler makes a difference?

### Re: A Pi Pie Chart

Posted: **Wed May 15, 2019 12:34 am**

by **ejolson**

Here are results for the NVIDIA Tegra TX2 big.LITTLE architecture. This architecture consists of the following cores:

- Four 2 GHz Cortex-A57 cores numbered 0,3,4,5
- Two 2 GHz Denver 2 cores numbered 1,2

Single-threaded results for the two different cores are

```
$ taskset -c 0 ./pichart-serial
pichart -- Raspberry Pi Performance Serial version 30
Prime Sieve P=14630843 Workers=1 Sec=1.78459 Mops=523.553
Merge Sort N=16777216 Workers=2 Sec=3.51306 Mops=114.616
Fourier Transform N=4194304 Workers=2 Sec=2.58702 Mflops=178.341
Lorenz 96 N=32768 K=16384 Workers=1 Sec=1.7412 Mflops=1850.01
My Computer has Raspberry Pi ratio=10.533
Making pie charts...done.
$ taskset -c 1 ./pichart-serial
pichart -- Raspberry Pi Performance Serial version 30
Prime Sieve P=14630843 Workers=1 Sec=1.46102 Mops=639.504
Merge Sort N=16777216 Workers=2 Sec=2.6922 Mops=149.563
Fourier Transform N=4194304 Workers=1 Sec=1.49668 Mflops=308.265
Lorenz 96 N=32768 K=16384 Workers=1 Sec=0.892562 Mflops=3608.97
My Computer has Raspberry Pi ratio=16.0375
Making pie charts...done.
```

which indicates the Denver cores have roughly 1.2 to 1.3 times the integer performance and 1.7 to 2 times the floating-point performance of the A57 cores, or about 1.5 times overall going by the Pi ratio. Note that I didn't tune the compiler optimization flags very carefully, so it is likely that faster timings are possible for the Tegra-TX2 hardware. The full set of tests may be summarized as

```
A57 Den Pi-Ratio Prime Merge Fourier Lorenz
1 0 10.5330 523.553 114.616 178.341 1850.01
0 1 16.0375 639.504 149.563 308.265 3608.97
2 0 20.2125 1049.53 228.756 339.554 3293.35
1 1 23.6960 1136.82 256.381 481.260 3615.54
3 0 28.7771 1574.30 338.459 422.827 4896.22
0 2 31.2410 1272.69 288.663 584.764 7132.36
2 1 32.1336 1650.66 347.954 622.039 4800.34
1 2 35.9372 1780.66 392.443 696.115 5515.29
4 0 38.5189 2098.16 450.227 579.577 6467.59
3 1 40.9209 2164.41 447.678 738.529 6302.87
2 2 47.1051 2293.69 504.744 951.607 7188.47
4 1 49.4959 2705.64 565.161 813.556 7760.25
3 2 53.0869 2799.29 604.649 974.044 7749.05
4 2 62.1914 3322.65 709.686 1114.95 9152.58
```

It is likely some sort of statistical analysis of the above table of numbers could determine whether the system is performing as expected or not. Instead of doing that, I reran a few of the tests to check that the results are at least repeatable. For example, if there was throttling one might expect the performance to decrease over repeated runs. Repeated Pi-Ratio results for selected combinations given by

```
A57 Den Run-1 Run-2 Run-3 Run-4 Run-5
0 2 31.2410 31.4802 31.7105 31.6916 31.6474
4 0 38.5189 38.2766 38.3444 38.6743 38.5121
4 2 62.1914 61.7359 62.2623 62.1156 61.9489
```

show the tests are fairly repeatable.
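As a rough way to quantify "fairly repeatable", the spread of the five runs can be expressed relative to their mean. This is only a sketch; `relative_spread` is a name I picked, and max-minus-min over the mean is just one of several reasonable measures.

```python
from statistics import mean

# Repeated Pi ratios from the table above, keyed by (A57 cores, Denver cores).
runs = {
    (0, 2): [31.2410, 31.4802, 31.7105, 31.6916, 31.6474],
    (4, 0): [38.5189, 38.2766, 38.3444, 38.6743, 38.5121],
    (4, 2): [62.1914, 61.7359, 62.2623, 62.1156, 61.9489],
}


def relative_spread(values):
    """Largest minus smallest result, as a fraction of the mean."""
    return (max(values) - min(values)) / mean(values)


for combo, values in runs.items():
    drift = values[-1] - values[0]  # negative would hint at throttling
    print(combo, f"spread {relative_spread(values):.2%}, last-first {drift:+.4f}")
```

All three combinations stay within a couple of percent, and none of them shows the steady downward drift one might expect from thermal throttling.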

### Re: A Pi Pie Chart

Posted: **Wed May 15, 2019 12:55 am**

by **Gavinmc42**

What are the figures for a Intel Celeron Core Duo?

Was that the first multicore Intel?

### Re: A Pi Pie Chart

Posted: **Wed May 15, 2019 1:12 am**

by **Andyroo**

Gavinmc42 wrote: ↑Wed May 15, 2019 12:55 am

What are the figures for a Intel Celeron Core Duo?

Was that the first multicore Intel?

Thought the Celeron came out four or five years after the Pentium D, which had two cores. Not sure about the RISC or microcontroller ranges Intel had created before this, though, as some of them had multiple-chip solutions (do they count?) and some were built for supercomputers and only worked in massively parallel sets.

### Re: A Pi Pie Chart

Posted: **Wed May 15, 2019 1:51 am**

by **ejolson**

Gavinmc42 wrote: ↑Wed May 15, 2019 12:55 am

What are the figures for a Intel Celeron Core Duo?

Was that the first multicore Intel?

It seems I sent my core-duo system to surplus last month. Instead of replacing a third HD in that iMac, I wanted to use the desk space for something useful like pencil and paper.

The closest timing currently available here is for the previous generation 2.8 GHz Pentium 4 D. Those were multi-chip packages consisting of two dies next to each other similar to Epyc and the Threadripper today.

https://www.raspberrypi.org/forums/view ... 5#p1401734

### Re: A Pi Pie Chart

Posted: **Wed May 15, 2019 8:25 am**

by **bensimmo**

Andyroo wrote: ↑Wed May 15, 2019 1:12 am

Gavinmc42 wrote: ↑Wed May 15, 2019 12:55 am

What are the figures for a Intel Celeron Core Duo?

Was that the first multicore Intel?

Thought the Celeron came out four or five years after the Pentium D, which had two cores. Not sure about the RISC or microcontroller ranges Intel had created before this, though, as some of them had multiple-chip solutions (do they count?) and some were built for supercomputers and only worked in massively parallel sets.

Pentium D 8## was the first bolted-together dual-core Pentium in 2005 (AMD with its Athlon 64 X2 being a 'proper' one, IIRC).

First 'proper' was two years later named the Pentium Dual-Core (the start of the 'core' range before it was renamed)

(EDIT looked it up, 2008 Celeron gained Dual-Core, in the E/T1#00 range )

### Re: A Pi Pie Chart

Posted: **Thu May 16, 2019 3:01 am**

by **Gavinmc42**

Wow, the Core Duo's are only just over 10 years old, I thought they were ancient

My Core Duo only seems about twice as fast as my Pi 3B+, but I have not benchmarked it enough.

Wonder what Pis will be like in 10 years?

### Re: A Pi Pie Chart

Posted: **Thu May 16, 2019 4:53 am**

by **Imperf3kt**

Gavinmc42 wrote: ↑Thu May 16, 2019 3:01 am

Wow, the Core Duo's are only just over 10 years old, I thought they were ancient

My Core Duo only seems about twice as fast as my Pi 3B+, but I have not benchmarked it enough.

Wonder what Pis will be like in 10 years?

I doubt the legitimacy of those dates.

Both Wikipedia and Intel seem to claim the Core 2 range began in July 2006:

https://ark.intel.com/content/www/us/en ... z-fsb.html
https://en.wikipedia.org/wiki/Intel_Core_2
I myself had a Core 2 Duo in 2006, which I later replaced with a quad-core AM3 in early 2009.

### Re: A Pi Pie Chart

Posted: **Thu May 16, 2019 6:27 am**

by **bensimmo**

"Celeron"

What I did forget was that the Pentium Dual-Core wasn't an infill for the Celeron for a bit, but a level above the Celeron. So you can ignore that.

The C2D did start when you say, but these are not Celeron.

https://ark.intel.com/content/www/us/en ... z-fsb.html

### Re: A Pi Pie Chart

Posted: **Thu May 16, 2019 5:53 pm**

by **ejolson**

Gavinmc42 wrote: ↑Wed May 15, 2019 12:55 am

What are the figures for a Intel Celeron Core Duo?

Was that the first multicore Intel?

Looking back through the thread I found results for a Core Duo T5800 running in an HP G70 laptop. To make the comparison more interesting, I added two more laptops: an Acer C720 Chromebook with Celeron 2955U processor and an HP 15-db0011dx with an AMD A6-9225. This resulted in the following Pi pie chart:

It is possible to compute the Pi ratio by hand using the formula

(Sieve*Sort*Fourier*Lorenz/1608533.6)^(0.25)

Taking the previous timings for the Core Duo T5800, the Raspberry Pi 3B+ and 3B computers we obtain

```
The Duo T5800 has Raspberry Pi ratio=20.6175
The Pi 3B+ has Raspberry Pi ratio=12.4016
The Pi 3B has Raspberry Pi ratio=10.8197
```
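For those who would rather not do the fourth root by hand, the formula above is easy to script. This is only a sketch of that formula and not code taken from pichart itself; the function name `pi_ratio` is my own. The four arguments are the Mops and Mflops figures printed by a run, here taken from the Celeron 2955U run reported further down in this post.

```python
def pi_ratio(sieve, sort, fourier, lorenz):
    """Fourth root of the product of the four rates over the constant 1608533.6."""
    return (sieve * sort * fourier * lorenz / 1608533.6) ** 0.25


# Rates from the Celeron 2955U run: sieve Mops, sort Mops, FFT Mflops, Lorenz Mflops.
celeron = pi_ratio(702.847, 181.48, 723.899, 7019.82)
print(f"The Cel 2955U has Raspberry Pi ratio={celeron:.4f}")
```

The result comes out at about 25.195, in agreement with the ratio the program printed for that machine.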

For reference, the runs for the other notebook computers are

```
$ ./pichart-openmp -t "Cel 2955U"
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=2 Sec=1.32935 Mops=702.847
Merge Sort N=16777216 Workers=4 Sec=2.21872 Mops=181.48
Fourier Transform N=4194304 Workers=2 Sec=0.637345 Mflops=723.899
Lorenz 96 N=32768 K=16384 Workers=2 Sec=0.458876 Mflops=7019.82
The Cel 2955U has Raspberry Pi ratio=25.1951
$ ./pichart-openmp -t "A6-9225"
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=2 Sec=0.653126 Mops=1430.55
Merge Sort N=16777216 Workers=4 Sec=1.47106 Mops=273.717
Fourier Transform N=4194304 Workers=4 Sec=1.05959 Mflops=435.427
Lorenz 96 N=32768 K=16384 Workers=2 Sec=0.274606 Mflops=11730.4
The A6-9225 has Raspberry Pi ratio=33.3926
Making pie charts...done.
```

There is a link to the current source code from the first post of this thread if you would like to make your own Pi pie charts.

### Re: A Pi Pie Chart

Posted: **Thu May 16, 2019 9:28 pm**

by **ejolson**

If your Pi computers are not keeping the house warm enough this summer, here are some Pi-ratio results for currently available server hardware.

These results were obtained with a single core using either 2 or 4 simultaneous multithreads and the OpenMP version of the benchmark. Both gcc and clang compilers were run and the best results in each category reported. The Pi ratios are computed by hand as described in the previous post.

```
SIMULTANEOUS MULTITHREADED SINGLE-CORE PERFORMANCE
            Pi-Ratio  Prime    Merge    Fourier  Lorenz
IBM Power9  43.3983   1099.17  479.123  1489.01  7276.3
Gold 6126   59.1558   1122.87  560.387  1389.17  22534.4
EPYC 7371   53.8473   1413.59  570.489  1081.44  15506.5

FURTHER HARDWARE INFORMATION (only one core was used)
            Base-Clock  Domains  Cores/Domain  Threads/Core
IBM Power9  2.7GHz      2        20            4
Gold 6126   2.6GHz      2        12            2
EPYC 7371   3.1GHz      8        4             2

I find it amusing that Power was best at the Fourier transform, Gold was best at the Lorenz 96 simulation and EPYC was best at the prime sieve and merge sort. Note that the above results reflect only single-core performance. Throughput using multiple cores and efficiency when heating the home are entirely different matters.

### Re: A Pi Pie Chart

Posted: **Fri May 17, 2019 5:47 am**

by **ejolson**

jahboater wrote: ↑Tue May 14, 2019 8:04 pm

The number of workers doesn't seem to correlate with the number of CPU cores?

The pichart benchmark starts by over-provisioning the available cores with twice the number of worker threads. This is done because over-provisioning allows the operating system to assist with load balancing, which sometimes leads to greater performance.

After obtaining timing metrics using twice the workers, the number of workers is halved and the benchmark is run again. The halving process is repeated until only one worker thread is left. In the end, the best time among all tests is reported along with the number of workers that achieved that best time.

In the case of the serial code, it is still possible to break the work into bundles and then execute them consecutively rather than in parallel. Even though no parallel speedup is possible, breaking the computation into smaller parts can result in better use of cache memory.

If the default seems insufficient, the number of workers to start with may be specified using the -n command-line option. Type -h for more information.
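The halving procedure described above can be sketched roughly as follows. This is an illustration of the idea and not the actual pichart source: the name `tune_workers` and the `time_with` callback, which stands in for running one test with a given number of workers and returning the elapsed seconds, are assumptions of mine.

```python
from typing import Callable, List, Tuple


def tune_workers(time_with: Callable[[int], float],
                 cores: int) -> Tuple[int, float, List[int]]:
    """Start at twice the core count, halve down to one worker, keep the best time."""
    workers = 2 * cores
    best_workers, best_time = workers, float("inf")
    tried = []
    while workers >= 1:
        elapsed = time_with(workers)  # hypothetical: run one test with this many workers
        tried.append(workers)
        if elapsed < best_time:
            best_workers, best_time = workers, elapsed
        workers //= 2  # integer halving: 12 -> 6 -> 3 -> 1
    return best_workers, best_time, tried


# Made-up timings for a 6-core machine, purely to show the selection logic.
fake_times = {12: 1.0, 6: 0.8, 3: 1.5, 1: 4.0}
print(tune_workers(lambda w: fake_times[w], cores=6))
```

With integer halving, a 6-core run tries 12, 6, 3 and 1 workers, which fits the Workers=6 and Workers=12 values seen in the N2 output, and a 3-core run starts at 6.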

### Re: A Pi Pie Chart

Posted: **Fri May 17, 2019 8:10 am**

by **bensimmo**

The T5800 (and I can do others) is mine; it *may have throttled*.

I'll get some others up at some point.

I'll see if the 1GHz PentiumIII-m works