jahboater wrote: ↑Fri May 17, 2019 8:27 am

ejolson wrote: ↑Wed May 15, 2019 12:00 am

jahboater wrote: ↑Tue May 14, 2019 8:04 pm

Here are the six runs with increasing CPU counts (count: Pi Ratio)

```
1: 3.06463
2: 4.50112
3: 17.1603
4: 25.8384
5: 31.6286
6: 41.9504
```

That output seems to confirm that 0 and 1 are the little cores. It also seems to indicate some sort of throttling. For example, performance should roughly double when going from one core to two, whereas you see only a factor of 1.47. I'd also expect a single little core to have a Pi ratio closer to 5, not 3. That's a sign that something is not right. Maybe a shim or some thermal paste would improve things.
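The 1.47 figure is simply the ratio of consecutive entries in the table above. A minimal Python sketch, using the six N2 values as posted (the names `n2_ratio` and `step_factors` are my own, not part of pichart), prints the factor gained at each step:

```python
# Pi ratios reported for the N2 run above, keyed by number of cores used.
n2_ratio = {1: 3.06463, 2: 4.50112, 3: 17.1603,
            4: 25.8384, 5: 31.6286, 6: 41.9504}

def step_factors(ratios):
    """Return {n: ratio[n] / ratio[previous]} for consecutive core counts."""
    counts = sorted(ratios)
    return {n: ratios[n] / ratios[p] for p, n in zip(counts, counts[1:])}

factors = step_factors(n2_ratio)
for n, f in factors.items():
    print(f"{n - 1} -> {n} cores: x{f:.2f}")
```

The large jump at the third core is what suggests cores 0 and 1 are the little ones; the first step is the one that looks too small.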

Alternatively, maybe the scheduler is acting weird because processor affinity is set to the little cores which then become compute bound. Have you checked if changing the performance setting of the Linux scheduler makes a difference?

I believe the heat sink is operating correctly. With all six cores maxed out for several hours, I saw no reduction in CPU frequency. Placing the board on edge so the fins were vertically aligned reduced the temp by several degrees.
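For anyone wanting to repeat the frequency check while the benchmark runs, here is a rough Python sketch that reads the governor and current clock from the standard Linux cpufreq sysfs files. I am assuming the N2 kernel exposes `scaling_governor` and `scaling_cur_freq` for each core in the usual place; the function name `read_cpufreq` is just for illustration.

```python
from pathlib import Path

def read_cpufreq(base="/sys/devices/system/cpu"):
    """Return {cpu_name: (governor, current_kHz)} for every CPU exposing cpufreq."""
    result = {}
    for cpu_dir in sorted(Path(base).glob("cpu[0-9]*")):
        gov = cpu_dir / "cpufreq" / "scaling_governor"
        cur = cpu_dir / "cpufreq" / "scaling_cur_freq"
        # Skip CPUs without cpufreq support (assumption: both files exist together).
        if gov.is_file() and cur.is_file():
            result[cpu_dir.name] = (gov.read_text().strip(), int(cur.read_text()))
    return result

if __name__ == "__main__":
    for cpu, (gov, khz) in read_cpufreq().items():
        print(f"{cpu}: governor={gov} freq={khz / 1000:.0f} MHz")
```

Running it in a loop alongside the six-core test would show whether any core drops below its nominal clock or sits on a governor other than the expected one.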

HK do claim that the 12nm N2 "does not throttle". The C2 heatsink did have decent thermal paste, so I am presuming the N2 does too, otherwise it's pointless them fitting this expensive heatsink!

I'll investigate the Linux scheduling ....

Given that four A57 cores yield a Pi ratio of 38.5 as seen here, a final Pi ratio of 42 seems reasonable for using all four A73 cores along with the two A53 cores.

I find it strange that two A53 cores in the N2 are not approximately double the performance of one, because on the Raspberry Pi they are. In particular, for the 3B+ one gets

```
1: 3.64594
2: 7.02809
3: 9.72154
4: 12.2765
```

which shows a factor of about 1.93, close to the ideal of two, between one and two cores. Moreover, how could two Cortex-A53 cores running at 1.4 GHz in the Pi 3B+ outperform two of the same kind of core in the N2 that are supposedly clocked in excess of 1.8 GHz?
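To put the two boards side by side, one can divide each Pi ratio by the core count times the single-core ratio, which gives a parallel efficiency (1.0 would be perfect scaling). A small Python sketch using only the numbers posted in this thread; the helper name `efficiency` is mine:

```python
# Pi ratios from this thread, keyed by number of cores used.
pi3bplus = {1: 3.64594, 2: 7.02809, 3: 9.72154, 4: 12.2765}
n2_little = {1: 3.06463, 2: 4.50112}  # the two cores assumed to be the A53s

def efficiency(ratios):
    """Return {n: ratio[n] / (n * ratio[1])}, the fraction of ideal scaling."""
    base = ratios[1]
    return {n: r / (base * n) for n, r in ratios.items()}

for name, data in (("Pi 3B+", pi3bplus), ("N2 little", n2_little)):
    for n, e in efficiency(data).items():
        print(f"{name:10s} {n} core(s): {e:.1%} of ideal")
```

On two cores the 3B+ stays well above 90% of ideal while the N2 little pair falls to roughly three quarters, which is the discrepancy in question.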

For reference, the output from the Raspberry Pi 3B+ is

```
$ taskset -c 0 ./pichart-openmp
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=1 Sec=6.49964 Mops=143.751
Merge Sort N=16777216 Workers=2 Sec=3.77389 Mops=106.695
Fourier Transform N=4194304 Workers=1 Sec=6.63308 Mflops=69.5564
Lorenz 96 N=32768 K=16384 Workers=1 Sec=12.0904 Mflops=266.428
My Computer has Raspberry Pi ratio=3.64594
Making pie charts...done.
$ taskset -c 0,1 ./pichart-openmp
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=2 Sec=3.23417 Mops=288.892
Merge Sort N=16777216 Workers=4 Sec=1.91323 Mops=210.457
Fourier Transform N=4194304 Workers=2 Sec=3.72711 Mflops=123.788
Lorenz 96 N=32768 K=16384 Workers=2 Sec=6.17764 Mflops=521.433
My Computer has Raspberry Pi ratio=7.02809
Making pie charts...done.
$ taskset -c 0,1,2 ./pichart-openmp
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=3 Sec=2.1494 Mops=434.693
Merge Sort N=16777216 Workers=6 Sec=1.35625 Mops=296.888
Fourier Transform N=4194304 Workers=6 Sec=3.10515 Mflops=148.583
Lorenz 96 N=32768 K=16384 Workers=3 Sec=4.29928 Mflops=749.248
My Computer has Raspberry Pi ratio=9.72154
Making pie charts...done.
$ taskset -c 0,1,2,3 ./pichart-openmp
pichart -- Raspberry Pi Performance OPENMP version 30
Prime Sieve P=14630843 Workers=4 Sec=1.61066 Mops=580.089
Merge Sort N=16777216 Workers=8 Sec=1.11224 Mops=362.019
Fourier Transform N=4194304 Workers=4 Sec=2.61202 Mflops=176.635
Lorenz 96 N=32768 K=16384 Workers=4 Sec=3.27035 Mflops=984.979
My Computer has Raspberry Pi ratio=12.2765
Making pie charts...done.
$ sudo vcgencmd get_throttled
throttled=0x0
```
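The `throttled=0x0` line at the end means no throttling flag was set during the runs. For values other than zero, a short Python sketch like the following can decode the bit field; the bit meanings are the ones given in the Raspberry Pi `vcgencmd` documentation, and the function name `decode_throttled` is hypothetical.

```python
# Bit positions reported by `vcgencmd get_throttled` on the Raspberry Pi.
THROTTLE_BITS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}

def decode_throttled(line):
    """Turn a line such as 'throttled=0x50005' into a list of set flags."""
    value = int(line.strip().split("=", 1)[1], 16)
    return [msg for bit, msg in sorted(THROTTLE_BITS.items()) if value & (1 << bit)]

print(decode_throttled("throttled=0x0") or "no flags set")
```

This applies to the Pi only; the N2 has no `vcgencmd`, so its clocks have to be checked by other means.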

Graphed as a Pi pie chart the 3B+ per-core scaling looks like