by gchamp
Wed Apr 10, 2019 8:43 pm
Forum: Bare metal, Assembly language
Topic: CPU Execution Speed
Replies: 10
Views: 1648

Re: CPU Execution Speed

Right, I didn't find a satisfactory explanation as to why code operating only on registers is a bit slower with the MMU off. However, with the MMU configured for a 1-to-1 mapping, everything that touches memory is about 10 to 20 times faster.
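
For readers who want to picture the 1-to-1 mapping mentioned above, here is a minimal sketch of how the table entries could be built. It is not the code from the thread. It assumes AArch64 with a 4 KiB granule and 2 MiB level 2 block descriptors; the MAIR_EL1 attribute indices and the function names are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* AArch64 stage 1 block descriptor fields (4 KiB granule, level 2 blocks). */
#define BLOCK_SIZE      (2ull * 1024 * 1024)   /* one level 2 block covers 2 MiB */
#define DESC_BLOCK      0x1ull                 /* bits[1:0] = 0b01: valid block entry */
#define DESC_ATTR(i)    ((uint64_t)(i) << 2)   /* AttrIndx: index into MAIR_EL1 */
#define DESC_SH_INNER   (3ull << 8)            /* inner shareable */
#define DESC_AF         (1ull << 10)           /* access flag set, so no access fault */

/* Hypothetical MAIR_EL1 layout assumed by this sketch. */
#define ATTR_IDX_NORMAL 0   /* normal, write-back cacheable memory */
#define ATTR_IDX_DEVICE 1   /* device memory for peripherals */

/* Build a block descriptor whose output address equals its input address. */
uint64_t identity_block_desc(uint64_t phys_addr, unsigned attr_idx)
{
    assert((phys_addr % BLOCK_SIZE) == 0);   /* blocks must be 2 MiB aligned */
    assert(attr_idx < 8);                    /* MAIR_EL1 holds 8 attributes */
    return phys_addr | DESC_AF | DESC_SH_INNER | DESC_ATTR(attr_idx) | DESC_BLOCK;
}

/* Fill a level 2 table so that virtual address == physical address.
 * Everything at or above device_base is marked as device memory. */
void fill_identity_table(uint64_t *table, unsigned entries, uint64_t device_base)
{
    for (unsigned i = 0; i < entries; i++) {
        uint64_t addr = (uint64_t)i * BLOCK_SIZE;
        unsigned attr = (addr >= device_base) ? ATTR_IDX_DEVICE : ATTR_IDX_NORMAL;
        table[i] = identity_block_desc(addr, attr);
    }
}
```

On a Raspberry Pi 3 one would probably call `fill_identity_table(table, 512, 0x3F000000)` to cover the first 1 GiB with the peripheral window marked as device memory. Programming MAIR_EL1, TCR_EL1 and TTBR0_EL1, and setting the M and C bits in SCTLR_EL1, is still required and is not shown here.
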
by gchamp
Sun Apr 07, 2019 6:48 pm
Forum: Bare metal, Assembly language
Topic: JTAG Recommendations
Replies: 20
Views: 2636

Re: JTAG Recommendations

For what it's worth, I have successfully used OpenOCD to debug a Raspberry Pi 2B+ via JTAG using this guide and the SEGGER J-Link probe. It is, however, on the expensive side. The Arm DSTREAM probe also works out of the box with ARM DS-5 for the Raspberry Pi 2 and 3, but is even more expensive.
by gchamp
Wed Apr 03, 2019 7:02 pm
Forum: Bare metal, Assembly language
Topic: CPU Execution Speed
Replies: 10
Views: 1648

Re: CPU Execution Speed

I think I understand more clearly what the issue was now. Thank you for your answers and your time. I read a bit more on the behavior of the Cortex-A53 when the MMU is disabled, and I believe that if the MMU is off, the L1 and L2 data caches do not allocate anything. The instruction cache however work...
by gchamp
Wed Apr 03, 2019 1:32 am
Forum: Bare metal, Assembly language
Topic: CPU Execution Speed
Replies: 10
Views: 1648

Re: CPU Execution Speed

It seems you are right. I re-did the test in 64-bit mode with the MMU enabled, and the average cycles per instruction for my loop is about 0.70. I added my code on GitHub. Do you know why enabling the MMU makes such a huge difference? From what I understood, if the MMU is disabled it would simply act as an identi...
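
As a rough illustration of how a cycles-per-instruction figure like the one above can be derived from two cycle counter reads, here is a small sketch. It is not the code from the GitHub repository. It assumes the counter is read as a 32-bit value and that the caller knows how many instructions ran in the measured region; both function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Cycles elapsed between two reads of a 32-bit cycle counter.
 * Unsigned subtraction stays correct across a single wrap-around. */
uint32_t elapsed_cycles(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Average cycles per instruction for a measured region.
 * `instructions` is supplied by the caller, for example the loop body
 * length multiplied by the iteration count. */
double cycles_per_instruction(uint32_t start, uint32_t end,
                              uint64_t instructions)
{
    assert(instructions > 0);   /* avoid dividing by zero */
    return (double)elapsed_cycles(start, end) / (double)instructions;
}
```

For instance, 700 elapsed cycles over 1000 instructions gives 0.70 cycles per instruction. A value below 1 is possible because the Cortex-A53 can issue two instructions per cycle.
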
by gchamp
Tue Apr 02, 2019 2:32 am
Forum: Bare metal, Assembly language
Topic: CPU Execution Speed
Replies: 10
Views: 1648

Re: CPU Execution Speed

Hi, thank you for the answer. Will L2 cache access really have an impact here? The instructions in the core of the loop fit entirely in L1 cache, so there shouldn't be any cache misses. I could confirm this with the PMU. By the nature of the computation inside the loop, I see there could be some Dca...
by gchamp
Mon Apr 01, 2019 8:49 pm
Forum: Bare metal, Assembly language
Topic: CPU Execution Speed
Replies: 10
Views: 1648

CPU Execution Speed

Hi everyone, I'm trying to get an accurate picture of how fast the Raspberry Pi 3B+ executes instructions. I have tested 3 approaches to measure the execution speed, which all give consistent results, but the execution speed is slower than I anticipated. I have tested timing with the PMCCNTR, wit...