M4H
Posts: 1
Joined: Tue Dec 08, 2015 4:22 pm

Maximum theoretical GPIO sample speed

Tue Dec 08, 2015 4:26 pm

Hi,

What is the maximum theoretical GPIO sample speed when reading the GPIO pin level registers directly (e.g. GPLEV0)?

I've been playing around with Circle, a C++ bare-metal programming environment for the Raspberry Pi by rst (there is a post about it in this forum). One of the examples (https://github.com/rsta2/circle/tree/ma ... -gpioclock) samples the GPIO register as fast as possible and manages a speed of ~10 MHz. Searching around on the internet suggests that this is the typical maximum speed others have achieved. Why is this?

Additional questions:
  • The system clock runs at 250 MHz, so this would suggest ~25 clock cycles per read (250 MHz / 10 MHz). Does this sound reasonable?
  • Is it possible that the GPIO are running on a slower clock?
  • Is it possible that the speed is dependent on which other resources are being used, e.g. USB, SPI, DMA, UART?
Kind regards,
M4H

JacobL
Posts: 76
Joined: Sun Apr 15, 2012 2:23 pm

Re: Maximum theoretical GPIO sample speed

Tue Dec 08, 2015 11:00 pm

10 MHz means 100 ns to access a special function register (SFR) and store the result. That sounds reasonable enough; I know of other, bigger platforms that take longer than this. Accesses to SFRs normally use restrictive access modes, which means that these accesses cannot be reordered, and the pipeline most likely stalls during the operation. SFR accesses normally travel on the APB, which could add some latency as well.

dwelch67
Posts: 955
Joined: Sat May 26, 2012 5:32 pm

Re: Maximum theoretical GPIO sample speed

Thu Dec 10, 2015 6:09 am

I would call 10 MHz pretty damn fast. I'm not sure we are ever going to know "why". First there is the AMBA/AXI bus. Second, the access goes through some logic and shares a bus with the GPU. Then there is the peripheral itself, and we don't know whether it is clocked from 250 MHz or a derivative of that. Then there are the I/O pads themselves: what are they rated at? Most of this we don't know. Some of it could be your code too: register choices, how you loop, whether branch prediction is enabled, where the branch lands in the cache line (which seems to matter), whether the cache is on, etc.

You could take the same code and pound on a RAM address (with the data cache enabled) and see how fast your sampling rate is for that, as a comparison. Maybe also poll some CSRs in the GPIO block or other peripherals, ones that are for setup and not for touching the I/O itself, and see how fast you can sample those...
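The comparison suggested above could be sketched roughly as follows. This is a hosted C approximation, not code from Circle or from this thread: the function names are hypothetical, and it assumes a POSIX `clock_gettime` is available (on bare metal the system timer would have to stand in for it). The `volatile` qualifier is what forces a real load on every pass, the same way it would for a peripheral register.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Read one 32-bit location `count` times. The volatile qualifier keeps the
 * compiler from collapsing the loop into a single load. */
uint32_t poll_location(const volatile uint32_t *addr, uint32_t count)
{
    uint32_t last = 0;
    while (count--)
        last = *addr;
    return last;
}

/* Convert a read count and an elapsed time in nanoseconds to reads/second. */
double reads_per_second(uint32_t count, uint64_t elapsed_ns)
{
    assert(elapsed_ns > 0);
    return (double)count * 1e9 / (double)elapsed_ns;
}

/* Time `count` reads of `addr`. Assumption: a hosted POSIX clock is used
 * here; a bare-metal build would read a hardware timer instead. */
double measure_read_rate(const volatile uint32_t *addr, uint32_t count)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    (void)poll_location(addr, count);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    int64_t ns = (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000
               + (t1.tv_nsec - t0.tv_nsec);
    /* Clamp to 1 ns so a very short run cannot divide by zero. */
    return reads_per_second(count, ns > 0 ? (uint64_t)ns : 1);
}
```

Pointing `measure_read_rate` first at an ordinary variable in RAM and then at a peripheral register would give the two numbers to compare. If the RAM figure is far above 10 MHz while the register figure is not, the loop itself is unlikely to be the limit.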

joan
Posts: 14266
Joined: Thu Jul 05, 2012 5:09 pm
Location: UK

Re: Maximum theoretical GPIO sample speed

Thu Dec 10, 2015 9:00 am

Is there anything special about the level register?

http://codeandlife.com/2015/03/25/raspb ... benchmark/ claims old Pis can toggle the GPIO at 22 MHz and the new Pi2 at 42 MHz. They just write different registers in the GPIO map. Are the benchmarks suspect? That suggests a plain read should be rather more than 10 MHz.

rst
Posts: 409
Joined: Sat Apr 20, 2013 6:42 pm
Location: Germany

Re: Maximum theoretical GPIO sample speed

Thu Dec 10, 2015 11:20 am

Yes, it's the GPLEV register. I modified the sample mentioned in the initial post so that it toggles a GPIO pin instead. It takes 24 ms to run this loop 1,000,000 times on the RPi 2 at 600 MHz CPU clock:

Code:

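	ldr	r5, =ARM_GPIO_GPCLR0	@ r5 = address of GPCLR0 (writing a 1 bit drives that pin low)
	ldr	r7, =ARM_GPIO_GPSET0	@ r7 = address of GPSET0 (writing a 1 bit drives that pin high)
	mov	r6, #(1 << GPIO_PIN)	@ r6 = bit mask for the pin being toggled

fastloop:
	str	r6, [r5]		@ pin low
	str	r6, [r7]		@ pin high
	subs	r1, r1, #1		@ decrement the loop counter (r1 is not initialized in this excerpt)
	bhi	fastloop		@ repeat while the decremented counter is still above zero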
This results in a frequency of 41.7 MHz on the GPIO pin and matches the referenced benchmark.
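As a check on that figure, the arithmetic can be written out as a small C sketch. The function names are hypothetical and not part of Circle. It assumes, as in the loop above, that each iteration produces exactly one full low/high cycle on the pin.

```c
#include <assert.h>
#include <stdint.h>

/* Output frequency in Hz when every loop iteration yields one full
 * low/high cycle on the pin. Integer division truncates the result. */
uint64_t toggle_frequency_hz(uint64_t iterations, uint64_t elapsed_us)
{
    assert(elapsed_us > 0);
    return iterations * 1000000u / elapsed_us;
}

/* Average time spent per loop iteration, in nanoseconds. */
uint64_t ns_per_iteration(uint64_t iterations, uint64_t elapsed_us)
{
    assert(iterations > 0);
    return elapsed_us * 1000u / iterations;
}
```

With 1,000,000 iterations in 24,000 µs, `toggle_frequency_hz` gives about 41.67 MHz and `ns_per_iteration` gives 24 ns. Since each iteration contains two stores, that is roughly 12 ns per GPIO write, compared with roughly 100 ns per read at the 10 MHz sample rate from the initial post.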

joan
Posts: 14266
Joined: Thu Jul 05, 2012 5:09 pm
Location: UK

Re: Maximum theoretical GPIO sample speed

Thu Dec 10, 2015 11:28 am

rst wrote: Yes, it's the GPLEV register.
...
How odd.

rst
Posts: 409
Joined: Sat Apr 20, 2013 6:42 pm
Location: Germany

Re: Maximum theoretical GPIO sample speed

Thu Dec 10, 2015 4:12 pm

I did another test, reading the system timer's CLO register and the interrupt controller's "IRQ pending 1" register as fast as possible on the RPi 2. The MMU and the caches are on. These memory locations have the "Shareable Device" attribute. The test runs on core 0. Cores 1-3 are executing a "wfi" instruction. All clocks are at their default frequency.

I get a read frequency of about 10 MHz in any case. In other words the maximum bandwidth for programmed I/O read operations under these conditions is about 40 MByte/s.

So it's not the GPLEV register but the read access to it.
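The bandwidth figure follows from the read rate if each read is assumed to transfer one 32-bit (4-byte) register. A minimal sketch of that calculation, with a hypothetical function name:

```c
#include <assert.h>
#include <stdint.h>

/* Programmed-I/O read bandwidth in bytes per second.
 * Assumption: every read moves `bytes_per_read` bytes, i.e. 4 for a
 * 32-bit peripheral register. */
uint64_t pio_read_bandwidth(uint64_t reads_per_second, uint32_t bytes_per_read)
{
    assert(bytes_per_read > 0);
    return reads_per_second * bytes_per_read;
}
```

For 10,000,000 reads per second at 4 bytes each this gives 40,000,000 bytes/s, which matches the 40 MByte/s quoted above.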
