edbird wrote: I keep running into the same problems. I need to be able to do this in software rather than attaching a clock source in hardware. I'm reading 2 lots of 8-bit-wide data after an interrupt is triggered, so that's why I need fast GPIO... I'm not requiring a fast, constant clock source, as you suggested above.
If you're taking an interrupt into user-land (e.g. using wiringPi's ISR code), then the rate at which you get these interrupts is probably not going to be fast enough - I've benchmarked them at about 66K/sec max. That's about 15 µs per interrupt...
That may seem shockingly slow but that interrupt goes from the hardware through Linux which wakes up your program and lets it go... If you write a kernel module, then you get them much faster... But you need to write a kernel module... (Or go "bare bones")
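To put rough numbers on that, here is a small sketch that turns the benchmarked rate into a per-interrupt period and a byte rate. The function names are made up for illustration, and the 2 bytes per interrupt is taken from the "2 lots of 8 bit" in the question.

```c
#include <assert.h>

/* Time between interrupts in microseconds at a given rate per second. */
double interrupt_period_us(double rate_per_sec)
{
    assert(rate_per_sec > 0.0);
    return 1e6 / rate_per_sec;
}

/* Data rate in bytes per second if every interrupt delivers
 * bytes_per_irq bytes (2 in the question: two 8-bit values). */
double bytes_per_sec(double rate_per_sec, unsigned bytes_per_irq)
{
    assert(rate_per_sec > 0.0);
    return rate_per_sec * bytes_per_irq;
}
```

At 66,000 interrupts/sec, interrupt_period_us() gives a little over 15 µs, and bytes_per_sec() with 2 bytes per interrupt gives about 132,000 bytes/sec as a ceiling, before counting any time spent actually reading the pins.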
I have had an idea today about how I can do this more efficiently. In the external hardware, if I store 256 sets of 2x8 bits of data (in shift-registers, say) and then have 1 interrupt to read it all rather than 256 individual interrupts, then that's fewer interrupts to service, hence less wasted CPU time, or so I hope.
But that still requires reading 512 bytes of data, byte at a time, and hence I'm back to "get fast GPIO".
So anyway, having looked at the "native C" library, I see that some memory is mapped from a /dev/mem device. I don't completely understand what this does, other than create a copy of some memory, somewhere else in memory, which seems kind of pointless.
It doesn't quite copy memory - it's mapping the memory-mapped hardware registers and making them directly accessible from user-land. This is how wiringPi works. The code in wiringPi then has direct access to the memory-mapped registers. Where wiringPi is "slow" is that it handles an indirection to map 3 different types of pin numbering to the bit-positions in the hardware registers. If you use wiringPiSetupGpio() then this is the fastest as there is no indirection, but there are still some table lookups to work out which of the 2 32-bit register banks to use and what bit-position in that register corresponds to the output bit. This is the price (in terms of overall speed) to pay for flexibility and ease of use. If you know the bits you're reading
Surely there exist some memory locations on the Raspberry Pi which connect directly to the GPIO (even if some of them are control registers for the GPIO device itself?) Can I not just declare a pointer to all the relevant memory addresses and set those memory locations to values to write data to the gpio and read the value of the pointer to read data back in? (Assuming I set the correct values in control registers first to set what is an input / output etc?) Also I understand that there is some sort of issue with "hardware memory addresses" and "linux memory addresses" and possibly even a 3rd type of memory address. The hardware one being the actual address, and the linux one being the address linux uses instead of the actual address. I have no clue why this is done however...
Yes, these exist and that's exactly what the mmap() calls are doing. You then declare a pointer to those memory mapped regions and it goes directly to the hardware without passing 'go' or collecting £200 ...
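As a rough sketch of that pointer-to-registers idea, assuming an original Pi (BCM2835, with the GPIO registers at physical address 0x20200000; later models use a different base) and root access to /dev/mem. The function names here are hypothetical and are not wiringPi's API; the register offsets are the ones given in the BCM2835 peripherals datasheet.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumption: original Raspberry Pi (BCM2835). Other models differ. */
#define GPIO_PHYS_BASE 0x20200000u
#define GPIO_MAP_LEN   4096u

/* Register offsets in 32-bit words (byte offset / 4). */
enum { GPFSEL0 = 0, GPSET0 = 7, GPCLR0 = 10, GPLEV0 = 13 };

/* Map the GPIO register page into this process. Returns NULL on failure. */
volatile uint32_t *gpio_map(void)
{
    int fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd < 0)
        return NULL;
    void *p = mmap(NULL, GPIO_MAP_LEN, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, GPIO_PHYS_BASE);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    return p == MAP_FAILED ? NULL : (volatile uint32_t *)p;
}

/* Make a pin an input: clear its 3-bit function field.
 * Each GPFSEL register holds the fields for 10 pins. */
void gpio_set_input(volatile uint32_t *gpio, unsigned pin)
{
    assert(pin < 54);
    gpio[GPFSEL0 + pin / 10] &= ~(7u << ((pin % 10) * 3));
}

/* Read one pin: pick the bank (0 or 1) and the bit from the BCM pin number. */
unsigned gpio_read_pin(volatile uint32_t *gpio, unsigned pin)
{
    assert(pin < 54);
    return (gpio[GPLEV0 + pin / 32] >> (pin % 32)) & 1u;
}

/* Read all of bank 0 (GPIO 0..31) with a single register access. */
uint32_t gpio_read_bank0(volatile uint32_t *gpio)
{
    return gpio[GPLEV0];
}

/* Pick 8 bits out of a bank-0 snapshot; pins[0] becomes bit 0 of the result.
 * The choice of pins is whatever your wiring is (hypothetical here). */
uint8_t gpio_gather_byte(uint32_t level, const unsigned pins[8])
{
    uint8_t value = 0;
    for (int i = 0; i < 8; i++) {
        assert(pins[i] < 32);
        value |= (uint8_t)(((level >> pins[i]) & 1u) << i);
    }
    return value;
}
```

Typical use would be to call gpio_map() once, gpio_set_input() for each data pin, then one gpio_read_bank0() per sample followed by gpio_gather_byte() to pick out your 8 bits. Reading the whole bank in one go costs one register access per byte rather than eight.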
My guess is that this would be the absolute fastest way, and that performance would then be limited by the CPU, hence under ideal situations, you would get a maximum bit-bash rate of 1/2 a giga baud. (Half of 1000 MHz overclocked R-Pi clock speed, assuming that 1 instruction was set the gpio high and the next was to set it low again and that these instructions executed in 1 cycle, which clearly they will not.)
The fastest you will probably easily get via GPIO might be via the SPI interface. You can clock that at 32Mb/sec. That still falls short of what you need by the sounds of it. However, if it's 8-bit data, then maybe you can buffer the incoming data in SRAM, then pull it on the Pi, 8 bits at a time (with some sort of clocked address generator, tri-state buffers, etc.; that might be fewer chips than shift-registers).
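For a feel of what that SPI clock rate buys you, here is a back-of-envelope sketch. It assumes 32Mb/sec means 32 megabits per second and ignores any per-transfer setup overhead, so real figures will be worse. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Time in microseconds to clock nbytes over a serial link
 * running at bits_per_sec, ignoring all software overhead. */
double transfer_time_us(size_t nbytes, double bits_per_sec)
{
    assert(bits_per_sec > 0.0);
    return (double)nbytes * 8.0 * 1e6 / bits_per_sec;
}
```

Under those assumptions, the 512 bytes from a batch of 256 samples would take about 128 µs on the wire, i.e. transfer_time_us(512, 32e6).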
I also suggest you look at the pianalyzer project - they were doing high speed data sampling, but I have a funny feeling the maximum rate they could get to get reliable samples was not much more than 1M samples/sec. (over a number of bits)
-Gordon