I have some questions about the tradeoffs for different ways of reading from an ADC via SPI.
The problem I'm trying to solve is detecting (and timestamping) a fairly short (<1ms) spike in the strength of a signal. The signal of interest is in roughly the 1-4kHz band, so sampling at about 10ksps should do. That would need an SPI clock frequency a bit north of 100kHz, since it takes at least 10.5 cycles to read out one 10-bit sample, which is well within the limits of an MCP3002. Additionally, I need to measure for minutes at a time, I don't want to miss any of the relatively infrequent spikes, and I need to measure the elapsed time of each spike relative to the start of the sampling window.
After reading the BCM2835 datasheet and a bunch of old threads on SPI, I see a few different approaches are out there and I'm hoping you can help me figure out which one is most appropriate. The extent of my knowledge of SPI is just what I've found in the last few days, so it could well be that I'm missing something obvious. Anyway, here's what I've found:
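As a sanity check on that clock estimate, here is a rough sketch of the arithmetic. The 16-clock case is an assumption on my part (it is what you would get if the SPI controller only transfers whole 8-bit words, i.e. two bytes per sample), and neither figure accounts for the idle time CS spends high between conversions.

```python
def min_spi_clock_hz(sample_rate_sps, clocks_per_sample):
    """Lower bound on the SPI clock: samples per second times clocks per sample."""
    return sample_rate_sps * clocks_per_sample

# 10.5 clocks per sample is the figure quoted above; 16 is an assumed
# two-byte frame on a controller that only moves whole 8-bit words.
print(min_spi_clock_hz(10_000, 10.5))  # 105000.0
print(min_spi_clock_hz(10_000, 16))    # 160000
```

Either way the required clock stays in the low hundreds of kHz.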
1. Use spidev ioctl()s: after an initialization step that sets the clock frequency, etc., perform the read-flavored ioctl in a loop.
It seems like people have not had great luck getting very high sampling rates out of this method, but since I only need 10ksps, that's probably attainable.
It's not clear to me how reliable this is in terms of not missing any samples. Does the kernel do some sort of buffering, or is it up to you to call ioctl() enough to not miss any samples? If, for instance, your process gets context switched away, it could be several ms before your process runs again. This can be controlled somewhat with isolcpus, remapping interrupts, and sched_setaffinity to get your process on a mostly idle core, but it's still not great. If the kernel is not doing buffering on my behalf, then presumably I would need to read data before the FIFO fills. I couldn't figure out exactly what the size of the FIFO was for SPI0, though. The low-speed SPI1 and SPI2 have 4-word FIFOs (page 20 in the BCM2835 datasheet), for what that's worth. If SPI0 is similar, then that's only 300us of time at 10ksps.
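To make approach 1 concrete, this is roughly the loop I have in mind, as a minimal sketch. `transfer` is a hypothetical callable standing in for whatever performs one spidev transfer and returns a sample value; `sample_loop` and `find_gaps` are names I made up for illustration. Each sample is stamped with the elapsed time since the start of the window, so a long scheduling delay would at least show up as a gap in the timestamps.

```python
import time

def sample_loop(transfer, n_samples, period_ns, now_ns=time.monotonic_ns):
    """Call transfer() once per period; return a list of (elapsed_ns, value)."""
    start = now_ns()
    samples = []
    for i in range(n_samples):
        deadline = start + i * period_ns
        while now_ns() < deadline:
            pass  # busy-wait until the next sample slot
        value = transfer()  # hypothetical: one SPI transfer, one sample
        samples.append((now_ns() - start, value))
    return samples

def find_gaps(samples, period_ns, tolerance=1.5):
    """Indices whose spacing from the previous sample exceeds tolerance * period."""
    return [
        i
        for i in range(1, len(samples))
        if samples[i][0] - samples[i - 1][0] > tolerance * period_ns
    ]
```

At 10ksps `period_ns` would be 100_000. The busy-wait burns a core, which is why I mention isolcpus above.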
2. `mmap` the SPI registers and read/write from them.
This will avoid the userspace / kernel transition, but is of course much more fiddly to set up. It suffers from the same context switch problem that #1 does.
I don't see how you would accurately tell when there is a new word of data to read from the FIFO register. You can poll the FIFO register as fast as you want, but how do you know when the result is valid?
Is this what pigpio does? Maybe?
3. `mmap` the DMA control registers and set up the DMA engine to write in a loop to 2 pages, circular buffer style.
This is even more fiddly, but provides much more forgiving timing for being able to read every sample. Instead of a couple of words' worth of 10-bit samples, you could simply map 2 4096-byte pages and configure the DMA to write 4090 (multiple of 10) bytes into each one by having a corresponding infinite loop of control blocks. The DMA engine can be configured to cause interrupts as it shifts from one control block to another, which presumably I could figure out how to handle. (Note the errata for p52, though; the control block structure would have to be a bit weird to work around that DMA bug: https://elinux.org/BCM2835_datasheet_errata) 409 samples is about 40ms worth, a much more forgiving time window for a process to get scheduled and process the recent input data.
DMA does require a lot more totally unsafe poking around in registers, plus figuring out which DMA channel the kernel isn't already using. There is also the always fun problem of creating issues that aren't solved by simply killing the process, since the DMA engine will happily continue doing what it's doing until it's told to stop.
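The size of that time window depends on how many bytes each sample occupies in the buffer, which I have not pinned down. A small sketch of the bookkeeping, assuming purely for illustration that each sample is stored as 2 bytes and that samples arrive at a fixed rate (both function names are hypothetical):

```python
def buffer_window_s(buffer_bytes, bytes_per_sample, sample_rate_sps):
    """Seconds of signal held by one buffer of whole samples."""
    return (buffer_bytes // bytes_per_sample) / sample_rate_sps

def sample_time_s(buffers_completed, index_in_buffer,
                  samples_per_buffer, sample_rate_sps):
    """Elapsed time of a sample since the start of the sampling window,
    derived from its position in the stream rather than from a clock read."""
    total_index = buffers_completed * samples_per_buffer + index_in_buffer
    return total_index / sample_rate_sps

# Assumed packing: 2 bytes per sample in a 4096-byte page at 10 ksps.
print(buffer_window_s(4096, 2, 10_000))
print(sample_time_s(3, 10, 2048, 10_000))
```

If the pacing really is fixed, the spike timestamp falls out of the sample index, so the process only has to keep up with whole buffers rather than individual samples.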
4. Finally, I also have another more general SPI/ADC question. The datasheet for the ADC (http://ww1.microchip.com/downloads/en/D ... 21294C.pdf, see section 5) says that when CS is brought low, a few bits of configuration (which channel, etc) are written to the ADC, then a few cycles later 10 bits of output can be read out, after which every bit will be zero. Does this mean you need to bring CS high, then low, and re-send the configuration bits for every sample?
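For reference, this is how I currently read the framing when the conversion is done as one two-byte transfer with CS low for the duration. The bit layout is my interpretation of the datasheet (a leading zero, the start bit, SGL/DIFF, ODD/SIGN and MSBF in the first byte, with the 10 result bits right-aligned in the two reply bytes), so treat it as an assumption to be checked; the function names are made up.

```python
def mcp3002_command(channel):
    """Two bytes to clock out for one single-ended conversion (assumed layout)."""
    if channel not in (0, 1):
        raise ValueError("MCP3002 has channels 0 and 1 only")
    # First byte: 0 | start=1 | SGL/DIFF=1 | ODD/SIGN=channel | MSBF=1 | 0 0 0
    # Second byte: don't-care bits clocked out while the result is read in.
    return [0x68 | (channel << 4), 0x00]

def mcp3002_decode(reply):
    """Extract the 10-bit result from the two bytes clocked in."""
    # Low 2 bits of the first byte are B9..B8; the second byte is B7..B0.
    return ((reply[0] & 0x03) << 8) | reply[1]

print(mcp3002_command(0))            # [104, 0]
print(mcp3002_decode([0x03, 0xFF]))  # 1023
```

Under this reading, every sample is its own CS-framed transfer with the configuration bits sent again each time, which is exactly the part I would like confirmed.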
Thanks for any insight you can provide!