The driver now contains a first implementation of the spi_prepare_message interface that will go live with Linux 3.14.
It is currently a bit of a "hack", put in place without any updates to the SPI infrastructure, and drivers that want to use it have to call the interface "manually".
As I am also working on a new driver for the mcp2515 CAN controller, I have modified it to make use of that interface.
The result is:
Without prepare I see:
- 14.6k interrupts
- 17.2k context switches
- 88% CPU system load
With prepare I see:
- 29.2k interrupts
- 34.5k context switches
- 80% CPU system load
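As a quick sanity check, the ratios between the two runs can be computed directly from the numbers above (the script is just illustrative arithmetic on the measured values):

```python
# Measured values from the two 10-second sampling runs above.
without_prepare = {"irqs": 14_600, "ctx": 17_200, "cpu": 88}
with_prepare = {"irqs": 29_200, "ctx": 34_500, "cpu": 80}

irq_ratio = with_prepare["irqs"] / without_prepare["irqs"]   # ~2.0x
ctx_ratio = with_prepare["ctx"] / without_prepare["ctx"]     # ~2.0x
cpu_delta = without_prepare["cpu"] - with_prepare["cpu"]     # 8 percentage points

print(f"interrupt ratio:       {irq_ratio:.2f}x")
print(f"context-switch ratio:  {ctx_ratio:.2f}x")
print(f"CPU system load saved: {cpu_delta} percentage points")
```

So both interrupts and context switches almost exactly double, while CPU system load drops by 8 percentage points.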
Unfortunately I have not yet had time to look at what actually happens on the SPI bus lines themselves.
But from the looks of it, prepare doubles the interrupts and context switches while reducing system load.
One interpretation is that the prepared DMA version is much faster, so it handles each of these packets individually with its own interrupt, while the version without prepared messages seems to handle two packets at a time and is thus at a higher risk of a buffer overflow in the controller itself.
As said: I cannot confirm this guess yet, but at least it looks promising...
If I can confirm it, I will take the next step and try to work without any external thread for SPI queuing (and the corresponding context switches), which should REALLY bring down CPU utilization, context switches and interrupts...
But again: this is based on a driver that is already "focused" on streamlined SPI transfers.
Theoretically, with the "inline DMA queueing" approach it should work with 3 interrupts per message and no context switches.
Right now, calculating interrupts and context switches with the one_message approach, I get 4 interrupts per message and the same number of context switches; those context switches may themselves require additional interrupts.
Also, the above does NOT account for other overheads due to the packet-delivery framework in Linux.
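The per-message accounting above can be sketched as a back-of-the-envelope model. The per-message costs are the figures quoted in the text; the message rate is a hypothetical example value, chosen to be roughly in the range of the packet counts seen per sample:

```python
# Back-of-the-envelope model of the two queuing approaches described above.
def per_second(msg_rate, irqs_per_msg, ctx_per_msg):
    """Interrupts and context switches per second for a given message rate."""
    return msg_rate * irqs_per_msg, msg_rate * ctx_per_msg

MSG_RATE = 3_400  # hypothetical messages/second, for illustration only

# one_message approach: 4 interrupts/message plus as many context switches
irqs_now, ctx_now = per_second(MSG_RATE, 4, 4)

# "inline DMA queueing": 3 interrupts/message and no context switches
irqs_inline, ctx_inline = per_second(MSG_RATE, 3, 0)

print(f"one_message:         {irqs_now} irqs/s, {ctx_now} ctx-switches/s")
print(f"inline DMA queueing: {irqs_inline} irqs/s, {ctx_inline} ctx-switches/s")
```

This ignores the context-switch-induced interrupts and the packet-delivery overhead mentioned above, so it is a lower bound on the expected savings, not a prediction.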
When removing the "overhead" of packet delivery (by commenting out that part), CPU load goes down to 65%, but interrupts increase further to 30.8k and context switches to 36.5k.
"Abusing" some counters, I can say that for the version without "prepare", 1528 out of 3438 packets are delivered from the second CAN buffer.
With the prepared interface the counters show only 57 out of 3283 packets delivered from the second buffer.
So this really explains the doubling of the number of interrupts - even _without_ looking at the logic analyzer...
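Expressed as percentages (pure arithmetic on the counter values above):

```python
# Share of packets delivered from the second CAN buffer, per the counters above.
second_buf_without = 1528 / 3438  # without "prepare"
second_buf_with = 57 / 3283       # with the prepared interface

print(f"without prepare: {second_buf_without:.1%} from second buffer")
print(f"with prepare:    {second_buf_with:.1%} from second buffer")
# Roughly 44% vs under 2%: without prepare, almost half the packets arrive
# via the second buffer, i.e. two packets get serviced per interrupt, which
# matches the halved interrupt count of that run.
```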
P.S.: one more observation - as you have seen, the packet rate from the source is not 100% constant; that is because the source resets itself every 4 seconds (which results in a period without messages). As the above values are taken from 10-second averages, we see some variation depending on when the 10-second "sampling" window starts.