Since Broadcom released complete documentation for the VideoCore IV GPU back in February 2014, we’ve seen a number of fun uses of our 24 GFLOPs of QPU compute, from Andrew Holme’s FFT library to Pete Warden’s deep learning experiments. It’s not unusual to see a 10x increase in performance over the ARM for algorithms with a …
Back in June, we mentioned Pete Warden’s port of the Deep Belief image-recognition SDK to the Pi, which used the VideoCore IV QPUs to provide an accelerated GEMM matrix-multiply function. Since then, Pete’s been optimizing his code, and has reduced the time required to process an image to 3 seconds (versus 20 seconds for the …
Update: As Andrew mentions in the comments below, we recently made the source for Andrew Holme’s accelerated FFT library available. I’d encourage aspiring Pi GPGPU hackers to take a look at this – it’s an incredibly tight piece of code. You may have noticed a certain lack of blog action over the last few days. …