Mon Nov 05, 2012 2:48 pm
While you would not want to do that for just a few scalars (the latency cost of the op would be enormous), the normal way to do GPGPU is through FP32 (single-precision) or FP16 (half-precision) textures containing the arguments, potentially on a fast client-update path if the arg data is highly dynamic, with the result rendered into an FP32 or FP16 target, again preferably on a quick read-back path for fast consumption. That said, the current RPi ES stack does not support float textures, and the fast texture-update and target-readback paths are still being figured out. There have been early attempts on these boards to discover such paths, but no positive results so far, AFAIK. I plan to revisit the problem when I have spare time.
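If you want to check for float texture support on a given ES 2.0 stack yourself, here is a minimal sketch of the usual test. It assumes the driver advertises float textures through the standard GL_OES_texture_float and GL_OES_texture_half_float extension names. The helper name has_gl_extension is my own, not part of any API. It only does the string matching, so it can be tried without a GL context.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of GLES): returns 1 if `name` appears as a
 * complete space-separated token in `ext_list`, 0 otherwise. A plain
 * strstr() is not enough, because one extension name can be a prefix of
 * another (e.g. GL_OES_texture_float vs GL_OES_texture_float_linear). */
int has_gl_extension(const char *ext_list, const char *name)
{
    if (ext_list == NULL || name == NULL || *name == '\0')
        return 0;

    size_t len = strlen(name);
    const char *p = ext_list;

    while ((p = strstr(p, name)) != NULL) {
        /* The match must start at the beginning of the list or after a space... */
        int starts_token = (p == ext_list) || (p[-1] == ' ');
        /* ...and must end at a space or at the end of the list. */
        int ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return 1;
        p += 1; /* keep scanning after a partial match */
    }
    return 0;
}
```

With a current context you would pass in the string returned by glGetString(GL_EXTENSIONS), e.g. has_gl_extension((const char *)glGetString(GL_EXTENSIONS), "GL_OES_texture_float"). If that returns 0 for both the float and half-float names, the texture-based approach described above is not available on that stack.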