Sorry, I know very little on the GL side. However, if you're pulling frames back to the ARM, then by default VCHIQ is DMA copying every frame. I'm not sure how well VCHIQ deals with the threading in those situations.
There is a zero copy option in MMAL that uses the vc-sm kernel driver to map the GPU memory allocation into ARM address space, thereby removing the need for that copy.
Call "mmal_port_parameter_set_boolean(camera->output, MMAL_PARAMETER_ZERO_COPY, MMAL_TRUE)" before enabling the port or creating the pool of buffers, and ensure you use "mmal_port_pool_create(camera->output, num, size)" from mmal_util.h (not mmal_pool_create(num, size)), and the magic should just work. Buffer handles and address mapping should all be automagic.
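As a rough sketch of that ordering (not tested here, and only buildable on a Pi with the MMAL userland headers and libraries installed), something like the following should be close. The helper name setup_zero_copy_port and the choice to use the port's recommended buffer count and size are my assumptions for illustration. It also assumes the port format has already been committed, and it does not cover sending the pool's buffers to the port afterwards.

```c
#include <stdio.h>
#include "interface/mmal/mmal.h"
#include "interface/mmal/util/mmal_util.h"        /* mmal_port_pool_create */
#include "interface/mmal/util/mmal_util_params.h" /* mmal_port_parameter_set_boolean */

/* Hypothetical helper: request zero copy on a port, then create the pool
 * and enable the port, in that order. Returns NULL on failure. */
static MMAL_POOL_T *setup_zero_copy_port(MMAL_PORT_T *port, MMAL_PORT_BH_CB_T cb)
{
    MMAL_STATUS_T status;
    MMAL_POOL_T *pool;

    /* Must happen before the pool is created and before the port is enabled. */
    status = mmal_port_parameter_set_boolean(port, MMAL_PARAMETER_ZERO_COPY, MMAL_TRUE);
    if (status != MMAL_SUCCESS) {
        fprintf(stderr, "failed to set zero copy (status %d)\n", (int)status);
        return NULL;
    }

    /* Assumption: use the sizes the component recommends. */
    port->buffer_num = port->buffer_num_recommended;
    port->buffer_size = port->buffer_size_recommended;

    /* Use the port-aware pool call, not mmal_pool_create(), so the payloads
     * are allocated through the port. */
    pool = mmal_port_pool_create(port, port->buffer_num, port->buffer_size);
    if (!pool) {
        fprintf(stderr, "failed to create buffer pool\n");
        return NULL;
    }

    status = mmal_port_enable(port, cb);
    if (status != MMAL_SUCCESS) {
        fprintf(stderr, "failed to enable port (status %d)\n", (int)status);
        mmal_port_pool_destroy(port, pool);
        return NULL;
    }

    return pool;
}
```

You would call it with the camera output port and your buffer callback, e.g. setup_zero_copy_port(camera->output, my_callback), where my_callback is whatever callback you already use for that port.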
Shout if things don't work and I'll try to help. raspiraw is using zero copy if you want an example.
Software Engineer at Raspberry Pi Trading. Views expressed are still personal views.
Please don't send PMs asking for support - use the forum.
I'm not interested in doing contracts for bespoke functionality - please don't ask.