I have set up the video capture at 640x480, 90 fps. I also use the H.264 encoder.
The "preview" output is set to send YUV420 data to the ARM side. I don't need screen preview, but I need the data for analysis, and this seems more efficient than using a splitter like RaspiVid does.
Now I add a GLES2 GUI. It runs a simple loop: clear the screen, generate and draw a few things using vertex buffer objects, and swap buffers.
With the capture pipeline running, this GLES2 GUI manages only a handful of frames per second, about 5 or so.
When I don't start the video capture pipeline, the same GUI runs at a solid 60 fps (the refresh rate of my 1024x600 7" HDMI display).
Timing the individual operations shows that almost every GL call takes tens to hundreds of milliseconds.
If the VideoCore were somehow fill-rate limited because of the video capture, I'd expect the eglSwapBuffers call to take all the time, but that's not what's happening.
Instead, it looks like each call involves some kind of handshake with the VideoCore. Removing every call to glGetError() raises the frame rate from 4-5 fps to 10-12 fps, so that's part of the problem, but it's not enough to explain all of it.
What could be going on? gprof doesn't show me where the delays are, and top doesn't show a pegged core, so the process appears to be waiting on some external event.
How can I diagnose and debug this?
