GLES gets super slow when also running video capture

Posted: Thu Oct 19, 2017 4:38 am
by jwatte
I have set up the video capture to capture at 640x480, at 90 fps.
I also use the h264 encoder.
The "preview" output is set to send YUV420 data to the ARM side. I don't need screen preview, but I need the data for analysis, and this seems more efficient than using a splitter like RaspiVid does.

Now, I add a GLES2 GUI. It runs a simple loop of clear screen, generate and draw a few things using vertex buffer objects, and swap buffers.
This GLES2 GUI runs at only a handful of frames per second, about 5 or so.
When I don't start the video capture pipeline, the same GUI runs at a solid 60 fps (which is the refresh rate of my 1024x600 7" HDMI display.)

Timing the different things going on, it turns out that almost every GL call takes anywhere from dozens to hundreds of milliseconds.
If the VideoCore were somehow fill rate limited because of the video capture, I'd expect the swapbuffers call to take all the time, but that's not what's happening.
Instead, it seems like some kind of handshake with the VideoCore is what's going on. Removing every call to glGetError() increases the frame rate from 4-5 to 10-12 fps, so that's part of the problem, but it's not enough to explain all of it.

What could be going on? gprof doesn't show me where the delays are, and CPU/top does not show a pegged core, so the process must be waiting for some kind of external event.
How can I diagnose and debug this?
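
One way to narrow this down, as a minimal sketch rather than a definitive method: time each call against a monotonic wall clock and log only the slow ones. Wall-clock time is used because the thread appears to be blocked rather than computing, which would also explain why a CPU-time profiler shows nothing. The names now_ms and TIMED are hypothetical helpers, not part of GLES or any Raspberry Pi library.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime on strict compilers */
#endif
#include <stdio.h>
#include <time.h>

/* Wall-clock milliseconds from a monotonic clock (unaffected by clock changes). */
static double now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1.0e6;
}

/* Hypothetical helper: run `call` and report it on stderr if it took
 * longer than `limit_ms` milliseconds of wall-clock time. */
#define TIMED(limit_ms, call)                                   \
    do {                                                        \
        double t0_ = now_ms();                                  \
        call;                                                   \
        double dt_ = now_ms() - t0_;                            \
        if (dt_ > (limit_ms))                                   \
            fprintf(stderr, "%s:%d: %s took %.2f ms\n",         \
                    __FILE__, __LINE__, #call, dt_);            \
    } while (0)
```

Wrapping each call in the render loop, for example TIMED(2.0, glDrawArrays(GL_TRIANGLES, 0, count)), should show which calls block and for how long. The 2 ms threshold is an arbitrary choice.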

Re: GLES gets super slow when also running video capture

Posted: Sun Oct 22, 2017 3:00 pm
by jwatte
It is very clear -- when the VideoCore gets busy, the OpenGL layer starts taking a long time for each command.
Is there some way to batch up all commands in the stream and let them go at the end, so I don't have to wait for VC turn-around more than once per frame?
Also, is there any documentation on which bits actually synchronize with the VC and which GL functions are pure client-side?

Re: GLES gets super slow when also running video capture

Posted: Tue Oct 31, 2017 4:51 pm
by peepo
I had 90 fps working okay. The code is a little old, but I still get occasional reports...

https://github.com/peepo/openGL-RPi-tutorial

the last few tutorials may be enough

best

Re: GLES gets super slow when also running video capture

Posted: Tue Oct 31, 2017 4:53 pm
by jwatte
This is not doing what I'm doing at all, so I don't understand how it would help.
Could you be a little more specific in what you think the problem is, and how this link helps illustrate a possible solution?

Re: GLES gets super slow when also running video capture

Posted: Tue Oct 31, 2017 5:04 pm
by 6by9
Sorry, I know very little about the GL side. However, if you're pulling frames back to the ARM, then by default VCHIQ is DMA copying every frame. I'm not sure how well VCHIQ deals with the threading in those situations.

There is a zero copy option in MMAL that uses the vc-sm kernel driver to map the GPU memory allocation into ARM address space, thereby removing the need for that copy.
Call "mmal_port_parameter_set_boolean(camera->output[0], MMAL_PARAMETER_ZERO_COPY, MMAL_TRUE)" before enabling the port or creating the pool of buffers, and ensure you use "mmal_port_pool_create(camera->output[0], num, size)" from mmal_util.h (not mmal_pool_create(num, size)), and the magic should just work. Buffer handles and address mapping should all be automagic.

Shout if things don't work and I'll try to help. raspiraw is using zero copy if you want an example.
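
The ordering described above could look roughly like the sketch below. It only builds on a Raspberry Pi with the MMAL headers and libraries installed. The helper name setup_zero_copy_port and the choice to use the port's recommended buffer count and size are assumptions for illustration, not taken from raspiraw.

```c
#include "interface/mmal/mmal.h"
#include "interface/mmal/util/mmal_util.h"        /* mmal_port_pool_create */
#include "interface/mmal/util/mmal_util_params.h" /* mmal_port_parameter_set_boolean */

/* Hypothetical helper: turn on zero copy for one output port, then create
 * its buffer pool and enable the port. `cb` is the caller's buffer callback.
 * Returns the pool, or NULL on failure. */
static MMAL_POOL_T *setup_zero_copy_port(MMAL_PORT_T *port, MMAL_PORT_BH_CB_T cb)
{
    MMAL_STATUS_T status;
    MMAL_POOL_T *pool;

    /* Must be set before the pool is created and before the port is enabled. */
    status = mmal_port_parameter_set_boolean(port, MMAL_PARAMETER_ZERO_COPY, MMAL_TRUE);
    if (status != MMAL_SUCCESS)
        return NULL;

    /* Assumption: use the buffer count and size the port recommends. */
    port->buffer_num = port->buffer_num_recommended;
    port->buffer_size = port->buffer_size_recommended;

    /* Port-aware pool creation, not the plain mmal_pool_create(num, size). */
    pool = mmal_port_pool_create(port, port->buffer_num, port->buffer_size);
    if (pool == NULL)
        return NULL;

    status = mmal_port_enable(port, cb);
    if (status != MMAL_SUCCESS) {
        mmal_port_pool_destroy(port, pool);
        return NULL;
    }
    return pool;
}
```

For the setup in this thread the port would be camera->output[0]. After the port is enabled, the buffers in the pool still have to be handed to it (mmal_queue_get on pool->queue followed by mmal_port_send_buffer) before any frames arrive.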

Re: GLES gets super slow when also running video capture

Posted: Tue Oct 31, 2017 6:25 pm
by jwatte
Thanks, that sounds like something I should try!

Re: GLES gets super slow when also running video capture

Posted: Fri Nov 24, 2017 2:11 am
by jwatte
I finally got to measure this change, and it made a huge difference! Awesome!
Thank you, @6by9.