Lich
Posts: 22
Joined: Wed Apr 11, 2018 8:41 pm
Location: Russia

How to optimize frame grabbing from video stream in OpenCV?

Sun Jul 07, 2019 6:59 pm

Hello everyone,

I ran into a problem of low frame-capture efficiency in OpenCV.

1. Hardware & Software.

Raspberry Pi 3 (1.2 GHz quad-core ARM) with HDMI display
IP camera: LAN connected, RTSP, H264 codec, 1280x720 resolution, 20 fps, GOP 1, 2500 kB/s VBR bitrate (parameters can be changed).
OS Raspbian Stretch
Python 3.5
OpenCV 4.1
Gstreamer 1.0

2. Task.

Get the video stream from the IP camera, run image recognition, and display the resulting video (with marks and messages).
Important features: real-time processing, HD resolution (1280x720), high frame rate (>20 fps), continuous operation for several hours.

3. My solution.

General algorithm: source video stream -> decoding and frame grabbing -> work with frames in OpenCV -> assembling the processed frames into a video stream -> display video using a Raspberry Pi GPU

OpenCV's output/display method, imshow, does not work well even with low-resolution video. The only library that makes it possible to use the Raspberry Pi GPU to decode and display video is Gstreamer.
I compiled the Gstreamer modules (gstreamer1.0-plugins-bad, gstreamer1.0-omx) with OMX support and tested them:

Code: Select all

gst-launch-1.0 rtspsrc location='rtsp://web_camera_ip' latency=400 ! queue ! rtph264depay ! h264parse ! omxh264dec ! glimagesink
It works great, CPU usage is about 9%.

Next I compiled OpenCV with Gstreamer, NEON, VFPV3 support.
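
To check that the build really picked up Gstreamer, the build information can be printed from Python (a quick sanity check):

Code: Select all

import cv2

# The "Video I/O" section of the build information should contain
# "GStreamer: YES" if OpenCV was compiled with Gstreamer support.
for line in cv2.getBuildInformation().splitlines():
    if 'GStreamer' in line:
        print(line.strip())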


I use the following code for testing:

Code: Select all

import cv2

src = 'rtsp://web_camera_ip'
stream_in = cv2.VideoCapture(src)

pipeline_out = "appsrc ! videoconvert ! video/x-raw, framerate=20/1, format=RGBA ! glimagesink sync=false"

# fourcc is 0 because the output is a Gstreamer pipeline, not an encoded file
stream_out = cv2.VideoWriter(pipeline_out, cv2.CAP_GSTREAMER, 0, 20.0, (1280, 720))
while True:
    ret, frame = stream_in.read()
    if ret:
        stream_out.write(frame)
        cv2.waitKey(1)
It also worked, but not as well as pure Gstreamer. CPU usage is about 50%; without stream_out.write(frame) it is about 35%. At frame rates above 15 there are lags and delays.
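
A simple way to put a number on those lags is to measure the effective capture rate (a minimal sketch; the camera address is the same placeholder as above):

Code: Select all

import time
import cv2

stream_in = cv2.VideoCapture('rtsp://web_camera_ip')

# Time how long it takes to grab a fixed number of frames.
frames = 0
start = time.time()
while frames < 200:
    ret, frame = stream_in.read()
    if ret:
        frames += 1

print('Effective FPS: %.1f' % (frames / (time.time() - start)))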

4. How I tried to solve the problem.

4.1. Use Gstreamer to decode video stream:

Code: Select all

pipeline_in = 'rtspsrc location=rtsp://web_camera_ip latency=400 ! queue ! rtph264depay ! h264parse ! omxh264dec ! videoconvert ! appsink'
stream_in = cv2.VideoCapture(pipeline_in, cv2.CAP_GSTREAMER)
It actually made the situation worse: the CPU load increased by several percent and the delay grew.
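
One variant worth trying (a sketch, untested) is to force BGR right in the pipeline caps, so appsink hands OpenCV frames in its native format and no conversion is hidden inside VideoCapture:

Code: Select all

import cv2

# Force BGR in the caps before appsink; videoconvert still runs on the
# CPU, but at least the conversion point is explicit and controllable.
pipeline_in = ('rtspsrc location=rtsp://web_camera_ip latency=400 ! queue ! '
               'rtph264depay ! h264parse ! omxh264dec ! '
               'videoconvert ! video/x-raw, format=BGR ! appsink')
stream_in = cv2.VideoCapture(pipeline_in, cv2.CAP_GSTREAMER)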

4.2. I also tried to optimize capture using the method from PyImageSearch.com: threaded frame grabbing with WebcamVideoStream from the imutils library.

Code: Select all

import cv2
from imutils.video import WebcamVideoStream

src = 'rtsp://web_camera_ip'
stream_in = WebcamVideoStream(src).start()
pipeline_out = "appsrc ! videoconvert ! video/x-raw, framerate=20/1, format=RGBA ! glimagesink sync=false"

# fourcc is again 0 because the output goes to a Gstreamer pipeline
stream_out = cv2.VideoWriter(pipeline_out, cv2.CAP_GSTREAMER, 0, 20.0, (1280, 720))
while True:
    frame = stream_in.read()
    if frame is not None:
        stream_out.write(frame)
        cv2.waitKey(1)
CPU usage increased to 70%, and the quality of the output video stream did not change.
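
For reference, the idea behind WebcamVideoStream is a background thread that keeps only the newest frame, so the processing loop never blocks on the network. A minimal version (my own sketch, not the imutils code) looks roughly like this:

Code: Select all

import threading
import cv2

class LatestFrameGrabber:
    """Reads frames in a background thread, keeping only the newest one."""

    def __init__(self, src):
        self.cap = cv2.VideoCapture(src)
        self.lock = threading.Lock()
        self.frame = None
        self.running = True
        threading.Thread(target=self._update, daemon=True).start()

    def _update(self):
        while self.running:
            ret, frame = self.cap.read()
            if ret:
                with self.lock:
                    self.frame = frame

    def read(self):
        with self.lock:
            return self.frame

    def stop(self):
        self.running = False
        self.cap.release()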

4.3. Changing the following parameters did not help: waitKey(1-50), video stream bitrate (1000-5000 kB/s), video stream GOP (1-20).

5. Questions.

As I understand it, the VideoCapture/VideoWriter methods have very low efficiency. Maybe it is not noticeable on a PC, but it is critical for the Raspberry Pi 3.

Is it possible to increase the performance of VideoCapture (VideoWriter)?
Is there an alternative way to capture frames from video into OpenCV?

Thanks in advance for answers!

Lich
Posts: 22
Joined: Wed Apr 11, 2018 8:41 pm
Location: Russia

Re: How to optimize frame grabbing from video stream in OpenCV?

Mon Jul 08, 2019 9:57 pm

I think I know what the problem is, but I don't know how to solve it.

1. A refinement of the CPU usage when working with VideoCapture and VideoCapture+Gstreamer: VideoCapture(src)+VideoWriter(gstreamer_pipeline_out) - 50-60%, VideoCapture(gstreamer_pipeline_in)+VideoWriter(gstreamer_pipeline_out) - 40-50%.
2. Color formats that different parts of my program work with: the H264 video stream is YUV, OpenCV works in BGR, and the OMX layer output is RGBA. OpenCV can only work with frames in the BGR color format, and the OMX layer displays a black screen when I try to feed it the assembled video in any other color format.
3. Color format conversion in the Gstreamer pipeline is carried out by the videoconvert element. In some cases it works automatically (without specified parameters), and it is also possible to force a particular color format. I do not know how this works inside the "pure" VideoCapture(src).

The main problem is that videoconvert does not use the GPU - the bulk of the CPU load comes from the color format conversion!

I tested this assumption using "pure" Gstreamer, adding videoconvert:

Code: Select all

gst-launch-1.0 rtspsrc location='rtsp://web_camera_ip' latency=400 ! queue ! rtph264depay ! h264parse ! omxh264dec ! videoconvert ! video/x-raw, format=BGR ! glimagesink sync=false
Black display, CPU load is 25%.

Check this pipeline:

Code: Select all

gst-launch-1.0 rtspsrc location='rtsp://web_camera_ip' latency=400 ! queue ! rtph264depay ! h264parse ! omxh264dec ! videoconvert ! video/x-raw, format=RGBA ! glimagesink sync=false
Video is displayed, CPU load is 5%. I also assume that omxh264dec converts the color format from YUV to RGBA using the GPU (after omxh264dec, videoconvert does not load the CPU).

4. I don't know how to use the GPU for color format conversion in VideoCapture/Gstreamer on the Raspberry Pi.
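
If omxh264dec really outputs RGBA for free, one idea would be to take RGBA all the way to appsink and do the last conversion step in OpenCV itself with cvtColor, which is NEON-accelerated. This is untested, and it assumes OpenCV's Gstreamer appsink accepts RGBA caps, which depends on the OpenCV build; if it does not, VideoCapture will simply fail to open the pipeline:

Code: Select all

import cv2

# Assumes appsink in this OpenCV build accepts RGBA caps (untested).
pipeline_in = ('rtspsrc location=rtsp://web_camera_ip latency=400 ! queue ! '
               'rtph264depay ! h264parse ! omxh264dec ! '
               'video/x-raw, format=RGBA ! appsink')
stream_in = cv2.VideoCapture(pipeline_in, cv2.CAP_GSTREAMER)

while True:
    ret, frame = stream_in.read()
    if ret:
        # Last conversion step done in OpenCV, NEON-accelerated on ARM
        bgr = cv2.cvtColor(frame, cv2.COLOR_RGBA2BGR)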

In this thread 6by9, a Raspberry Pi engineer and graphics programming specialist, writes that "The IL video_encode component supports OMX_COLOR_Format24bitBGR888 which I seem to recall maps to OpenCV's RGB".


Are there any ideas? If I made a mistake in my reasoning, please correct me. Solving this problem would open up fantastic possibilities for the Raspberry Pi!

invisible999
Posts: 8
Joined: Sun Oct 28, 2018 4:14 pm

Re: How to optimize frame grabbing from video stream in OpenCV?

Wed Nov 27, 2019 9:05 am

Hello,

Did you find any solution to your problem?

I am in the beginning stages of tackling something similar. In my case I want to use a Pi Zero to send a video stream to a Jetson Nano, which will do exactly the same - stream analysis, face recognition, etc.

I've used raspivid and cvlc to send out an RTSP stream; here is the command:

Code: Select all

raspivid -o - -t 0 -w 960 -h 720 -fps 24 -b 250000 -g 1 | cvlc -vvv stream:///dev/stdin --sout '#rtp{access=udp,sdp=rtsp://:8554/stream}' :demux=h264
This actually works, but with a very significant delay - about 2-5 seconds when the stream is played in VLC - and I see that the bandwidth of the stream is about 4-8 Mbit with 3-5% frame loss.

I would like to reduce latency as much as possible, so should I move to Gstreamer instead?

Lich
Posts: 22
Joined: Wed Apr 11, 2018 8:41 pm
Location: Russia

Re: How to optimize frame grabbing from video stream in OpenCV?

Sun Dec 01, 2019 5:23 pm

Hello, invisible999!

The fastest way to capture video frames in OpenCV is to use Gstreamer (you must first compile OpenCV with Gstreamer support). Use the v4l2h264dec and v4l2video12convert (v4l2convert in Gstreamer 1.14) elements to decode the video and convert the color format, respectively. These elements support hardware acceleration on the Raspberry Pi 3/4.

Code: Select all

import cv2

pipeline_r = 'rtspsrc location=rtsp://web_camera_ip latency=100 ! rtph264depay ! h264parse ! v4l2h264dec capture-io-mode=4 ! v4l2video12convert output-io-mode=5 capture-io-mode=4 ! appsink sync=false'
vs = cv2.VideoCapture(pipeline_r, cv2.CAP_GSTREAMER)

while True:
    ret, frame = vs.read()
    if ret:
        cv2.waitKey(1)

Read more in this topic: https://www.raspberrypi.org/forums/view ... 7#p1504977


In my project there was no need to send an RTSP/UDP video stream, but I tested this option in Gstreamer.


In Gstreamer 1.14 the v4l2h264enc encoder works with errors; omxh264enc must be used instead.

Code: Select all

gst-launch-1.0 -v filesrc location=file_location ! qtdemux ! h264parse ! v4l2h264dec ! omxh264enc ! rtph264pay ! udpsink host=localhost port=9001


And yes, I am also testing the Jetson Nano for video processing and object recognition, because the Raspberry Pi's CPU/GPU power is not enough for HD / 25 fps video (more precisely, the problem is in the GPU drivers). However, I will try to develop my project for both platforms.

invisible999
Posts: 8
Joined: Sun Oct 28, 2018 4:14 pm

Re: How to optimize frame grabbing from video stream in OpenCV?

Mon Dec 02, 2019 5:10 am

Hello Lich,

First of all, thanks for replying and providing additional information. Do you document the progress of your project somewhere in more detail? If so, would you mind sharing it?

When you said that the performance of the Pi is not enough, what use case do you mean? After writing my message I came across a couple of posts about Gstreamer and its usage of the GPU - this page was quite useful to me: https://www.accuware.com/support/dragon ... streaming/. I tried to stream out RTSP using the method described on that page, and at 960x720 resolution with 30 fps the CPU load on the Pi Zero does not exceed 25%, with 1-2.5 Mbit of bandwidth used for the RTSP stream.

I am still trying to understand the relationships between the components on the Pi (and the Jetson as well) - Gstreamer, its options, GPU and driver usage - and what needs to be done to use them efficiently. Could you recommend where to read more up-to-date information on this subject?

I didn't try any image classification/recognition on the Pi itself because I bought the Jetson for that purpose, but there are a bunch of things which I do not understand or which are not properly documented: what frameworks/tools to use to process the stream from the Pi, how to enable proper and efficient usage of the GPU on the Jetson, what performance/fps to expect, etc.

You seem to have a wealth of information, and I would love to ask more questions. We can use private messaging if you prefer, and I can write in Russian as well.

Lich
Posts: 22
Joined: Wed Apr 11, 2018 8:41 pm
Location: Russia

Re: How to optimize frame grabbing from video stream in OpenCV?

Mon Dec 02, 2019 9:27 pm

invisible999,

1. I try to document my project, but it is at the initial stage of development. This is my hobby; I started development a year ago, without any serious knowledge of the Raspberry Pi or computer vision. In short, my system should perform the following actions: receive a video stream from an IP camera, process it (to improve visibility), recognize certain objects, and display warnings. In parallel, the system controls the operation of the camera (primarily zoom) and also interacts with some other devices and sensors. It is supposed to work with ONVIF, Pelco (over RS-485), and GPIO. I use Python, OpenCV, and Gstreamer. Later, I plan to publish a description of my project. I can try to answer your questions if I have already solved a similar problem as part of my project.

2. The RPi performance is probably quite enough to stream video from a CSI camera (in this case hardware stream reading is used). Displaying HD video using OpenCV (with OpenGL) loads the CPU by about 30%. Object recognition at the same time loads the system by more than 60% and makes it impossible to output video in real time. Unfortunately, the RPi GPU does not support OpenCL, let alone CUDA, which would speed up neural networks and image processing. It may be possible to speed up the recognition process by skipping a few frames, as sketched below.
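
A minimal frame-skipping sketch: run recognition only on every N-th frame and reuse the last result in between (detect() here is a placeholder for the real recognizer):

Code: Select all

import cv2

def detect(frame):
    # Placeholder for the real recognizer; returns a list of detections.
    return []

vs = cv2.VideoCapture('rtsp://web_camera_ip')
N = 5  # run recognition on every 5th frame only
counter = 0
last_result = []

while True:
    ret, frame = vs.read()
    if not ret:
        continue
    if counter % N == 0:
        last_result = detect(frame)
    counter += 1
    # draw last_result marks on frame and display it here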

3. I collect information in small pieces from all over the Internet. One of the most interesting resources about computer vision, OpenCV, and Python is https://www.pyimagesearch.com

4. Yes, you can write me a private message, but I do not often visit this forum.
