Akronoz
Posts: 2
Joined: Mon Jul 04, 2016 7:34 am

Minimal latency for pixel data?

Mon Jul 04, 2016 8:39 am

Hi, I'm working with the camera module, and I've noticed considerable latency between the shutter capturing the frame and the moment I can access it in memory.

I don't know exactly what the pipeline is from the camera to the point where you can process the pixel data.

Camera ---> CSI ---> GPU ---> Bayer to YUV? ---> H264? ---> CPU? ---> memory data?

I don't need any color conversion or compression; the most important thing in my project is to minimize the time between the shutter and the data being available wherever I can get it first, CPU or GPU.

What is the strategy to accomplish that?

Should I use MMAL? What configuration achieves low latency? Should I put my algorithm on the CPU side to avoid data transfers to the GPU, or, since the image is on the GPU first, is it best to place the algorithm there? What should I take into account to avoid the latest frame being dropped and getting the previous one instead?

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 23076
Joined: Sat Jul 30, 2011 7:41 pm

Re: Minimal latency for pixel data?

Mon Jul 04, 2016 10:42 am

The pipeline is complex, but you can ignore most of it as it's realtime.

(Camera->CSI->Unicam peripheral->ISP (25 stages of processing)->encoder->memory)

The largest delay is actually starting up the camera from scratch. This can take at least 500ms, and then you need some frames to pass through to get the white balance and gains correct. There is also mode changing - if you are running preview, that needs to stop, then stills capture mode starts up, then it returns to preview. All of that takes time.

There are loads of posts on here about ways to improve these times - already having the camera running, etc. - and there are some options to avoid too much statistics gathering.
Principal Software Engineer at Raspberry Pi (Trading) Ltd.
Contrary to popular belief, humorous signatures are allowed. Here's an example...
"My grief counseller just died, luckily, he was so good, I didn't care."

6by9
Raspberry Pi Engineer & Forum Moderator
Posts: 7008
Joined: Wed Dec 04, 2013 11:27 am
Location: ZZ9 Plural Z Alpha, aka just outside Cambridge.

Re: Minimal latency for pixel data?

Mon Jul 04, 2016 11:16 am

Define considerable, and how are you measuring the latency?

It should be <100ms if H264 encoding, or less if reading the raw pixels. If using H264, your app MUST keep up with the data produced or an internal 2MB FIFO gets fuller and you'll see apparent latency.

Please remember that the sensor has a rolling shutter, so the first line of the frame is exposed about 30ms before the last line.
Approx timeline for normal preview:
- At the end of the exposure time for the first line, it will be sent to the SoC. Lines then follow at the rate set up by the sensor mode.
- Assuming not transposing, when around 200 lines have been received, the ISP will start processing the data.
- 1/(max framerate) seconds later, the last line will be received. If in the 1296x972 2x2 binned sensor mode, that can go up to 42fps, so 23.8ms. 1080P would be 33ms as it can only go at 30fps.
- Within <10ms of the last line being received, the output of that line will be completed, finishing the frame.
- If asking for YUV data, that should be available to your app within a couple of milliseconds of then. If asking for RGB data it will take longer as there is a more involved image conversion required.
- If asking for H264 data, encoding the frame should take around 40ms for a 1080P frame, less for smaller frames. The encoded data will be available to your app within a couple of milliseconds of the encode completing.
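Putting those steps together as a very rough budget (just summing the figures above, so treat the totals as ballpark only): for the 1296x972 binned YUV case that's ~23.8ms readout + <10ms ISP tail + a couple of ms delivery, so roughly 34-36ms from start of frame to your callback; for 1080P YUV it's ~33ms + <10ms + ~2ms, so roughly 45ms; and for 1080P H264 it's ~33ms readout + ~40ms encode + ~2ms, so roughly 75ms.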

Each MMAL buffer has a PTS value that is the GPU STC (System Time Clock) when it received the start of frame interrupt, assuming the use_stc_timestamp mode is set to MMAL_PARAM_TIMESTAMP_MODE_RAW_STC. The current value of that clock can be read back with mmal_port_parameter_get_uint64(port, MMAL_PARAMETER_SYSTEM_TIME, &uint64_value);

I've just made a couple of quick mods to raspividyuv and raspivid to read that value during the callback and print the difference. Sorry, I can't post the diffs at the moment as my Pi has no network connection.
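The gist of the change is along these lines (an illustrative sketch only, not the actual diff - the real callbacks also hand the payload to the file writer and send a replacement buffer back to the port):

Code:

/* Sketch: measure delivery latency inside a MMAL port callback.
 * Assumes the camera was configured with use_stc_timestamp =
 * MMAL_PARAM_TIMESTAMP_MODE_RAW_STC, so buffer->pts is the raw STC
 * captured at the start-of-frame interrupt. */
#include <stdio.h>
#include "interface/mmal/mmal.h"
#include "interface/mmal/util/mmal_util_params.h"

static void camera_buffer_callback(MMAL_PORT_T *port, MMAL_BUFFER_HEADER_T *buffer)
{
   static int64_t last_pts;
   uint64_t stc = 0;

   if (mmal_port_parameter_get_uint64(port, MMAL_PARAMETER_SYSTEM_TIME, &stc) == MMAL_SUCCESS &&
       buffer->pts != MMAL_TIME_UNKNOWN)
   {
      /* Both values are microseconds on the same clock */
      fprintf(stderr, "Frame time %lld\tLatency in us: %lld\n",
              (long long)(buffer->pts - last_pts),
              (long long)((int64_t)stc - buffer->pts));
      last_pts = buffer->pts;
   }

   /* ...normal payload handling and buffer recycling goes here... */
   mmal_buffer_header_release(buffer);
}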
Using "raspividyuv -w 1280 -h 960 -o /dev/null -n" I get a latency of 34-35ms. Switching to 1080P I get 50-52ms.
Using "raspivid -w 1280 -h 960 -o /dev/null -n" I get a latency of 46-48ms. Switching to 1080P I get 80-81ms.
(I'm amused. I wrote the above post before I'd run the test - it's nice when the numbers fall out almost exactly as predicted. The transfer to the host is obviously taking longer than I expected for the 1080P frame).
Last edited by 6by9 on Tue Jul 05, 2016 3:38 pm, edited 1 time in total.
Software Engineer at Raspberry Pi Trading. Views expressed are still personal views.
I'm not interested in doing contracts for bespoke functionality - please don't ask.

Akronoz
Posts: 2
Joined: Mon Jul 04, 2016 7:34 am

Re: Minimal latency for pixel data?

Tue Jul 05, 2016 3:11 pm

Hey, thank you for your fast answers!

I was able to get the difference between buffer->pts and MMAL_PARAMETER_SYSTEM_TIME, and got similar results to yours:

Code:

./raspividyuv -o /dev/null -w 1280 -h 960 -n
Frame time 33318        Latency in us: 38753
Frame time 33313        Latency in us: 40249
Frame time 33317        Latency in us: 34841
Frame time 33315        Latency in us: 34084
Frame time 33316        Latency in us: 35081
Frame time 33315        Latency in us: 35115

./raspividyuv -o /dev/null -w 1280 -h 960 -fps 42 -n
Frame time 23796        Latency in us: 34844
Frame time 23797        Latency in us: 39410
Frame time 23797        Latency in us: 37617
Frame time 23797        Latency in us: 38492
Frame time 23797        Latency in us: 37158
Frame time 23796        Latency in us: 38701
First, I observed that the latency varies by around 5ms - is that caused by the ISP or by Linux scheduling?
Regarding my first question about the CPU/GPU side: when camera_buffer_callback is called and I have access to *buffer, has the data already been copied to CPU memory? Is it possible to access it via the GPU/OpenGL before that copy, to get better latency?

And is there a way to access the pipeline earlier - for example, getting the Bayer data directly before it is converted to pixels by the ISP, or applying my algorithm line by line on the fly as each line is sent to the ISP, before the frame is completed? Or is that not possible because it is all done automatically by hardware?

Thanks!

6by9
Raspberry Pi Engineer & Forum Moderator
Posts: 7008
Joined: Wed Dec 04, 2013 11:27 am
Location: ZZ9 Plural Z Alpha, aka just outside Cambridge.

Re: Minimal latency for pixel data?

Tue Jul 05, 2016 3:37 pm

Akronoz wrote:First, I observed that the latency varies by around 5ms - is that caused by the ISP or by Linux scheduling?
Probably a combination of the two. There is contention on the VPU within the GPU for some image processing and the format conversion, and there is also some jitter from the Linux side. The sensor itself is standalone and will just keep on clocking out the frames at the rate determined by the configuration - that's about the only given.
Akronoz wrote:Regarding my first question about the CPU/GPU side: when camera_buffer_callback is called and I have access to *buffer, has the data already been copied to CPU memory? Is it possible to access it via the GPU/OpenGL before that copy, to get better latency?
You can improve timings slightly by adding "mmal_port_parameter_set_boolean(camera_video_port, MMAL_PARAMETER_ZERO_COPY, MMAL_TRUE)" BEFORE the port is enabled and the buffer pool is allocated. That maps the GPU's buffers into ARM memory and so saves a memcpy of the frame.
PLEASE do a "sudo rpi-update" before trying that one, or sync and build the latest userland tree. There was a deadlock situation discovered in the kernel when using zero copy and that was worked around with https://github.com/raspberrypi/userland ... 2838fc250c
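The ordering is the important bit; a minimal sketch of it (the port, callback and pool names follow RaspiVidYUV.c, so treat them as illustrative rather than a complete setup):

Code:

/* Sketch only (uses the same MMAL headers as the callback sketch above, plus
 * interface/mmal/util/mmal_util.h for mmal_port_pool_create).
 * Request zero copy BEFORE enabling the port and creating the pool, so the
 * GPU's buffers are mapped into ARM memory instead of being memcpy'd. */
static MMAL_POOL_T *enable_video_port_zero_copy(MMAL_PORT_T *camera_video_port,
                                                MMAL_PORT_BH_CB_T callback)
{
   if (mmal_port_parameter_set_boolean(camera_video_port, MMAL_PARAMETER_ZERO_COPY,
                                       MMAL_TRUE) != MMAL_SUCCESS)
      fprintf(stderr, "Failed to enable zero copy\n");

   /* Only now enable the port and allocate the buffer pool */
   mmal_port_enable(camera_video_port, callback);

   return mmal_port_pool_create(camera_video_port,
                                camera_video_port->buffer_num,
                                camera_video_port->buffer_size);
}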
OpenGL isn't going to help - all the framework stuff for that is frame based.
Akronoz wrote:And there is a way to acces the pipeline earlier, for example, getting directly the bayer data before it is converted to pixels by ISP, or when each line was send to the ISP to apply my algorithm line by line on the fly before the frame its completed? or its not possible because all its done by hardware automatically.
Whilst the firmware could, and I think there was even a software stage developed to remote algorithms to the ARM, that isn't an option available on Pi.
All the work being done to expose the raw Bayer data via V4L2 won't help much either as it will all be frame based, so you won't save much time. You also then lose the ISP and all the AE/AWB control loops.

If you felt like a world of pain, you could switch to using OpenMAX IL instead of MMAL, which lets you specify a value of nSliceHeight on each port. This means the output buffers can be delivered as stripes instead of whole frames. If you specify it as e.g. nFrameHeight/8 (although it has to be a multiple of 16) and nBufferCountActual = 8, then you'd get 8 buffers delivered per frame, each coming out as soon as it had been produced.
IL is supported on the Pi and a few people have used it quite happily; however, it is not a nice API and has lots of nasty gotchas, so we'd recommend you stick with MMAL unless it really gives you something extra. Please do be aware though that there is no zero copy option with IL - the copy from GPU to ARM memory is mandatory.
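For illustration only, the port setup would look something along these lines (hypothetical function and handle names; assumes an IL camera component has already been created, e.g. via the ilclient helpers from hello_pi):

Code:

#include <string.h>
#include <IL/OMX_Core.h>
#include <IL/OMX_Component.h>

/* Sketch: ask the camera's video output (port 71 on the Pi's OMX camera
 * component) for 8 stripes per frame instead of whole frames. */
static OMX_ERRORTYPE request_sliced_output(OMX_HANDLETYPE omx_camera)
{
   OMX_PARAM_PORTDEFINITIONTYPE portdef;
   OMX_ERRORTYPE err;

   memset(&portdef, 0, sizeof(portdef));
   portdef.nSize = sizeof(portdef);
   portdef.nVersion.nVersion = OMX_VERSION;
   portdef.nPortIndex = 71;

   err = OMX_GetParameter(omx_camera, OMX_IndexParamPortDefinition, &portdef);
   if (err != OMX_ErrorNone)
      return err;

   /* 8 stripes per frame - nSliceHeight must stay a multiple of 16 */
   portdef.format.video.nSliceHeight = portdef.format.video.nFrameHeight / 8;
   portdef.nBufferCountActual = 8;   /* one buffer per stripe */

   return OMX_SetParameter(omx_camera, OMX_IndexParamPortDefinition, &portdef);
}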
Software Engineer at Raspberry Pi Trading. Views expressed are still personal views.
I'm not interested in doing contracts for bespoke functionality - please don't ask.

nkumar3119
Posts: 2
Joined: Tue Aug 16, 2016 6:12 pm

Re: Minimal latency for pixel data?

Mon Aug 22, 2016 3:17 pm

Hi, I am glad I came across this post. I am also trying to get the latency of data from sensor to user app. I am using RaspiVidYUV.c as a starting point and placed mmal_port_parameter_get_uint64(port, MMAL_PARAMETER_SYSTEM_TIME, &uint64_value) in camera_buffer_callback(..) to measure latency. For the frame time, I used buffer->pts. Here is what I get:

Code:

./raspividyuv -w 1280 -h 720 -o /dev/null -n
frametime=33328, latency = 31424
frametime=33327, latency = 34314
frametime=33327, latency = 30769
frametime=33328, latency = 32985
frametime=33327, latency = 33237
How can the latencies of less than 33ms be explained? Am I missing something?
Thank you.
