terraspace
Posts: 76
Joined: Mon Dec 03, 2018 3:56 pm

H264 Decoder Latency

Thu Mar 14, 2019 9:24 am

Hi all,

So some time back I did a lot of research on this and implemented an H264 decoder using IL.
The basic flow is as follows:

PC:
1. generate frame
2. encode frame to H264
3. send frame in custom format over UDP with additional signalling and checksums to PI.

I spent a lot of time refining and optimising this whole process, and it now takes about 8ms in total, including network transmission, packet management and H264 h/w accelerated encoding on the PC.

With this in place and the first trivial implementation of the decoder on the Pi, I was already able to get 60FPS, but had terrible latency.
The display was running roughly 200-300ms behind the encode.

I then went through a series of changes to try to solve the latency issue:
1. Remove clock components from IL, decode -> render.
2. Ensure the encoder side is generating a low-latency stream using max_dec_buffer=1, no out-of-order frames, no B/P frames etc.

At this point the latency was about 100ms.
Further to this:

3. Add additional configuration settings to the decoder:

Code:

// Display region on the renderer.
if (OMX_SetConfig(ILC_GET_HANDLE(video_render), OMX_IndexConfigDisplayRegion, &configDisplay) != OMX_ErrorNone)
	status = -15;

// Extra buffers for the decoder.
auto result = OMX_SetParameter(ILC_GET_HANDLE(this->video_decode), OMX_IndexParamBrcmExtraBuffers, &maxCB);

// Lazy image pool destroy.
result = OMX_SetConfig(ILC_GET_HANDLE(this->video_decode), OMX_IndexParamBrcmLazyImagePoolDestroy, &lazyB);

// Set it as OMX_DataUnitCodedPicture and OMX_DataEncapsulationRtpPayload (tried all the combos)
if (OMX_SetConfig(ILC_GET_HANDLE(this->video_decode), OMX_IndexParamBrcmDataUnit, &c) != OMX_ErrorNone)
	std::cout << "Decoder failed to configure low latency mode #1." << std::endl;

Also ensure the flags are set to mark each end-of-frame with:

this->buf->nFlags = OMX_BUFFERFLAG_ENDOFFRAME | OMX_BUFFERFLAG_ENDOFNAL;

as we know that complete frames are delivered when the decoder runs.

After all this, latency was between 80-100ms.

I then found a strange issue, which I've been able to reproduce consistently across different fully updated RPi 3B+ boards.
If the gpu_mem split is set to 256MB, I get the above results and sometimes a lot of frame corruption, even though the frame data itself is totally valid (I verify that all the packet checksums are OK and that there are no dropped packets; the stream is using about 6Mb/s).
If, however, I change the gpu_mem split to 128MB, the performance is notably better and I can now achieve latency between 40-80ms (it seems to fluctuate within this range, which is about 3-5 frames).

In other work I have noted that the exact opposite is true of the gpu_mem split when it comes to using:

Code:

vc_dispmanx_resource_write_data(this->resource, this->image.type, this->image.pitch, this->image.buffer, &(this->bmpRect));

That particular call is twice as fast when the gpu_mem split is 256MB.

Given that the frame data is fully received and complete, I would expect to be able to achieve a latency of no more than 1 frame (i.e. immediate presentation), or possibly 2 frames (30ms latency max).
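
For reference, the conversion between milliseconds and frames used in this post is simple arithmetic. A minimal sketch, assuming a constant 60 FPS stream; `frame_period_ms` and `latency_in_frames` are made-up helper names for illustration, not part of any API:

```cpp
#include <cassert>

// Duration of one frame in milliseconds at the given frame rate.
inline double frame_period_ms(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// Number of frame periods that a given latency spans at the given frame rate.
inline double latency_in_frames(double latency_ms, double fps) {
    return latency_ms / frame_period_ms(fps);
}
```

Under that assumption, 40-80ms corresponds to roughly 2.4-4.8 frames, and a two-frame budget is about 33ms.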

What am I missing? Are there any other tricks that can be applied to the IL client to get it to present immediately and reduce this latency further?
I found this in some online docs, but it doesn't seem to be present in the headers:
OMX_IndexConfigBrcmVideoH264LowLatency
I'm not sure whether it applies to decode or only to encode, or whether it just doesn't exist.

Any help would be appreciated!
Thanks
John

6by9
Raspberry Pi Engineer & Forum Moderator
Posts: 7124
Joined: Wed Dec 04, 2013 11:27 am
Location: ZZ9 Plural Z Alpha, aka just outside Cambridge.

Re: H264 Decoder Latency

Thu Mar 14, 2019 11:40 am

With regard to gpu_mem settings I couldn't say for certain, but I suspect you're triggering compaction of the heap.

gpu_mem is configuring what is known internally as the relocatable heap. All objects allocated in there are referenced by a handle that has to be locked in order to get an address. Buffers are unlocked as soon as they are no longer being directly accessed.
Should an allocation failure occur, then all unlocked blocks are shuffled around to try and make space for the allocation. This obviously takes time, but shouldn't happen that frequently (buffer allocation only, so when the component/port is enabled).
The codec block has a very annoying restriction: the hardware designers saved 2 address bits, so it requires all reference frames to come from the same 256MB block. The relevant allocations should be marked as having to come from the same 256MB, and that constraint should be respected (it's a bug if it isn't). If it isn't, then you'll get a load of asserts logged if you add "start_debug=1" to /boot/config.txt and run "sudo vcdbg log assert" during or after running the decode.

I have no justification as to why dispmanx behaves differently with 128 vs 256MB.

- P-frames should make zero difference to the decode latency but will increase the encoding efficiency vastly. Avoid B-frames if you want low latency.
- OMX_IndexConfigDisplayRegion is only changing where the image is displayed.
- OMX_IndexParamBrcmExtraBuffers adds extra buffers to the pool. That shouldn't make any difference other than increasing memory usage.
- OMX_IndexParamBrcmLazyImagePoolDestroy will have no effect.
- You don't provide the value that you set OMX_IndexParamBrcmDataUnit to in code, only comment. OMX_DataUnitCodedPicture and OMX_DataUnitArbitraryStreamSection are the only supported modes, as documented in http://www.jvcref.com/files/PI/document ... ecode.html. OMX_DataUnitCodedPicture requires that OMX_BUFFERFLAG_ENDOFFRAME is provided on the relevant frame, and it will actually be automatically selected as the input mode should OMX_BUFFERFLAG_ENDOFFRAME be seen.

OMX_IndexConfigBrcmVideoH264LowLatency is encode only - see http://www.jvcref.com/files/PI/document ... ncode.html. I don't believe it has ever been used on the Pi, and I don't believe the associated infrastructure is in place to support it.
Software Engineer at Raspberry Pi Trading. Views expressed are still personal views.
I'm not interested in doing contracts for bespoke functionality - please don't ask.

terraspace

Re: H264 Decoder Latency

Thu Mar 14, 2019 12:25 pm

Hi,

Thanks for the info.

I'm currently using:

Code:

OMX_PARAM_DATAUNITTYPE c;
c.nSize = sizeof(OMX_PARAM_DATAUNITTYPE);
c.nVersion.nVersion = OMX_VERSION;
c.nPortIndex = 130;
c.eUnitType = OMX_DataUnitCodedPicture;
c.eEncapsulationType = OMX_DataEncapsulationRtpPayload;
if (OMX_SetConfig(ILC_GET_HANDLE(this->video_decode), OMX_IndexParamBrcmDataUnit, &c) != OMX_ErrorNone)
	std::cout << "Decoder failed to configure low latency mode #1." << std::endl;

I've tried all the various encapsulation options to see if it would make any difference.
I'm going to test for the asserts shortly and will let you know.

Can confirm that the lowlatency option doesn't exist anywhere, so clearly never implemented on the PI and as you say, encode only.
Hopefully the asserts will reveal something else.

terraspace

Re: H264 Decoder Latency

Thu Mar 14, 2019 12:35 pm

All the vcdbg commands work, except log assert.

I get:

Code:

001765.894: assert( source ) failed; ../../../../../middleware/confzilla/cp_front_fdt.c::cp_front_fdt_load_builtin line 115 rev ed5baf9
vcdbg_ctx_get_dump_stack: dump_stack failed

I tried during decoding and after.

6by9
Raspberry Pi Engineer & Forum Moderator

Re: H264 Decoder Latency

Thu Mar 14, 2019 12:59 pm

terraspace wrote:
Thu Mar 14, 2019 12:35 pm
All the vcdbg commands work.. except log assert

I get

Code:

001765.894: assert( source ) failed; ../../../../../middleware/confzilla/cp_front_fdt.c::cp_front_fdt_load_builtin line 115 rev ed5baf9
vcdbg_ctx_get_dump_stack: dump_stack failed
tried during decoding and after.

OK, it's not the codec paging funny then. That one harmless assert is always there on boot, which also acts as a useful telltale that the assert logging is working.

terraspace

Re: H264 Decoder Latency

Thu Mar 14, 2019 1:53 pm

Ahh ok.. it looked suspicious :)

Something funky definitely happens to dispmanx and IL when switching gpu_mem between 128 and 256, and the effect on one seems to be the exact opposite of the effect on the other.

Anyway, I'm still trying to find any way to reduce this latency. It's not bad, but I'm sure it must be possible to achieve around 16-32ms.
I don't believe any claim I've read of less than that end-to-end, as I don't see how you could ever be less than 1 frame behind, and depending on how the arrival and decode of a frame line up with vsync or other timing, it could quite easily miss and result in 2 frames of lag.
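
To make the vsync point concrete, here is a rough sketch of how long a decoded frame waits for the next vsync. It assumes, purely for illustration, that vsyncs fall on exact multiples of the refresh period starting at t = 0; `wait_for_vsync_us` is a hypothetical helper, not an API call:

```cpp
#include <cassert>
#include <cstdint>

// Time in microseconds that a frame which becomes ready at `ready_us` waits
// for the first vsync at or after that moment. Vsyncs are assumed to occur at
// exact multiples of `period_us`, starting at t = 0 (a simplification).
inline std::int64_t wait_for_vsync_us(std::int64_t ready_us, std::int64_t period_us) {
    assert(ready_us >= 0 && period_us > 0);
    const std::int64_t into_period = ready_us % period_us;
    return into_period == 0 ? 0 : period_us - into_period;
}
```

With a 16667us period, a frame that is ready just after a vsync waits almost a full period, which is how a small timing difference can turn into an extra frame of lag.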

The only other thing I can think of is that I'm receiving frames too fast.
So perhaps instead of signalling to receive the next frame once the buffer has been decoded, i.e. after the whole

Code:

			if (OMX_EmptyThisBuffer(ILC_GET_HANDLE(this->video_decode), this->buf) != OMX_ErrorNone)
			{
				this->status = -6;
				break;
			}
			if (dataSize <= 0)
				break;
		}
		this->frameCount++;

part, we should only request one on the following vertical blank.

Another thing I've noticed is that for the first several seconds when the decode->render starts there is a lot of lag/skipped frames before it settles down.

6by9
Raspberry Pi Engineer & Forum Moderator

Re: H264 Decoder Latency

Thu Mar 14, 2019 2:44 pm

When passed the first buffer, there is a load of setup required in allocating appropriately sized buffers etc. It can then decode slightly faster than realtime (depending on the resolution and frame rate fed in), so it catches up as the data is presented.

Your incoming data is not synchronised to your display, therefore trying to tie the 2 together is going to be a recipe for disaster.

video_decode will take a frame and should decode it at roughly 244800 macroblocks/sec (1080P30). I know for encode that the process is pipelined as CABAC is independent of motion compensation/reconstruction etc, so I wouldn't be surprised if the same was true for decode. Encode is ~40ms latency for a 1080P frame.
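
As a rough illustration of where the 244800 figure comes from, the macroblock arithmetic can be sketched as below. This assumes 16x16 macroblocks and a constant sustained throughput, and it ignores pipelining and memory bandwidth; the function names are hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// H.264 macroblocks are 16x16 pixels; partial blocks at the frame edges
// count as whole macroblocks.
inline std::uint32_t macroblocks_per_frame(std::uint32_t width, std::uint32_t height) {
    return ((width + 15) / 16) * ((height + 15) / 16);
}

// Estimated time in milliseconds to decode one frame at a sustained
// macroblock rate. Assumes constant throughput (a simplification).
inline double decode_ms_per_frame(std::uint32_t width, std::uint32_t height,
                                  double macroblocks_per_sec) {
    assert(macroblocks_per_sec > 0.0);
    return 1000.0 * macroblocks_per_frame(width, height) / macroblocks_per_sec;
}
```

Under these assumptions a 1080P frame is 8160 macroblocks, so 244800 macroblocks/sec corresponds to 30 such frames per second, or about 33ms per frame.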

Once decoded the frame is presented to video_render, which will update the display list at the next vsync.

terraspace

Re: H264 Decoder Latency

Thu Mar 14, 2019 3:10 pm

Yep, presently it just receives, decodes and presents as fast as possible which is why I'm wondering if maybe that is too fast.
The decode is definitely running at +-60fps (for 1080p stream), so I guess the decode must run faster than encoding then otherwise I'd be getting incrementally worse lag or dropped frames which isn't happening.

I think it will be easiest to manage the rate from the client side, as it knows the capture rate. It could send a frame ONLY if >= 16ms have passed since the last frame was sent and its ACK has been received (the ACK round trip from the initial frame send brings us to about 11-13ms). That would prevent the Pi from receiving frames faster than it can push them out, and in doing so the vsync issue goes away, as there will still only be 1 frame available at any given point during the vertical display/vsync.
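
The send rule described here could be sketched roughly as follows. This is only an illustration of the condition, not the actual sender code; `Clock` and `may_send_frame` are hypothetical names:

```cpp
#include <chrono>

using Clock = std::chrono::steady_clock;

// Returns true when the sender may transmit the next frame: the previous
// frame has been acknowledged and at least `min_interval` has elapsed since
// the previous send.
inline bool may_send_frame(Clock::time_point now, Clock::time_point last_send,
                           bool ack_received, std::chrono::microseconds min_interval) {
    return ack_received && (now - last_send) >= min_interval;
}
```

With a 16ms minimum interval, an ACK that arrives after 12ms does not release the next frame on its own; the sender still has to wait out the remaining 4ms.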

I thought the initial lag/frame skipping might be due to setup.. so that is fine it doesn't bother me.

6by9
Raspberry Pi Engineer & Forum Moderator

Re: H264 Decoder Latency

Thu Mar 14, 2019 3:19 pm

If the stream says it is level 4.2 or higher (> 1080P30) then the codec block gets automatically overclocked to 300MHz (standard is 250MHz).
We generally can keep up with 1080P60 clips if the bitrate isn't too extreme, and there isn't too much other system activity (it's generally memory bandwidth bound). My comment was more that the latency is not necessarily less than a frame time as H264 can be pipelined.

Encode can achieve 1080P60 with an overclock and a favourable wind.

You did mention earlier that you'd dropped all P and B frames, but if that is no longer the case then please remember to drop frames at the input to your encoder. You cannot drop arbitrary frames in an H264 stream, as they may be used as reference frames for subsequent frames.

terraspace

Re: H264 Decoder Latency

Thu Mar 14, 2019 3:39 pm

Yep, I don't remove them manually; the encoder removes the B-frames automatically under its low-latency settings.

I realised I've already built in the frame-rate limiting on the client side.. doh.. going to double check that and make sure it absolutely isn't sending more than 60fps.

terraspace

Re: H264 Decoder Latency

Thu Mar 14, 2019 5:08 pm

OK, I double checked the sending throttle and it is perfect as far as I can tell. It uses a semaphore to wait for the ACK; if that returns in under 16.66ms, it nanosleeps for the remainder, so it's definitely not sending more than 60fps. It also throttles the startup (second 1 at 10fps, second 2 at 20fps and so on) to give the encoder and decoder time to settle. If it can't get 60fps (>16.66ms) but the client is achieving higher, it reduces the bitrate; if it's well under 16ms, it increases the bitrate, so the quality should adapt to the network.
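
In rough terms, the ramp-up and pacing work like the sketch below. It is a simplified illustration rather than the real sender; `ramped_fps` and `pacing_sleep` are hypothetical names, and the real code blocks on a semaphore and calls nanosleep instead of returning a duration:

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

// Frame-rate cap during start-up: 10 FPS in the first second, 20 FPS in the
// second, and so on, up to `max_fps`.
inline int ramped_fps(int seconds_since_start, int max_fps = 60) {
    assert(seconds_since_start >= 0 && max_fps > 0);
    // Clamp the elapsed seconds first so the multiplication cannot overflow.
    const int capped = std::min(seconds_since_start, max_fps);
    return std::min(max_fps, 10 * (capped + 1));
}

// Time left to sleep after the ACK arrived so that frames stay one frame
// period apart; zero if the ACK already took longer than one period.
inline std::chrono::microseconds pacing_sleep(std::chrono::microseconds ack_elapsed, int fps) {
    assert(fps > 0);
    const std::chrono::microseconds period(1000000 / fps);
    return ack_elapsed >= period ? std::chrono::microseconds(0) : period - ack_elapsed;
}
```

For example, an ACK that comes back after 12ms at 60fps leaves a bit under 5ms to sleep before the next send.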

Still +-40ms latency ... argh...

ktb92677
Posts: 28
Joined: Fri Sep 20, 2013 10:29 pm

Re: H264 Decoder Latency

Mon Apr 01, 2019 9:49 pm

@terraspace
I am experiencing similar issues to what you are. Is there any chance you could post your ilclient code so I can take a look at it?

terraspace

Re: H264 Decoder Latency

Fri May 17, 2019 6:00 pm

I can, but it's part of a much larger code-base and the il portion is quite small.. it depends on the custom UDP packet streamer and a ton of other stuff.. and I've wondered if the issue is purely inside the IL portion or perhaps to do with other aspects of the streaming.

Code:


#pragma GCC diagnostic ignored "-Wwrite-strings"
#pragma GCC diagnostic ignored "-Wpmf-conversions"

#include "Decoder.h"
#include "math.h"
#include "Checksum.h"
#include "Config.h"

#define min(x, y) ((x)<(y)?(x):(y))

namespace PTCore {

	/* Initialize the Decoder for Stream Playback */
	void Decoder::Init(ILCLIENT_T* ilclient, bool toTestFile)
	{
		if (toTestFile)
		{
			this->out = fopen("encoder2.dat", "wb");
			if (this->out == NULL)
				return;
		}

		int ret = 0;
		memset(this->list, 0, sizeof(this->list));
		memset(this->tunnel, 0, sizeof(this->tunnel));
		this->client = ilclient;
		
		// create video decoder
		ret = ilclient_create_component(this->client, &this->video_decode, (char*)"video_decode", (ILCLIENT_CREATE_FLAGS_T)(ILCLIENT_DISABLE_ALL_PORTS | ILCLIENT_ENABLE_INPUT_BUFFERS));
		if (ret != 0)
		{
			std::cout << "Failed to create video decode component";
			return;
		}
		this->list[0] = this->video_decode;

		// create video renderer
		ret = ilclient_create_component(this->client, &this->video_render, (char*)"video_render", ILCLIENT_DISABLE_ALL_PORTS);
		if (ret != 0)
		{
			std::cout << "Failed to create video render component";
			return;
		}
		this->list[1] = this->video_render;
		
		OMX_DISPLAYRECTTYPE dest_rect;
		dest_rect.x_offset = 0;
		dest_rect.y_offset = 0;
		dest_rect.width = (OMX_S16)devConfig.width;
		dest_rect.height = (OMX_S16)devConfig.height;

		OMX_CONFIG_DISPLAYREGIONTYPE configDisplay;
		memset(&configDisplay, 0, sizeof configDisplay);
		configDisplay.nSize = sizeof configDisplay;
		configDisplay.nVersion.nVersion = OMX_VERSION;
		configDisplay.nPortIndex = 90;

		configDisplay.set = (OMX_DISPLAYSETTYPE)(OMX_DISPLAY_SET_TRANSFORM | OMX_DISPLAY_SET_LAYER | OMX_DISPLAY_SET_NUM);
		configDisplay.num = 0;
		configDisplay.layer = 2;
		configDisplay.transform = (OMX_DISPLAYTRANSFORMTYPE)0;

		if (dest_rect.x_offset || dest_rect.y_offset || dest_rect.width || dest_rect.height)
		{
			configDisplay.set = (OMX_DISPLAYSETTYPE)(configDisplay.set | OMX_DISPLAY_SET_DEST_RECT | OMX_DISPLAY_SET_FULLSCREEN | OMX_DISPLAY_SET_NOASPECT);
			configDisplay.dest_rect = dest_rect;
		}

		if (OMX_SetConfig(ILC_GET_HANDLE(video_render), OMX_IndexConfigDisplayRegion, &configDisplay) != OMX_ErrorNone)
			status = -15;

		set_tunnel(this->tunnel, this->video_decode, 131, this->video_render, 90);
		if (this->status == 0)
			ilclient_change_component_state(this->video_decode, OMX_StateIdle);

		memset(&this->format, 0, sizeof(OMX_VIDEO_PARAM_PORTFORMATTYPE));
		this->format.nSize = sizeof(OMX_VIDEO_PARAM_PORTFORMATTYPE);
		this->format.nVersion.nVersion = OMX_VERSION;
		this->format.nPortIndex = 130;
		this->format.eCompressionFormat = OMX_VIDEO_CodingAVC;

		OMX_PARAM_DATAUNITTYPE c;
		c.eUnitType = OMX_DataUnitCodedPicture;
		c.eEncapsulationType = OMX_DataEncapsulationRtpPayload;
		c.nPortIndex = 130;
		c.nVersion.nVersion = OMX_VERSION;
		c.nSize = sizeof(OMX_PARAM_DATAUNITTYPE);
		if (OMX_SetConfig(ILC_GET_HANDLE(this->video_decode), OMX_IndexParamBrcmDataUnit, &c) != OMX_ErrorNone)
		{
			std::cout << "Decoder failed to configure low latency mode #1." << std::endl;
		}

		if (this->status == 0 &&
			OMX_SetParameter(ILC_GET_HANDLE(this->video_decode), OMX_IndexParamVideoPortFormat, &this->format) == OMX_ErrorNone &&
			ilclient_enable_port_buffers(this->video_decode, 130, NULL, NULL, NULL) == 0)
		{
			this->first_packet = 1;
			this->port_settings_changed = 0;
			this->init = true;
		}

		ilclient_change_component_state(this->video_decode, OMX_StateExecuting);

		this->frameCount = 0;

		// Alloc a raw buffer for packet assembly per frame.
		this->dataPtr = new (std::nothrow) uint8_t[1024 * 512];
		this->truePtr = this->dataPtr; // store the true address of the buffer.
		
		this->dataPtr = (uint8_t*)MEM_ALIGN_UP((uint32_t)this->dataPtr, 16);

		this->ClearBuffer();

		ilclient_set_empty_buffer_done_callback(this->client, (ILCLIENT_BUFFER_CALLBACK_T)&Decoder::EmptyHandler, (void*)this);
		std::cout << "H264 Decoder Initialized." << std::endl;
	}

	void Decoder::EmptyHandler()
	{
		//std::cout << "buffer emptied " << this->frameCount << std::endl;
	}

	/* Initialize the Decoder for File Playback */
	void Decoder::InitForFile(ILCLIENT_T* ilclient)
	{
		int ret = 0;
		memset(this->list, 0, sizeof(this->list));
		memset(this->tunnel, 0, sizeof(this->tunnel));
		this->client = ilclient;

		// create video decoder
		ret = ilclient_create_component(this->client, &this->video_decode, (char*)"video_decode", (ILCLIENT_CREATE_FLAGS_T)(ILCLIENT_DISABLE_ALL_PORTS | ILCLIENT_ENABLE_INPUT_BUFFERS));
		if (ret != 0)
		{
			std::cout << "Failed to create video decode component";
			return;
		}
		this->list[0] = this->video_decode;

		// create video renderer
		ret = ilclient_create_component(this->client, &this->video_render, (char*)"video_render", ILCLIENT_DISABLE_ALL_PORTS);
		if (ret != 0)
		{
			std::cout << "Failed to create video render component";
			return;
		}
		this->list[1] = this->video_render;

		// create clock
		ret = ilclient_create_component(this->client, &this->clock, "clock", ILCLIENT_DISABLE_ALL_PORTS);
		if (ret != 0)
		{
			std::cout << "Failed to create video clock component";
			return;
		}
		this->list[2] = clock;

		memset(&this->cstate, 0, sizeof(this->cstate));
		this->cstate.nSize = sizeof(this->cstate);
		this->cstate.nVersion.nVersion = OMX_VERSION;
		this->cstate.eState = OMX_TIME_ClockStateWaitingForStartTime;
		this->cstate.nWaitMask = 1;
		if (clock != NULL && OMX_SetParameter(ILC_GET_HANDLE(clock), OMX_IndexConfigTimeClockState, &cstate) != OMX_ErrorNone)
			status = -13;

		// create video_scheduler
		ret = ilclient_create_component(this->client, &this->video_scheduler, "video_scheduler", ILCLIENT_DISABLE_ALL_PORTS);
		if (ret != 0)
		{
			std::cout << "Failed to create video scheduler component";
			return;
		}
		this->list[3] = video_scheduler;

		set_tunnel(this->tunnel, video_decode, 131, video_scheduler, 10);
		set_tunnel(this->tunnel + 1, video_scheduler, 11, video_render, 90);
		set_tunnel(this->tunnel + 2, clock, 80, video_scheduler, 12);

		// setup clock tunnel first
		if (status == 0 && ilclient_setup_tunnel(this->tunnel + 2, 0, 0) != 0)
			status = -15;
		else
			ilclient_change_component_state(this->clock, OMX_StateExecuting);

		if (status == 0)
			ilclient_change_component_state(this->video_decode, OMX_StateIdle);

		memset(&this->format, 0, sizeof(OMX_VIDEO_PARAM_PORTFORMATTYPE));
		this->format.nSize = sizeof(OMX_VIDEO_PARAM_PORTFORMATTYPE);
		this->format.nVersion.nVersion = OMX_VERSION;
		this->format.nPortIndex = 130;
		this->format.eCompressionFormat = OMX_VIDEO_CodingAVC;

		if (this->status == 0 &&
			OMX_SetParameter(ILC_GET_HANDLE(this->video_decode), OMX_IndexParamVideoPortFormat, &this->format) == OMX_ErrorNone &&
			ilclient_enable_port_buffers(this->video_decode, 130, NULL, NULL, NULL) == 0)
		{
			this->first_packet = 1;
			this->port_settings_changed = 0;
			this->init = true;
		}

		ilclient_change_component_state(this->video_decode, OMX_StateExecuting);

		this->frameCount = 0;

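		// Alloc a raw buffer for packet assembly per frame.
		// The size must match ClearBuffer(), which clears 1024 * 512 bytes starting at truePtr.
		this->dataPtr = new (std::nothrow) uint8_t[1024 * 512];
		this->truePtr = this->dataPtr; // ClearBuffer() and Deinit() both use truePtr.
		this->ClearBuffer();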

		std::cout << "H264 Decoder (File-Mode) Initialized." << std::endl;
	}

	void Decoder::ClearBuffer() {
		uint8_t* p = this->truePtr;
		memset(p, 0, 1024 * 512);
	}

	void Decoder::FrameReset() {
		packet_count = 0;
		first_packet = 1;
	}

	bool Decoder::ProcessPacket(RXP::Packet* pPacket) {
		
		int frameDataSize = pPacket->header.data_size;								// Total Size of the Frame Data.
		this->frameSize = frameDataSize;											// Keep a copy, in case we need to flush an incomplete frame later.
		int perPacketSize = PACKET_SIZE - (1 + sizeof(RXP::RXP_Header));			// How much data can fit in a single packet.
		int numPackets = (int)ceil((float)frameDataSize / (float)perPacketSize);    // Total No. of packets of data.

		uint8_t* p = this->dataPtr;
		p += (perPacketSize * pPacket->header.sequence_no);
		memcpy(p, pPacket->payload + 1, pPacket->header.payload_size-1);
		this->packet_count++;

		if (this->packet_count >= numPackets)
		{
			this->Decode(this->dataPtr, frameDataSize);
			this->packet_count = 0;
			return true;
		}
		return false;
	}

	int Decoder::GetFrameChecksum() {
		return(mod255(this->dataPtr, this->frameSize));
	}

	void Decoder::SetDisplay(int x, int y, int width, int height)
	{
		OMX_CONFIG_DISPLAYREGIONTYPE display;

		memset(&display, 0, sizeof(OMX_CONFIG_DISPLAYREGIONTYPE));

		display.nVersion.nVersion = OMX_VERSION;
		display.nSize = sizeof(OMX_CONFIG_DISPLAYREGIONTYPE);
		display.nPortIndex = 90;
		display.src_rect.x_offset = 0;
		display.src_rect.y_offset = 0;
		display.src_rect.width = 0;
		display.src_rect.height = 0;
		display.dest_rect.x_offset = (OMX_S16)x;
		display.dest_rect.y_offset = (OMX_S16)y;
		display.dest_rect.width = (OMX_S16)width;
		display.dest_rect.height = (OMX_S16)height;
		display.noaspect = OMX_TRUE;
		display.fullscreen = OMX_FALSE;
		display.set = (OMX_DISPLAYSETTYPE)(OMX_DISPLAY_SET_DEST_RECT | OMX_DISPLAY_SET_SRC_RECT | OMX_DISPLAY_SET_FULLSCREEN | OMX_DISPLAY_SET_NOASPECT);

		if (OMX_SetConfig(ILC_GET_HANDLE(this->video_render), OMX_IndexConfigDisplayRegion, &display) != OMX_ErrorNone)
		{
			std::cout << "Decoder failed to configure render display." << std::endl;
		}
	}

	void Decoder::Decode(unsigned char* pEncodedData, int dataSize)
	{
		this->status = 0;
		if (!this->init)
			return;
		while ((this->buf = ilclient_get_input_buffer(this->video_decode, 130, 0)) != NULL)
		{
			data_len = min(min(this->buf->nAllocLen, 8 * 1024), (uint32_t)dataSize);
			memcpy(this->buf->pBuffer, pEncodedData, data_len);
			
			//if(this->out)
				//fwrite(pEncodedData, 1, data_len, this->out);
			
			pEncodedData += data_len;
			dataSize -= data_len;

			if (port_settings_changed == 0 &&
				((data_len > 0 && ilclient_remove_event(this->video_decode, OMX_EventPortSettingsChanged, 131, 0, 0, 1) == 0) ||
				(data_len == 0 && ilclient_wait_for_event(this->video_decode, OMX_EventPortSettingsChanged, 131, 0, 0, 1,
					ILCLIENT_EVENT_ERROR | ILCLIENT_PARAMETER_CHANGED, 10000) == 0)))
			{
				this->port_settings_changed = 1;
				if (ilclient_setup_tunnel(this->tunnel, 0, 0) != 0)
				{
					this->status = -7;
					break;
				}
				ilclient_change_component_state(this->video_render, OMX_StateExecuting);
			}

			this->buf->nFlags = OMX_BUFFERFLAG_TIME_UNKNOWN;
			if (dataSize <= 0)
				this->buf->nFlags = OMX_BUFFERFLAG_ENDOFFRAME | OMX_BUFFERFLAG_ENDOFNAL;
			this->buf->nFilledLen = data_len;
			this->buf->nOffset = 0;
			
			if (OMX_EmptyThisBuffer(ILC_GET_HANDLE(this->video_decode), this->buf) != OMX_ErrorNone)
			{
				this->status = -6;
				break;
			}
			if (dataSize <= 0)
				break;
		}
		this->frameCount++;
	}

	void Decoder::OldDecode(unsigned char* pEncodedData, int dataSize)
	{
		this->status = 0;
		// If the decoder wasn't successfully initialised, we can't decode.
		if (!this->init)
			return;

		// Amount of input frame data remaining.
		int bytes = dataSize;
		data_len = 0;

		// While we still have frame data remaining.
		while (bytes > 0)
		{

			// Get decoder input buffer.
			this->buf = ilclient_get_input_buffer(this->video_decode, 130, 1);
			if (this->buf == NULL)
			{
				std::cout << "Failed to get input buffer" << std::endl;
				break;
			}
			unsigned char* dest = this->buf->pBuffer;

			// All the data we have fits in the provided buffer.
			if ((uint32_t)bytes < this->buf->nAllocLen)
			{
				memcpy(dest, pEncodedData, bytes);
				// The test file is only open when Init() was called with toTestFile set.
				if (this->out)
					fwrite(pEncodedData, 1, bytes, this->out);
				this->buf->nFlags |= OMX_BUFFERFLAG_ENDOFFRAME;
				this->buf->nFilledLen = bytes;
				data_len += bytes;
				bytes = 0;
			}
			// Else copy as much as we can and come back again.
			else
			{
				memcpy(dest, pEncodedData, this->buf->nAllocLen);
				if (this->out)
					fwrite(pEncodedData, 1, this->buf->nAllocLen, this->out);
				this->buf->nFilledLen = this->buf->nAllocLen;
				bytes -= this->buf->nAllocLen;
				pEncodedData += this->buf->nAllocLen;
				data_len += this->buf->nAllocLen;
			}

			if (port_settings_changed == 0 &&
				((data_len > 0 && ilclient_remove_event(this->video_decode, OMX_EventPortSettingsChanged, 131, 0, 0, 1) == 0) ||
				(data_len == 0 && ilclient_wait_for_event(this->video_decode, OMX_EventPortSettingsChanged, 131, 0, 0, 1,
					ILCLIENT_EVENT_ERROR | ILCLIENT_PARAMETER_CHANGED, 10000) == 0)))
			{
				this->port_settings_changed = 1;
				if (ilclient_setup_tunnel(this->tunnel, 0, 0) != 0)
				{
					this->status = -7;
				}

				//ilclient_change_component_state(this->video_scheduler, OMX_StateExecuting);

				//if (ilclient_setup_tunnel(this->tunnel + 1, 0, 1000) != 0)
				//{
				//	this->status = -12;
				//}

				ilclient_change_component_state(this->video_render, OMX_StateExecuting);
			}

			/*if (this->first_packet)
			{
				this->buf->nFlags |= OMX_BUFFERFLAG_STARTTIME;
				this->first_packet = 0;
			}
			else
				this->buf->nFlags |= OMX_BUFFERFLAG_TIME_UNKNOWN;*/
			this->buf->nFlags = 0;

			if (OMX_EmptyThisBuffer(ILC_GET_HANDLE(this->video_decode), this->buf) != OMX_ErrorNone)
			{
				this->status = -6;
			}
		}

		this->frameCount++;

	}

	bool Decoder::DecodeFromFile(const char* pFileName)
	{
		if (!this->init)
			return false;

		if (this->in == NULL)
		{
			this->in = fopen(pFileName, "rb");
			if (this->in == NULL)
				return false;
		}

		ilclient_change_component_state(this->video_decode, OMX_StateExecuting);

		this->buf = ilclient_get_input_buffer(this->video_decode, 130, 1);
		if (this->buf != NULL)
		{

			unsigned char* dest = this->buf->pBuffer;
			int oldlen = data_len;
			data_len += fread(dest, 1, this->buf->nAllocLen - data_len, in);
			if (data_len == oldlen)
			{
				this->EndDecode();
				return false;
			}

			if (port_settings_changed == 0 &&
				((data_len > 0 && ilclient_remove_event(this->video_decode, OMX_EventPortSettingsChanged, 131, 0, 0, 1) == 0) ||
				(data_len == 0 && ilclient_wait_for_event(this->video_decode, OMX_EventPortSettingsChanged, 131, 0, 0, 1,
					ILCLIENT_EVENT_ERROR | ILCLIENT_PARAMETER_CHANGED, 10000) == 0)))
			{
				this->port_settings_changed = 1;
				if (ilclient_setup_tunnel(this->tunnel, 0, 0) != 0)
				{
					this->status = -7;
				}

				ilclient_change_component_state(this->video_scheduler, OMX_StateExecuting);

				if (ilclient_setup_tunnel(this->tunnel + 1, 0, 1000) != 0)
				{
					this->status = -12;
				}

				ilclient_change_component_state(this->video_render, OMX_StateExecuting);
			}

			this->buf->nFilledLen = this->data_len;
			this->data_len = 0;
			this->buf->nOffset = 0;

			if (this->first_packet)
			{
				this->buf->nFlags = OMX_BUFFERFLAG_STARTTIME;
				this->first_packet = 0;
			}
			else
				this->buf->nFlags = OMX_BUFFERFLAG_TIME_UNKNOWN;

			if (OMX_EmptyThisBuffer(ILC_GET_HANDLE(this->video_decode), this->buf) != OMX_ErrorNone)
			{
				this->status = -6;
			}
			return true;
		}
		return true;
	}

	void Decoder::EndDecode()
	{
		this->frameCount = 0;

		if (this->buf)
		{
			this->buf->nFilledLen = 0;
			this->buf->nFlags = OMX_BUFFERFLAG_TIME_UNKNOWN | OMX_BUFFERFLAG_EOS;
			if (OMX_EmptyThisBuffer(ILC_GET_HANDLE(this->video_decode), this->buf) != OMX_ErrorNone)
				this->status = -20;
		}

		// Wait for EOS from render.
		ilclient_wait_for_event(this->video_render, OMX_EventBufferFlag, 90, 0, OMX_BUFFERFLAG_EOS, 0, ILCLIENT_BUFFER_FLAG_EOS, 10000); //-1

		// Need to flush the renderer to allow video_decode to disable its input port.
		ilclient_flush_tunnels(this->tunnel, 0);

		this->FrameReset();

		std::cout << "Decoder stream terminated..." << std::endl;
	}

	void Decoder::Deinit()
	{
		ilclient_disable_tunnel(this->tunnel);
		ilclient_disable_tunnel(this->tunnel + 1);
		ilclient_disable_port_buffers(video_decode, 130, NULL, NULL, NULL);
		ilclient_teardown_tunnels(this->tunnel);
		std::cout << "1" << std::endl;

		ilclient_state_transition(this->list, OMX_StateIdle);
		ilclient_state_transition(this->list, OMX_StateLoaded);

		std::cout << "2" << std::endl;

		ilclient_cleanup_components(this->list);

		this->client = nullptr;
		std::cout << "4" << std::endl;

		// Free packet assembly buffer.
		delete[] this->truePtr;
	}

	Decoder::~Decoder()
	{
		if (this->in)
			fclose(this->in);
		if (this->out)
		{
			fflush(this->out);
			fclose(this->out);
		}
		this->Deinit();
	}

}
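
For anyone who wants to reason about the buffer-filling loop in Decode() without the OMX headers, the slicing logic can be modelled on its own. This is a simplified sketch, not the code above: `Chunk` and `plan_chunks` are hypothetical names, and the `end_of_frame` field stands in for setting OMX_BUFFERFLAG_ENDOFFRAME | OMX_BUFFERFLAG_ENDOFNAL on the final buffer:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One slice of an encoded frame destined for a single decoder input buffer.
struct Chunk {
    std::size_t offset;   // start of the slice within the frame
    std::size_t length;   // number of bytes in the slice
    bool end_of_frame;    // true only for the slice that completes the frame
};

// Splits a frame of `frame_size` bytes into slices of at most `max_chunk`
// bytes, marking the last slice as the end of the frame.
inline std::vector<Chunk> plan_chunks(std::size_t frame_size, std::size_t max_chunk) {
    assert(max_chunk > 0);
    std::vector<Chunk> chunks;
    std::size_t offset = 0;
    while (offset < frame_size) {
        const std::size_t length = std::min(max_chunk, frame_size - offset);
        const bool last = (offset + length == frame_size);
        chunks.push_back(Chunk{offset, length, last});
        offset += length;
    }
    return chunks;
}
```

With the 8 * 1024 byte cap used in Decode(), a 20000 byte frame becomes three slices, and only the third carries the end-of-frame marker.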
