cmisip
Posts: 100
Joined: Tue Aug 25, 2015 12:38 am

Re: H 264 Motion Detection - Compressed Domain

Thu Jun 29, 2017 4:01 pm

> > I do need the images as motion events are recorded with a series of images that can be played back on the browser.
>
> If so then you should be able to shortcut the decode to only do the CABAC/CAVLC decode rather than the full decode. I'd hazard a guess that just using an ARM core will be lighter weight and more efficient than a full decode and encode in the hardware. A Pi0/1 is likely to struggle, but a Pi 2 or 3 shouldn't bat much of an eyelid ... just checked and for a ballpark, on a Pi3 the CABAC decode on a 10Mbit/s stream is <10% of one CPU core.

That's a great idea. I could use h264_mmal to decode in hardware to generate the images, then do a partial software decode to the CABAC/CAVLC stage to get the motion vectors. I would need to look at the ffmpeg sources to see how the decode pipeline is implemented (I took a cursory look and it is not going to be easy).

I wonder how many concurrent hardware decodes can be done. You mentioned that as many can be done as fit within the 1080p60 pixel rate for decode. The cameras I am using run at 25 frames per second. That would mean 1920x1080x60 (124,416,000) divided by 704x480x25 (8,448,000), which equals about 14 streams at a decode resolution of 704x480. Am I figuring this right? Of course, other things such as memory and CPU use would also have to be considered.
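Making that arithmetic explicit (this only bounds the pixel rate; bitrate, memory and CPU are separate limits), a throwaway check along these lines, with the numbers above hard-coded:

Code: Select all

#include <stdio.h>

int main(void) {
    long long budget     = 1920LL * 1080 * 60;  /* 1080p60 pixel rate: 124416000 */
    long long per_stream = 704LL  * 480 * 25;   /* one 704x480x25 camera: 8448000 */
    printf("streams by pixel rate: %lld\n", budget / per_stream);  /* prints 14 */
    return 0;
}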

The fastest route to results right now though is to use mmal hardware decode followed by mmal hardware encode. I will focus on that for now.

I activated the side-info output and I am getting buffers back which are flagged as side info, with the correct size of 5400 bytes ((704/16 + 1) × (480/16) × 4 = 45 × 30 × 4 = 5400). I read the following on https://www.raspberrypi.org/blog/vector ... stimation/ :
What you are seeing is the magnitude of the vector for each 16×16 macroblock equivalent to the speed at which it is moving! The information comes out of the encoder as side information (it can be enabled in raspivid with the -x flag). It is one integer per macroblock and is ((mb_width+1) × mb_height) × 4 bytes per frame, so for 1080p30 that is 120 × 68 × 4 == 32KByte per frame.
In one of the posts:
Yes each integer in the output stream represents a single 16×16 block….

It is encoded as follows:
struct motion_vector {
short sad;
char y_vector;
char x_vector;
}

So encodes more than just the vector but also the SAD (Sum of Absolute Difference) for the block. You can look at this value to get a feel for how well the vector represents the match to the reference frame (I’ve ignored it in creating the gif)
I am able to get sensible data. I am looking at the SAD info and it has positive and negative values. So, to get the magnitude of movement of each 16x16 macroblock, do I need to check two consecutive frames for macroblocks with the same SAD value and take the delta x and delta y? I assume the "reference frame" is the previous frame. Or are the values of x_vector and y_vector the displacements themselves?

Thanks,
Chris

cmisip
Posts: 100
Joined: Tue Aug 25, 2015 12:38 am

Re: H 264 Motion Detection - Compressed Domain

Sun Jul 02, 2017 9:33 pm

This yields something that I can reliably detect as motion. That is to say, when I am moving something in front of the camera I can see the vector magnitudes jump up, and when motion stops they go back to mostly zero.

Code: Select all

struct motion_vector {
    char x_vector;
    char y_vector; 
    unsigned short sad;
};
However, I can't visualize the shape of the moving object. Motion detection triggers, but when I try to display the coordinates of the motion vectors I can't recognize the shape of the moving object. I don't know how the coordinates of the motion vectors should be derived. Any hints?

I assumed that the motion vector macroblocks cover the entire frame like rows of tiles, with each row having 704/16 + 1 tiles and a total of 480/16 rows, going from left to right and then down one row at a time.

Code: Select all

   printf("SIDEINFO with buffer length of %d\n", buffer->length);
            int i=0;
            //5400 bytes divided into array of 4 byte chunks
            int num_mv=buffer->length/4;
            struct motion_vector mvarray[num_mv];
            //getchar();
            memcpy(mvarray,buffer->data,buffer->length);
            int vcount=0;
            mRGB = cv::Scalar(0,0,0);
            for ( i=0;i < num_mv ; i++) {
                    double magnitude= sqrt ( mvarray[i].x_vector*mvarray[i].x_vector + mvarray[i].y_vector*mvarray[i].y_vector);
                   if (magnitude > 0) {
                    vcount++;

                    int xcoord=(i*16) % 720;
                    int ycoord=((i*16)/720)*16;
                    cv::circle( mRGB, cv::Point(xcoord,ycoord), 1.0, cv::Scalar( 0, 0, 255 ), 1, 8 );
                   } 
            
        
              
Thanks,
Chris

cmisip
Posts: 100
Joined: Tue Aug 25, 2015 12:38 am

Re: H 264 Motion Detection - Compressed Domain

Mon Jul 03, 2017 1:14 am

I think my assumption was pretty near accurate. The video frame is tiled by 16x16 macroblocks, left to right and then top to bottom row by row, with the extra column spilling out of the right side. I just needed a bit more imagination to recognize the moving shapes in front of the camera. I couldn't overlay the vectors on the video because it was too slow to display. However, instead of the upper left of the macroblock, it would probably be better to designate its center as the macroblock's true coordinate, which would be just 8 pixels to the right of and below what I currently use. I will have to work on the overlay to really see if things match up. I also wonder what a good SAD value would be to use as a threshold.
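A minimal sketch of that centre-of-macroblock adjustment, with a SAD gate added on top (this reuses mvarray, num_mv and mRGB from the snippet above; the threshold value is only a placeholder to tune, not a recommendation):

Code: Select all

//704-wide frame is padded to 720, i.e. 45 macroblocks per row
const int padded_width = 704 + 16;
const int SAD_THRESHOLD = 1000;   //placeholder, tune empirically

for (int i = 0; i < num_mv; i++) {
    int mag2 = mvarray[i].x_vector * mvarray[i].x_vector
             + mvarray[i].y_vector * mvarray[i].y_vector;
    if (mag2 == 0 || mvarray[i].sad > SAD_THRESHOLD)
        continue;   //no motion, or a vector that matches its reference poorly

    //plot the centre of the 16x16 block instead of its top-left corner
    int xcenter = (i * 16) % padded_width + 8;
    int ycenter = ((i * 16) / padded_width) * 16 + 8;
    cv::circle(mRGB, cv::Point(xcenter, ycenter), 1, cv::Scalar(0, 0, 255), 1, 8);
}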

Thanks,
Chris

cmisip
Posts: 100
Joined: Tue Aug 25, 2015 12:38 am

Re: H 264 Motion Detection - Compressed Domain

Sat Jul 08, 2017 10:55 pm

Well I did confirm that it works by overlaying the motion vectors on video.

I was able to coordinate sending YUV420 data and receiving motion vectors for each particular frame. The way I did it was to send an EOS to the encoder after I got the side-info buffer; the encoder responded with an EOS. I just wanted to make sure that I had received all the buffers before moving on to the next frame. Is this correct?

The code is posted on github.
https://github.com/cmisip/h264_motion_v ... l/main.cpp

I just wanted to make sure this is the proper way and that there are no unexpected side effects before I integrate this into a bigger project. Here is the snippet.
Thanks,
Chris

Code: Select all

#ifdef __arm__          
            if (decode_mode == mmal) {
                MMAL_BUFFER_HEADER_T *buffer;
                MMAL_BUFFER_HEADER_T *fbuffer;
                
                //send free buffer to encoder
                if ((buffer = mmal_queue_get(pool_out->queue)) != NULL) {
                   fprintf(stderr, "Sending free buffer to encoder output for frame number %d\n", video_frame_count);
                   if (mmal_port_send_buffer(encoder->output[0], buffer) != MMAL_SUCCESS) {
                      fprintf(stderr, "failed to send buffer");
                      return -1;
                   }
                } 
                
                //send buffer with yuv420 data to encoder
                if ((buffer = mmal_queue_get(pool_in->queue)) != NULL)  {
                   mmal_buffer_header_mem_lock(buffer);
                   if ((*vbuffer) && (buffer->data)) {
                       memcpy(buffer->data,*vbuffer,bufsize);  //copy frame->data to buffer->data
                       buffer->length=bufsize;
                   } 
                  mmal_buffer_header_mem_unlock(buffer);
                  
                  int64_t current_time = vcos_getmicrosecs64()/1000;
                  buffer->offset = 0; buffer->pts = buffer->dts = current_time;
           
                  fprintf(stderr, "sending %i YUV420 bytes for frame number %d\n", (int)buffer->length, video_frame_count);
                  if (mmal_port_send_buffer(encoder->input[0], buffer) != MMAL_SUCCESS) {
                       fprintf(stderr, "failed to send buffer");
                       return -1;
                  }
                } 
                
                
                received=false;
                while (!received) {
                
                    if ((buffer = mmal_queue_timedwait(context.queue, 100)) != NULL) {
                        
                    fprintf(stderr, "decoded frame\n");
                    fprintf(stderr, "receiving %i bytes for frame number %d\n", (int)buffer->length, video_frame_count );
                    
                    if(buffer->flags & MMAL_BUFFER_HEADER_FLAG_CONFIG) {
                        printf("HEADER bytes for frame number %d\n", video_frame_count);
                    }
                    else if(buffer->flags & MMAL_BUFFER_HEADER_FLAG_CODECSIDEINFO) {
                        printf("SIDEDATA for frame number %d\n", video_frame_count);
                        
                        mmal_buffer_header_mem_lock(buffer);
                        uint16_t size=buffer->length/4;
                        struct mmal_motion_vector mvarray[size];
                        
                        //copy buffer->data to temporary
                        memcpy(mvarray,buffer->data,buffer->length);
                        mmal_buffer_header_mem_unlock(buffer);
                        
                        
                        *mvect = (uint8_t *) malloc(mvect_size);
                        memset(*mvect,0,mvect_size);
                        memcpy(*mvect, &size, 2);
                        int offset=2;
                        
                        for (int i=0;i < size ; i++) {
                            motion_vector mvt;
                            mvt.height = 16;
                            mvt.width = 16;
                            mvt.xcoord = (i*16) % (video_dec_ctx->width + 16);
                            mvt.ycoord = ((i*16)/(video_dec_ctx->width+16))*16;
                            mvt.sad = mvarray[i].sad;
                            mvt.x_vector = mvarray[i].x_vector;
                            mvt.y_vector = mvarray[i].y_vector;
                            
                            memcpy(*mvect+offset,&mvt,sizeof(motion_vector));
                            offset+=sizeof(motion_vector);
                         } 
                        
                        //Send flush buffer to port
                        if ((fbuffer = mmal_queue_get(pool_in->queue)) != NULL)  {
                             mmal_buffer_header_mem_lock(fbuffer);
                             fbuffer->flags = MMAL_BUFFER_HEADER_FLAG_EOS| MMAL_BUFFER_HEADER_FLAG_FRAME_END | MMAL_BUFFER_HEADER_FLAG_FRAME_START;
                             fbuffer->length=0;
                             mmal_buffer_header_mem_unlock(fbuffer);
           
                             fprintf(stderr, "sending %i flush bytes for frame number %d\n", (int)fbuffer->length, video_frame_count);
                                 if (mmal_port_send_buffer(encoder->input[0], fbuffer) != MMAL_SUCCESS) {
                                    fprintf(stderr, "failed to send flush buffer");
                                    return -1;
                                 } else
                                    printf("Sent flush buffer for frame number %d\n", video_frame_count);
                        }  else 
                            printf("Failed to send flush buffer\n");
                        
                        
                    } else if (buffer->flags & MMAL_BUFFER_HEADER_FLAG_EOS) {
                        received=true;
                        printf("****************************FLUSH BUFFER RESPONSE ***************************\n");
                    }
                    else {
                           printf("DATA for frame number %d\n",video_frame_count);
                    } 
                    
                    mmal_buffer_header_release(buffer);
                    
                   //send free buffer to encoder if we have not received the flush response
                   if (!received) {
                      if ((buffer = mmal_queue_get(pool_out->queue)) != NULL) {
                         if (mmal_port_send_buffer(encoder->output[0], buffer) != MMAL_SUCCESS) {
                            fprintf(stderr, "failed to resend buffer");
                            return -1;
                         } else
                            printf("sending free buffer for frame number %d\n",video_frame_count);
                      } 
                   } 
                   } else
                       printf("NULL buffer\n");
                }    
                
                
            }
#endif

cpunk
Posts: 80
Joined: Thu Jun 29, 2017 12:39 pm

Re: H 264 Motion Detection - Compressed Domain

Sun Jul 09, 2017 1:11 pm

> The video frame is tiled by 16x16 macroblocks left to right and then from top to down
> row by row with the extra column spilling out of the right side.

Yes, that assumption helped me too. Unfortunately I don't know the details of the spill-over logic, so out of caution I'm currently using only frame sizes that are exactly divisible by 16.

cmisip
Posts: 100
Joined: Tue Aug 25, 2015 12:38 am

Re: H 264 Motion Detection - Compressed Domain

Tue Jul 11, 2017 11:25 pm

I managed to get the code up and running on a fork of ZoneMinder.
The code is on github:
https://github.com/cmisip/ZoneMinder

Thanks especially to 6by9 for his insights, though he seems to have decided not to talk to me anymore... in this thread. Perhaps I should have started a new thread. :) Looking forward to your insights in the future.

Regarding the timing of sending YUV420 frames and receiving the side-info motion vectors: the simplest solution really was to keep polling the queue with:

Code: Select all

 while ((buffer = mmal_queue_get(context.queue)) != NULL) {
}
until all the buffers have been processed, and then move on to the next frame. No need for the EOS handshake.
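A slightly fuller sketch of that drain loop, reusing the context.queue, pool_out and encoder handles from the earlier snippet (the key points are releasing every buffer header and keeping the output port fed with empty buffers):

Code: Select all

MMAL_BUFFER_HEADER_T *buffer;
while ((buffer = mmal_queue_get(context.queue)) != NULL) {
    if (buffer->flags & MMAL_BUFFER_HEADER_FLAG_CODECSIDEINFO) {
        //motion vector side data for this frame: copy it out here
    } else if (!(buffer->flags & MMAL_BUFFER_HEADER_FLAG_CONFIG)) {
        //encoded H.264 data (can be ignored if only the vectors are wanted)
    }
    mmal_buffer_header_release(buffer);

    //recycle an empty buffer back to the encoder output port
    MMAL_BUFFER_HEADER_T *out;
    if ((out = mmal_queue_get(pool_out->queue)) != NULL)
        mmal_port_send_buffer(encoder->output[0], out);
}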

As far as the hardware limit for MMAL decoding and encoding H.264 goes, my RPi3 complained when I tried a fourth camera at 640x360.

I am testing 5 cameras on the RPi3 right now, all running at 640x360. The first three cameras use the hardware decoder for H.264 decode and then the hardware encoder for motion vectors. The other two cameras use the software H.264 decoder and its associated side-data motion vectors. This is split between five zmc capture processes and five zma analysis processes. This might be the RPi3's limit.

I could probably add another camera... now we have six. Let's see what happens.

An interesting idea that I have not pursued, which could be an alternative to hardware encode for obtaining the motion vector data, would be, as 6by9 suggested, to perform a partial software decode (to the CABAC stage only) in addition to the hardware decode.

Thanks cpunk for verifying my assumption as well.

Chris

6by9
Raspberry Pi Engineer & Forum Moderator
Posts: 5814
Joined: Wed Dec 04, 2013 11:27 am
Location: ZZ9 Plural Z Alpha, aka just outside Cambridge.

Re: H 264 Motion Detection - Compressed Domain

Wed Jul 12, 2017 10:09 am

I'm not ignoring the thread; I just hadn't seen any questions that required a response when I skim-read the posts (only so much time in the day).
I have just gone back over the thread looking for anything worth answering.

704x480@25 would allow potentially 14 streams for decode within the hardware limits, although it will also depend on the total bitrate.
Please note that encode is using the same hardware block so that will take some of the resource (1 encode = ~1.5 decodes. 1080P30 is about the max on encode).
1080P60 decode is aided by automatically overclocking when the codec detects a "hard stream". Multiple decodes won't be seen as hard, so you may want to manually overclock.

If you want to confirm which motion vectors relate to which image, then look at the timestamp (pts) field in the buffer header. You're presumably filling that in with something from the decoder, so both motion vectors and encoded data should come out with the same value. If the encoded frame takes up more than 1 buffer, then the first comes out with the correct timestamp, whilst the remaining ones get MMAL_TIME_UNKNOWN ((INT64_C(1)<<63))
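A minimal sketch of that pts check, assuming the same callback-to-queue pattern as the snippets earlier in the thread (the helper name is made up):

Code: Select all

#include <stdio.h>
#include <interface/mmal/mmal.h>

//Log the pts of every encoder output buffer so side-info (motion vector)
//buffers can be matched against the encoded frame they describe.
//Continuation buffers of a frame spanning several buffers carry
//MMAL_TIME_UNKNOWN instead of the real timestamp.
static void log_output_buffer(const MMAL_BUFFER_HEADER_T *buffer)
{
    const char *kind =
        (buffer->flags & MMAL_BUFFER_HEADER_FLAG_CODECSIDEINFO) ? "vectors" :
        (buffer->flags & MMAL_BUFFER_HEADER_FLAG_CONFIG)        ? "config"  :
                                                                  "frame";
    if (buffer->pts == MMAL_TIME_UNKNOWN)
        printf("%s buffer, pts unknown (continuation)\n", kind);
    else
        printf("%s buffer, pts=%lld\n", kind, (long long)buffer->pts);
}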

I have no extra information as to interpreting the motion vector data. It should be an array of your struct motion_vector vectors[(width/16)+1][height/16].
The encoder does clear any pixels that aren't in the active image but are still present in a macroblock. Encoding those partial macroblocks is going to be compromised as those cleared pixels still make up part of the motion vectors but obviously don't move, so in many cases they'll end up being mainly encoded via residuals.
Software Engineer at Raspberry Pi Trading. Views expressed are still personal views.
Please don't send PMs asking for support - use the forum.
I'm not interested in doing contracts for bespoke functionality - please don't ask.

cmisip
Posts: 100
Joined: Tue Aug 25, 2015 12:38 am

Re: H 264 Motion Detection - Compressed Domain

Thu Jul 13, 2017 12:31 am

@6by9
Good to hear from you again.

For some reason, I can only use the hardware decoder and encoder on three cameras. If I try the fourth camera, I get errors.

I am using 640x360 at 24fps right now. So that yields 124,416,000 / 5,529,600, or 22 streams. If I count each encode and decode separately, that's still only 6 streams in use. The bitrate set in each camera is 8192, but I don't know how that factors into the calculation.

640x3 = 1920
360x3 = 1080

Is that just a coincidence? I had initially assumed that the limit was because the combined resolution of the three cameras equals 1920x1080.

Is there a setting in encode and decode that needs to be set to allow encoding and decoding more streams? I would like to get 8 cameras working in hardware decode and encode if that is possible.

I am using ffmpeg's h264_mmal to do the decode.

Thanks,
Chris

cmisip
Posts: 100
Joined: Tue Aug 25, 2015 12:38 am

Re: H 264 Motion Detection - Compressed Domain

Sat Jul 15, 2017 9:01 pm

I have been trying different combinations and was able to activate 4 cameras at 704x480 using hardware decode and encode after I adjusted the GPU memory split to 256 MB. It is a balancing act: at this resolution, zmc and zma consume 10-13% of memory each, which leaves less than 20% for other processes.
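For anyone repeating this: the GPU memory split is set with gpu_mem in /boot/config.txt (a reboot is needed for it to take effect), e.g.

Code: Select all

# /boot/config.txt
gpu_mem=256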

I will do more tests.

Chris

cmisip
Posts: 100
Joined: Tue Aug 25, 2015 12:38 am

Re: H 264 Motion Detection - Compressed Domain

Sat Jul 15, 2017 9:40 pm

Now I have 5 cameras at 640x360 running hardware decode and encode. I don't dare add another one because it might start swapping. Load:

Code: Select all

 load average: 1.59, 1.69, 1.98
Memory utilization is at:

Code: Select all

KiB Mem:    766876 total,   718944 used,    47932 free,    31084 buffers
KiB Swap:   102396 total,    40036 used,    62360 free.   329208 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                        
19894 www-data  20   0  280292  80156  63280 S  16.9 10.5   1:18.85 zmc                                                                                                                                            
19855 www-data  20   0  280272  80072  63212 R  16.5 10.4   1:21.25 zmc                                                                                                                                            
19769 www-data  20   0  280232  80040  63176 S  15.9 10.4   1:17.30 zmc                                                                                                                                            
19808 www-data  20   0  280276  80084  63224 S  15.5 10.4   1:17.93 zmc                                                                                                                                            
19772 www-data  20   0  234256  67896  55272 S  14.9  8.9   1:09.70 zma                                                                                                                                            
19832 www-data  20   0  233444  67056  55312 S  14.6  8.7   1:08.00 zma                                                                                                                                            
19858 www-data  20   0  233444  66940  55196 S  14.2  8.7   1:07.83 zma                                                                                                                                            
19898 www-data  20   0  233444  66940  55196 R  14.2  8.7   1:07.93 zma                                                                                                                                            
20854 www-data  20   0  280248  79988  63176 S   9.3 10.4   0:31.20 zmc                                                                                                                                            
20865 www-data  20   0  233444  66924  55180 S   7.9  8.7   0:26.20 zma      
I'm going to have to look at the code again and see if I can reduce memory even further so I can add some more cameras.

Chris

jblumenkamp
Posts: 4
Joined: Sat Aug 11, 2018 11:46 am

Re: H 264 Motion Detection - Compressed Domain

Sat Aug 11, 2018 5:50 pm

Hi everyone!

I stumbled upon this a short time ago. I never really found a simple data interpreter for the raspivid -x option to visualize the optical flow, so I implemented one in Python. It would be really nice if there were more documentation about how to interpret the motion vector data...

Anyway, you can find it here:
https://github.com/janblumenkamp/raspivid-motionvectors

Here are two GIFs, one showing dense optical flow and one showing the vectors, both recorded at a resolution of 640x480:

[GIF: dense optical flow]

[GIF: motion vectors]

As you can see from the GIFs, I encountered a small problem. I don't really know why, or whether it is due to my Raspberry Pi Zero setup, but there are sometimes some really weird artifacts. In the images I am moving my hand in front of the camera, and you can clearly see the hand. But sometimes the whole image becomes coloured, meaning all vectors "trigger" and show motion that isn't actually there.

I am wondering what causes this. Maybe you can check whether you see similar problems with your setup? Is it due to the compression being too high, or something like that?

Thanks!
Jan

EDIT: I have now tried the Raspberry Pi Cam v2 and with it everything seems fine. Please let me know if you have any ideas about what settings to pass to raspivid in order to get rid of these artifacts with the v1.3 cam.
