Posts: 16
Joined: Fri Jul 20, 2012 6:50 am

Using DMA for a FillBlt

Thu Feb 07, 2013 3:40 am

I'm having trouble working with the DMA controller. I would like to use a DMA channel to fill memory (the video buffer) with a constant value. It all sort of works: using the 2D bit and the non-advance bit for the source, I can fill a pattern into memory.

The problem is that the DMA always wants to fetch 8 DWORDs from the source, even without advancing. I thought I had a problem with the 128-bit vs 32-bit copy width, but I've cleared the width bit for both source and destination in the transfer descriptor. I suspect I'm still getting something wrong here, because I don't see any change whether these width bits are set or cleared.

This is an issue because I'm using 24-bit color (3 bytes/pixel), and a whole number of pixels doesn't fit into 8 DWORDs (32 bytes is not a multiple of 3), resulting in a striped pattern.
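
To make the mismatch concrete, here is a minimal C sketch of the arithmetic. The helper names are illustrative only and are not part of any DMA API; the 32-byte figure is simply the 8-DWORD fetch described above.

```c
#include <assert.h>
#include <stdint.h>

/* Greatest common divisor, used to derive the least common multiple. */
static uint32_t gcd_u32(uint32_t a, uint32_t b)
{
    while (b != 0) {
        uint32_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Bytes left over when one fetch is filled with whole pixels. */
static uint32_t leftover_bytes(uint32_t fetch_bytes, uint32_t bytes_per_pixel)
{
    assert(bytes_per_pixel != 0);
    return fetch_bytes % bytes_per_pixel;
}

/* Shortest pattern that is both a whole number of fetches and of pixels. */
static uint32_t seamless_pattern_bytes(uint32_t fetch_bytes, uint32_t bytes_per_pixel)
{
    assert(fetch_bytes != 0 && bytes_per_pixel != 0);
    return fetch_bytes / gcd_u32(fetch_bytes, bytes_per_pixel) * bytes_per_pixel;
}
```

With a 32-byte fetch and 3 bytes per pixel, leftover_bytes(32, 3) is 2, so the pattern drifts by two bytes on every repeat. The shortest pattern that repeats cleanly is seamless_pattern_bytes(32, 3) = 96 bytes, i.e. 32 pixels. At 4 bytes per pixel the leftover is 0.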

Thoughts? Thanks.


Posts: 876
Joined: Wed May 16, 2012 6:32 pm

Re: Using DMA for a FillBlt

Thu Feb 07, 2013 4:57 am

Maybe that's because it's not 24-bit, but 32-bit.
As in XXRRGGBB, where XX is the alpha channel.
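
If the framebuffer really is 32 bits per pixel, one pixel is exactly one DWORD, so an 8-DWORD source block holds 8 whole pixels and the stripes should go away. A rough sketch, assuming the XXRRGGBB layout above is the value of the 32-bit word; the function names are hypothetical, and the actual channel order depends on how the framebuffer was requested:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack 8-bit channels into one 32-bit XXRRGGBB word (XX = alpha). */
static uint32_t pack_xrgb8888(uint8_t alpha, uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t)alpha << 24) | ((uint32_t)r << 16) |
           ((uint32_t)g << 8) | (uint32_t)b;
}

/* Fill a source block with copies of one pixel, e.g. 8 words for one fetch. */
static void fill_source_block(uint32_t *block, size_t words, uint32_t pixel)
{
    assert(block != NULL);
    for (size_t i = 0; i < words; i++) {
        block[i] = pixel;
    }
}
```

For example, pack_xrgb8888(0xFF, 0x12, 0x34, 0x56) gives 0xFF123456.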

Posts: 108
Joined: Mon Sep 10, 2012 4:14 pm

Re: Using DMA for a FillBlt

Thu Feb 07, 2013 12:21 pm

I've yet to get to driving DMA for screen updates, although I want to see the effect it would have on my console scrolling, which feels a little slow at the moment. From what I understand, the memory read might be 32 or 128 bits, but the write should only write the number of bytes specified; it should not have to write a full 32 or 128 bits. My console is currently using 16-bit colour on the composite video out, so this might be an issue for me too, although I suspect that a scroll of the full screen width will end up on a 128-bit boundary anyway.

Here's what I was expecting to code for a screen rectangle copy: fbSourceAddr is the address of the first byte of the source data (the first pixel of the second row in the framebuffer for a scroll operation) and fbDestAddr is the address of the first destination pixel in the framebuffer. Does this match what you're doing?

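DMAControlBlock cbCopy =
(height << 16) + width,   /* TXFR_LEN: YLENGTH in the upper half, XLENGTH in the lower half */
((screen width - rect width) << 16) + (screen width - rect width),   /* STRIDE: D_STRIDE upper, S_STRIDE lower */
0, 0, 0   /* NEXTCONBK and the two reserved words */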
Posts: 16
Joined: Fri Jul 20, 2012 6:50 am

Re: Using DMA for a FillBlt

Fri Feb 08, 2013 6:25 pm

Did you ever read a post you'd written and realize how stupid the question was? That's me. Of course the 32 bit and 128 bit transfers are the same. A 24 bit pixel doesn't fit evenly in either block size.

I solved the problem by linking two DMA blocks. The first replicates the 24-bit pixel across the first row: I use a 'horizontal' length of 3 bytes and a 'vertical' length equal to the width of the row in pixels. The source stride is set to -3 so that the same 3 source bytes are read every time, and the destination stride is set to 0 so that the copies simply continue along the same row.

The second DMA block then replicates the first row down the rest of the block. I link the two blocks and invoke the DMA. The only interesting thing is that the destination stride needs to be the video buffer stride minus the byte length of the row. The source stride is set to the two's complement (the negative) of the byte width of a row, so that every pass reads the first row again.
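
The two-block scheme can be sketched and checked in plain C. Everything below is an assumption made for illustration: fill_cb is a simplified stand-in for a real control block (no TI word, pointers instead of bus addresses), and model_run_chain is a software model of 2D mode with source and destination increment enabled, not the hardware. In particular, the model copies exactly ylength rows, which should be checked against the datasheet and real behaviour before reusing the numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified control block: only the fields the fill needs. */
typedef struct fill_cb {
    const uint8_t  *src;      /* SOURCE_AD */
    uint8_t        *dst;      /* DEST_AD */
    uint16_t        xlength;  /* bytes copied per row */
    uint16_t        ylength;  /* rows copied (model assumption) */
    int16_t         s_stride; /* signed bytes added to source after each row */
    int16_t         d_stride; /* signed bytes added to destination after each row */
    struct fill_cb *next;     /* NEXTCONBK; NULL ends the chain */
} fill_cb;

/* Software model of a chain of 2D transfers. */
static void model_run_chain(const fill_cb *cb)
{
    for (; cb != NULL; cb = cb->next) {
        ptrdiff_t s_off = 0, d_off = 0;
        for (uint16_t y = 0; y < cb->ylength; y++) {
            memmove(cb->dst + d_off, cb->src + s_off, cb->xlength);
            s_off += cb->xlength + cb->s_stride; /* stride counts from row end */
            d_off += cb->xlength + cb->d_stride;
        }
    }
}

/* Build the two linked blocks for a 24-bit fill of width_px x height_px. */
static void build_fill(fill_cb cb[2], const uint8_t colour[3], uint8_t *dest,
                       unsigned width_px, unsigned height_px, unsigned pitch)
{
    unsigned row_bytes = width_px * 3;

    assert(width_px > 0 && height_px > 0);
    assert(row_bytes <= pitch && row_bytes <= 0x7FFFu);
    assert(pitch - row_bytes <= 0x7FFFu);

    /* Block 1: replicate the 3-byte pixel across the first row. */
    cb[0].src = colour;
    cb[0].dst = dest;
    cb[0].xlength = 3;
    cb[0].ylength = (uint16_t)width_px;
    cb[0].s_stride = -3;  /* rewind to the same 3 source bytes */
    cb[0].d_stride = 0;   /* keep writing along the same row */
    cb[0].next = &cb[1];

    /* Block 2: copy the first row into every remaining row. */
    cb[1].src = dest;
    cb[1].dst = dest + pitch;
    cb[1].xlength = (uint16_t)row_bytes;
    cb[1].ylength = (uint16_t)(height_px - 1);
    cb[1].s_stride = (int16_t)-(int)row_bytes;      /* re-read the first row */
    cb[1].d_stride = (int16_t)(pitch - row_bytes);  /* skip to the next row */
    cb[1].next = NULL;
}
```

Calling build_fill and then model_run_chain on a small test buffer fills the rectangle with the colour and leaves any padding bytes at the end of each row untouched, which matches the strides described above.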

Posts: 26
Joined: Wed Jul 04, 2012 9:09 pm

Re: Using DMA for a FillBlt

Sun Feb 10, 2013 4:54 pm

This is how I transfer a "sprite" from contiguous memory to the screen, using a single DMA setup - it's broadly similar to what you describe in your previous post, but I thought some code might make it clearer for other folk.

In my code I set up structs/unions using the names of the registers and blocks from the BCM reference document to make it easier to follow what's happening.

This isn't the complete DMA code of course, but I think it illustrates the point.


   dma_cb->ti.bits.waits = 0;
   dma_cb->ti.bits.permap = 0;
   dma_cb->ti.bits.burst_length = 0;
   dma_cb->ti.bits.src_ignore = 0;
   dma_cb->ti.bits.src_dreq = 0;
   dma_cb->ti.bits.src_width = 1;
   dma_cb->ti.bits.src_inc = 1;
   dma_cb->ti.bits.dest_ignore = 0;
   dma_cb->ti.bits.dest_dreq = 0;
   dma_cb->ti.bits.dest_width = 1;
   dma_cb->ti.bits.dest_inc = 1;
   dma_cb->ti.bits.wait_resp = 0;
   dma_cb->ti.bits.tdmode = 1;
   dma_cb->ti.bits.inten = 0;

   dma_cb->source_ad = (unsigned int)src_addr;
   dma_cb->dest_ad = (unsigned int)dest_addr;
   dma_cb->txfr_len.bits.xlength = 16;
   dma_cb->txfr_len.bits.ylength = 15;
   dma_cb->stride.bits.d_stride = 640 - 16;
   dma_cb->stride.bits.s_stride = 0;
   dma_cb->nextconbk = 0;
In this example the framebuffer is set up as 640x480x8, and xlength and ylength are the pixel dimensions of the sprite. The d_stride is the amount to skip forward in the framebuffer after each row is copied: the framebuffer width minus the x length of the copy, since the DMA seems to count from the end of the row just written, not from its start. The s_stride is set to 0 because my sprite data is in a contiguous block of memory.

For other bit depths, just remember to adjust xlength and d_stride accordingly. ylength will always be the number of "rows" in your sprite/transfer.

The sprite will appear at dest_ad, which can be calculated as the framebuffer base address plus something like (y * pitch) + x.
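
For other bit depths the same numbers can be derived in one place. This is only a sketch: sprite_geometry and sprite_geometry_for are hypothetical names, and it assumes the pitch is exactly the framebuffer width times the bytes per pixel (no padding at the end of each row), which may not hold for every mode.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t dest_offset; /* byte offset of the sprite's top-left pixel */
    uint16_t xlength;     /* bytes copied per row */
    int16_t  d_stride;    /* bytes skipped after each row */
} sprite_geometry;

/* Derive dest offset, xlength and d_stride for a sprite at (x_px, y_px). */
static sprite_geometry sprite_geometry_for(uint32_t x_px, uint32_t y_px,
                                           uint32_t sprite_w_px,
                                           uint32_t fb_w_px,
                                           uint32_t bytes_per_pixel)
{
    uint32_t pitch = fb_w_px * bytes_per_pixel;         /* assumed: no row padding */
    uint32_t row_bytes = sprite_w_px * bytes_per_pixel;
    sprite_geometry g;

    assert(x_px + sprite_w_px <= fb_w_px);  /* sprite must fit on the row */
    assert(row_bytes <= 0xFFFFu);           /* xlength is a 16-bit field */
    assert(pitch - row_bytes <= 0x7FFFu);   /* d_stride is signed 16-bit */

    g.dest_offset = y_px * pitch + x_px * bytes_per_pixel;
    g.xlength = (uint16_t)row_bytes;
    g.d_stride = (int16_t)(pitch - row_bytes);
    return g;
}
```

For the 640x480x8 case above, a 16-pixel-wide sprite at (100, 50) gives an offset of 32100, an xlength of 16 and a d_stride of 624, the same 640 - 16 as in the code. At 16 bits per pixel the same sprite gives an offset of 64200, an xlength of 32 and a d_stride of 1248. Add the offset to the framebuffer base address to get dest_ad.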

