pmcg521
Posts: 34
Joined: Thu Jun 08, 2017 2:41 pm

Flush L1 cache BCM2837B0

Sat Jun 22, 2019 7:13 am

I'm running a 32-bit bare-metal OS on a Pi 3 B+. The MMU is enabled so that atomic operations work across cores, and all memory except the peripheral region is cached. With caching enabled, USB transfers no longer work: the ethernet device that was previously recognized now fails to initialize because of transfer errors. I suspect that the initial LAN7800 DMA transfers are using stale values because the L1 cache is enabled. So I wrote a (temporary, dumb) routine to flush the L1 data cache; see below. It iterates through each memory location (hopefully hitting every cache line) and calls the clean/flush operation on it. I do this because the Cortex-A53 documentation says the data cache is PIPT (physically indexed, physically tagged).

        MOV     r0, #0				;@ Start at zero
cloop:
        MCR     p15, 0, r0, c7, c10, 2	;@ Clean and flush the line by set/way (assuming r0 is the correct cache line index)
        ADD     r0, r0, #0x40			;@ Increment by 64 bytes (line is 16 words, 16*4=64 bytes)
        CMP     r0, #0x3F000000		;@ Stop at peripherals
        BNE     cloop					;@ If not branch back to cloop

I used the ARM Cortex-A53 manual to find the coprocessor operations. The increment of 0x40 made sense to me because the data cache is 4-way set associative (according to the manual and the contents of the CCSIDR register) and 16 words * 4 cache lines = 64 (0x40). I call the routine before each DMA transfer, but no luck. How can I get USB working again with the MMU enabled? I'm wondering whether it's a coherency/DMA issue or something else to do with the MMU.

Edit:
Worth noting that the memory translation tables are set up according to page 23 of this document (0x00015C06 is crucial):
http://infocenter.arm.com/help/topic/co ... essors.pdf
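
A note on the operand used in the loop above: as far as the ARM architecture manual describes them, the set/way cache operations (such as `c7, c10, 2`) take an encoded set/way/level value in the register rather than a memory address. The sketch below shows that encoding in C, assuming a 32 KB, 4-way L1 data cache with 64-byte lines; on real hardware the geometry should be read from CCSIDR rather than hard-coded. `setway_operand`, `l1d_line_count` and the `L1D_*` constants are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed L1 D-cache geometry (32 KB, 4-way, 64-byte lines). */
#define L1D_LINE_SHIFT 6u    /* log2(64-byte line)          */
#define L1D_WAYS       4u
#define L1D_SETS       128u  /* 32 KB / (4 ways * 64 bytes) */
#define L1D_WAY_SHIFT  30u   /* 32 - log2(4 ways)           */

/* Build the register value for a set/way operation.
 * Way goes in the top bits, set starts at the line-size bit,
 * and the cache level (0 = L1) occupies bits [3:1]. */
static inline uint32_t setway_operand(uint32_t set, uint32_t way, uint32_t level)
{
    assert(set < L1D_SETS && way < L1D_WAYS && level < 7u);
    return (way << L1D_WAY_SHIFT) | (set << L1D_LINE_SHIFT) | (level << 1);
}

/* Number of lines a full set/way walk has to visit. */
static inline uint32_t l1d_line_count(void)
{
    return L1D_SETS * L1D_WAYS;
}
```

Under these assumptions, an operand stepped by 0x40 from zero and stopped at 0x3F000000 only ever names sets in way 0, because the way number sits in the top two bits; the other three ways would not be reached.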

Ultibo
Posts: 158
Joined: Wed Sep 30, 2015 10:29 am
Location: Australia

Re: Flush L1 cache BCM2837B0

Wed Jun 26, 2019 12:01 am

pmcg521 wrote:
Sat Jun 22, 2019 7:13 am
How can I get the USB back up with MMU enabled? I'm wondering if it's a coherency/DMA issue or something else to do with the MMU.
Your description and code snippet only talk about cleaning or flushing the cache, but DMA operations with the USB controller involve two different scenarios.

After writing data to memory that will be transferred by DMA to USB, you need to clean that data from the cache so that it reaches memory.

Before reading data from memory that has been transferred by DMA from USB, you need to invalidate that data in the cache.

You should find the invalidate operations in the same place in the manual as the clean operations. Take care when invalidating: restrict the invalidation to the smallest possible region that covers the data in question and nothing else, or you risk corrupting unrelated data.
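
The two scenarios could be sketched in C roughly as follows. This is a minimal sketch, not code from any existing driver: it assumes a 64-byte cache line and 32-bit ARM inline assembly for the DCCMVAC and DCIMVAC operations, and the names `cache_range_op`, `dma_clean_for_device` and `dma_invalidate_for_cpu` are hypothetical. Off-target builds get empty stand-ins so the range arithmetic can be checked on a host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64u /* assumed L1 D-cache line size in bytes */

typedef void (*line_op_t)(uintptr_t va);

#if defined(__arm__)
/* DCCMVAC: clean one data cache line by VA to PoC. */
static inline void dccmvac(uintptr_t va)
{ __asm__ volatile("mcr p15, 0, %0, c7, c10, 1" : : "r"(va) : "memory"); }
/* DCIMVAC: invalidate one data cache line by VA to PoC. */
static inline void dcimvac(uintptr_t va)
{ __asm__ volatile("mcr p15, 0, %0, c7, c6, 1" : : "r"(va) : "memory"); }
static inline void dsb(void)
{ __asm__ volatile("dsb sy" : : : "memory"); }
#else
/* Host stand-ins: do nothing, so only the loop logic is exercised. */
static inline void dccmvac(uintptr_t va) { (void)va; }
static inline void dcimvac(uintptr_t va) { (void)va; }
static inline void dsb(void) {}
#endif

/* Apply `op` to every cache line overlapping [start, start + size).
 * Returns the number of lines touched. */
static inline size_t cache_range_op(uintptr_t start, size_t size, line_op_t op)
{
    assert(op != NULL);
    if (size == 0)
        return 0;
    uintptr_t addr = start & ~(uintptr_t)(CACHE_LINE - 1u); /* round down */
    uintptr_t end = start + size;
    size_t lines = 0;
    for (; addr < end; addr += CACHE_LINE) {
        op(addr);
        lines++;
    }
    dsb(); /* wait for the maintenance operations to complete */
    return lines;
}

/* CPU wrote the buffer, device will read it: clean. */
static inline size_t dma_clean_for_device(const void *buf, size_t size)
{ return cache_range_op((uintptr_t)buf, size, dccmvac); }

/* Device wrote the buffer, CPU will read it: invalidate. */
static inline size_t dma_invalidate_for_cpu(const void *buf, size_t size)
{ return cache_range_op((uintptr_t)buf, size, dcimvac); }
```

With helpers like these, `dma_clean_for_device` would be called after filling a buffer and before starting an OUT transfer, and `dma_invalidate_for_cpu` after an IN transfer completes and before the CPU reads the buffer. Keeping DMA buffers aligned to, and sized in multiples of, the line size avoids touching a line that is shared with unrelated data.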

pmcg521
Posts: 34
Joined: Thu Jun 08, 2017 2:41 pm

Re: Flush L1 cache BCM2837B0

Thu Jul 11, 2019 10:15 pm

Thanks, Ultibo. This led me in the right direction, but an issue remains. The initialization of USB devices now gets further than before. I am using the DCIMVAC instruction as follows:

DMA transfer:

...
// OUT transfers, copy data to DMA buffer
memcpy(dma_buf[chan], data, transfer.size);
_inval_area((uint32_t)dma_buf[chan]);

DCIMVAC and DCCMVAC routines:

_inval_area:
        mcr     p15, 0, r0, c7, c6, 1         ;@ Invalidate line by VA to PoC
        bx      lr
        
_flush_area:
        mcr     p15, 0, r0, c7, c10, 1         ;@ Clean line by VA to PoC
        bx      lr
This single invalidation call resolved the hang I was getting before. But now, when I read the LAN7800 device's configuration descriptor, the number of endpoints is lower than expected (there should be 5, but only 4 arrive over DMA). It seems that there is a stale value in the transfer. I've tried using the above _flush_area routine instead, and I've tried your suggestion to flush after writing and invalidate after reading (which should work in theory), but the single invalidate call is the only thing that changes the behavior.

In the Cortex-A53 Technical Reference Manual, section 6.2.4 states:
"DCIMVAC operations in AArch32 and DC IVAC instructions in AArch64 perform an invalidate of the target address. If the data is dirty within the cluster then a clean is performed before the invalidate."
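
To tell a short or stale buffer apart from a descriptor that really has fewer endpoints, one option is to walk the received bytes and compare the result against the wTotalLength field of the configuration descriptor. A minimal sketch, relying only on the standard USB descriptor layout (bLength, bDescriptorType, endpoint type 0x05); `count_endpoints` and `config_total_length` are hypothetical helpers, not part of the code in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define USB_DT_ENDPOINT 0x05u /* bDescriptorType for an endpoint descriptor */

/* Count endpoint descriptors found in the first `len` bytes of a
 * configuration descriptor set. Stops at a malformed or truncated entry. */
static inline int count_endpoints(const uint8_t *buf, size_t len)
{
    int count = 0;
    size_t off = 0;
    while (off + 2 <= len) {
        uint8_t blen = buf[off];          /* bLength of this descriptor */
        if (blen < 2 || off + blen > len)
            break;                        /* malformed or cut short */
        if (buf[off + 1] == USB_DT_ENDPOINT)
            count++;
        off += blen;
    }
    return count;
}

/* wTotalLength: little-endian 16-bit field at offset 2 of the
 * configuration descriptor. The caller must supply at least 4 bytes. */
static inline uint16_t config_total_length(const uint8_t *buf)
{
    return (uint16_t)(buf[2] | (buf[3] << 8));
}
```

If `config_total_length` reports more bytes than the walk consumed, or the count changes when the buffer is read again after an invalidate, that would point at the buffer contents rather than the device.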
