leiradel
Posts: 32
Joined: Wed Feb 13, 2019 10:38 pm

MMU Descriptors

Mon Apr 01, 2019 7:33 pm

Hi everyone,

I'm writing code to turn the MMU on, but I'm struggling to come up with the values for the descriptors...

I'm trying to implement something similar to what dwelch did with his implementation [1], but I want to map the peripherals, the VC memory, and also memory to use when exchanging messages with the VC. I've read a lot of stuff on the Internet, including the ARM official documentation, but I'm still confused.

dwelch's code gave me the formula to create descriptors for normal RAM, but what values should I use for the peripherals and for the VC memory that I want to access from ARM code? bzt mentioned using nGnRE for the peripherals, but that's tied to the MAIR register, and I believe it doesn't work with short descriptors, which I believe is what I'm using?

I also took a look at bzt's code [2], but it's for AArch64 and I'm using a RPi 1.

If someone can shed some light here it would be great. For the record, I want to implement a memory layout like this [3]. Everything except the user code should be easy to set up, as it's just fixed entries in the TLB.

Thanks in advance,

Andre

[1] https://github.com/dwelch67/raspberrypi/blob/master/mmu
[2] https://github.com/bztsrc/raspi3-tutori ... tualmemory
[3] https://gist.github.com/leiradel/67059c ... 33b505f887

LdB
Posts: 1224
Joined: Wed Dec 07, 2016 2:29 pm

Re: MMU Descriptors

Tue Apr 02, 2019 12:27 am

The setup for the ARMv6 MMU is very different from the ARMv8 AArch64 one: there is no MAIR register, and access control works on domains. An AArch64 example such as bzt's will do nothing but mislead and confuse you.

If you like David's version and can follow it, use it, and just map the peripheral area as non-cached, which I believe is just 0x0000 in his flags.
For the VC4 memory it's optional: non-cached is slower but easier on your primitives. Cached gives you more speed, but if you do things like setpixel and getpixel, you need cache maintenance in your graphics primitives.

leiradel

Re: MMU Descriptors

Tue Apr 02, 2019 7:44 pm

LdB wrote:
Tue Apr 02, 2019 12:27 am
The setup for the ARMv6 MMU is very different from the ARMv8 AArch64 one: there is no MAIR register, and access control works on domains.

Thanks, this already helps a lot since I can focus on a different path.
LdB wrote:
Tue Apr 02, 2019 12:27 am
If you like David's version and can follow it, use it, and just map the peripheral area as non-cached, which I believe is just 0x0000 in his flags.

Hm, he uses 0x0000 with all the descriptors. In mmu_section, the flags are ORed with 0xc02, so 0x0000 gives:

  • Secure (NS=0)
  • Global (nG=0)
  • Non-shared (S=0)
  • Strongly ordered (TEX=0b000, C=0, B=0)
  • Kernel and user read/write (AP=0b11)
  • ECC disabled (P=0)
  • Domain 0
  • Executable (XN=0)
  • 1 MB section (Bits[1:0]=0b10)
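The bit positions behind the list above can be sketched in C. This is a minimal sketch that assumes the ARMv6 section descriptor format with subpages disabled; the macro and function names are made up for illustration and are not from dwelch's code. Only the `0xC00 | flags | 2` combination mirrors what his mmu_section does.

```c
#include <stdint.h>

/* ARMv6 first-level section descriptor fields (short-descriptor format).
 * Names are illustrative only. */
#define SECT_TYPE       (2u << 0)                        /* bits[1:0] = 0b10: 1 MB section */
#define SECT_B          (1u << 2)                        /* bufferable */
#define SECT_C          (1u << 3)                        /* cacheable */
#define SECT_XN         (1u << 4)                        /* execute never */
#define SECT_DOMAIN(d)  (((uint32_t)(d) & 0xFu) << 5)    /* bits[8:5] */
#define SECT_P          (1u << 9)                        /* ECC enable */
#define SECT_AP(ap)     (((uint32_t)(ap) & 0x3u) << 10)  /* bits[11:10] */
#define SECT_TEX(t)     (((uint32_t)(t) & 0x7u) << 12)   /* bits[14:12] */
#define SECT_APX        (1u << 15)
#define SECT_S          (1u << 16)                       /* shared */
#define SECT_NG         (1u << 17)                       /* not global */
#define SECT_NS         (1u << 19)                       /* non-secure */

/* Same arithmetic as the descriptor built in mmu_section:
 * physical base | AP=0b11 | caller flags | section type. */
static uint32_t section_descriptor(uint32_t padd, uint32_t flags)
{
    return (padd & 0xFFF00000u) | SECT_AP(3) | flags | SECT_TYPE;
}
```

With flags equal to 0x0000 every optional field stays zero, which is how the list above falls out of the constant 0xc02.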
For RAM, I wonder whether the memory should really be Strongly ordered. I believe it should be one of:

  1. Outer and Inner Write-Through, No Allocate on Write
  2. Outer and Inner Write-Back, No Allocate on Write
  3. Outer and Inner Noncacheable
  4. Outer and Inner Write-Back, Allocate on Write
Option #3 is noncacheable, so we can cross it out. Option #4 is not really supported, since the ARM1176JZF-S doesn't support allocate on write. This leaves me with write-through and write-back. Wikipedia says:

  • Write-through: write is done synchronously both to the cache and to the backing store.
  • Write-back (also called write-behind): initially, writing is done only to the cache. The write to the backing store is postponed until the modified content is about to be replaced by another cache block.
so I think I should go with write-back for regular RAM, and with write-through for VC RAM. Maybe I should also specify Shared for VC RAM? I'll also have some normal, non-cacheable memory to use with the mailbox property interface.
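For reference, the memory types discussed above map onto the TEX, C, and B bits roughly as follows. This is a sketch under the assumption that TEX remap is disabled and that the bits sit at their section-descriptor positions (TEX in bits[14:12], C in bit 3, B in bit 2); the macro names are hypothetical.

```c
#include <stdint.h>

/* Pack TEX[2:0], C and B into their section-descriptor positions. */
#define MEM_ATTR(tex, c, b)                 \
    ((((uint32_t)(tex) & 7u) << 12) |       \
     (((uint32_t)(c)   & 1u) << 3)  |       \
     (((uint32_t)(b)   & 1u) << 2))

/* Encodings with TEX remap disabled (names are illustrative). */
#define MEM_STRONGLY_ORDERED  MEM_ATTR(0, 0, 0)
#define MEM_SHARED_DEVICE     MEM_ATTR(0, 0, 1)
#define MEM_WRITE_THROUGH     MEM_ATTR(0, 1, 0)  /* outer and inner, no allocate on write */
#define MEM_WRITE_BACK        MEM_ATTR(0, 1, 1)  /* outer and inner, no allocate on write */
#define MEM_NONCACHEABLE      MEM_ATTR(1, 0, 0)  /* outer and inner noncacheable */
```

Each value can be ORed into a section descriptor as the attribute part of its flags; the S bit for Shared memory is separate (bit 16) and is not covered here.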

For the devices, bzt said the peripherals should be mapped with Device-nGnRE on AArch64, which means:

  1. Non-gathering (nG): the number and size of accesses on the memory bus performed to that location must exactly match the number and size of explicit accesses in the code. It means the CPU cannot merge two byte accesses into one half-word access.
  2. Non-reordering (nR): accesses within the same block always appear on the bus in program order. The size of this block is implementation defined.
  3. Early-write Acknowledgement (E): it is permissible for a buffer in the interconnect logic to signal write acceptance, in advance of the write actually being received by the end device.
I don't think #1 and #3 are applicable to the ARM1176JZF-S, are they? Non-reordering can be achieved with TEX=0b000, C=0, and B=0, which is what David is doing.

I also want to use the eXecute Never bit with all data sections and pages, and with the peripherals.
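A peripheral mapping along these lines might look like the sketch below: Strongly ordered (TEX=0b000, C=0, B=0) sections with XN set. The function name, the table layout (a 4096-entry first-level table passed in by the caller), and the choice of AP=0b11 to match dwelch's flags are all assumptions for illustration. As far as I know, XN is only honored in the ARMv6 descriptor format with subpages disabled, and only for domains set to client access.

```c
#include <stdint.h>

#define SECTION_SIZE  0x00100000u   /* 1 MB */
#define SECT_TYPE     0x2u          /* bits[1:0] = 0b10 */
#define SECT_XN       (1u << 4)     /* execute never */
#define SECT_AP_RW    (3u << 10)    /* AP=0b11, as in dwelch's mmu_section */

/* Identity-map `count` 1 MB sections starting at `start` as
 * Strongly ordered, execute-never memory.
 * `table` must point to a 4096-entry first-level table. */
static void map_device_sections(uint32_t *table, uint32_t start, uint32_t count)
{
    for (uint32_t i = 0; i < count; i++) {
        uint32_t addr = (start & 0xFFF00000u) + i * SECTION_SIZE;
        /* TEX, C and B are left at zero: Strongly ordered. */
        table[addr >> 20] = addr | SECT_AP_RW | SECT_XN | SECT_TYPE;
    }
}
```

On a RPi 1 this could be called as `map_device_sections(table, 0x20000000, 16)`, assuming the peripherals occupy 16 MB starting at physical address 0x20000000.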
LdB wrote:
Tue Apr 02, 2019 12:27 am
For the VC4 memory it's optional: non-cached is slower but easier on your primitives. Cached gives you more speed, but if you do things like setpixel and getpixel, you need cache maintenance in your graphics primitives.

I want to have accelerated graphics with the VC so no get/set pixel functions, but that's for later.

Thanks for answering, and I'd appreciate it if you could help me think this through.

Andre

leiradel

Re: MMU Descriptors

Sat Apr 06, 2019 5:51 pm

I'm moving forward with the following setup. All descriptors use this base:

  • Non-secure (NS=1): I don't want to bother with this Secure/Non-secure stuff
  • Global (nG=0): not sure what exactly this means
  • Non-shared (S=0): may need to change this when using multicore CPUs
  • ECC disabled (P=0): not supported in the ARM1176JZF-S anyway
  • Domain 0: I don't think I'll need to bother with domains for what I want to do
TEX, C, and B will be set as below:

  • Write-back for normal RAM
  • Write-through for VideoCore RAM (I'm wondering whether this is necessary at all; I may try to use DMA for transfers to the VC RAM instead)
  • Strongly ordered for the peripherals
  • Noncacheable for a small amount of RAM to use with mailboxes
I'll set APX, AP, and XN as needed.
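The plan above can be written down as four descriptor templates plus a helper that fills in APX, AP, and XN per mapping. This is only a sketch: it assumes 1 MB section descriptors with TEX remap disabled, and every name in it is hypothetical.

```c
#include <stdint.h>

#define DESC_SECTION     0x2u                        /* bits[1:0] = 0b10 */
#define DESC_NS          (1u << 19)                  /* NS=1 */
/* Common base: NS=1, nG=0, S=0, P=0, domain 0. */
#define DESC_BASE        (DESC_NS | DESC_SECTION)

#define DESC_NORMAL_RAM  (DESC_BASE | (1u << 3) | (1u << 2))  /* TEX=000 C=1 B=1: write-back */
#define DESC_VC_RAM      (DESC_BASE | (1u << 3))              /* TEX=000 C=1 B=0: write-through */
#define DESC_PERIPHERAL  (DESC_BASE)                          /* TEX=000 C=0 B=0: strongly ordered */
#define DESC_MAILBOX     (DESC_BASE | (1u << 12))             /* TEX=001 C=0 B=0: noncacheable */

/* Combine a template with a physical base and the per-mapping
 * access bits: APX (bit 15), AP (bits[11:10]) and XN (bit 4). */
static uint32_t make_section(uint32_t padd, uint32_t kind,
                             uint32_t apx, uint32_t ap, uint32_t xn)
{
    return (padd & 0xFFF00000u) | kind |
           ((apx & 1u) << 15) | ((ap & 3u) << 10) | ((xn & 1u) << 4);
}
```

For example, `make_section(0x20000000, DESC_PERIPHERAL, 0, 1, 1)` would describe a privileged-only, execute-never peripheral section, assuming APX=0 with AP=0b01 means privileged read/write and no user access.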

Now I have to write code to set up the descriptors. I'll use supersections, since they use fewer entries in the Micro and Main TLBs, and I also want to use 64 kB large pages and 4 kB small pages where appropriate, to avoid wasting more RAM than needed at the boundaries where attributes change.
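One detail worth keeping in mind for supersections: a supersection covers 16 MB, and its descriptor has to be repeated in 16 consecutive, 16-entry-aligned slots of the first-level table, with bit 18 set. A minimal sketch follows, where the function name, the attribute mask, and the caller-supplied 4096-entry table are assumptions for illustration:

```c
#include <stdint.h>

#define SUPERSECTION_TYPE  ((1u << 18) | 0x2u)  /* bit 18 set, bits[1:0] = 0b10 */
/* Attribute bits kept from the caller: B, C, XN, P, AP, TEX, APX, S, nG, NS.
 * The domain field (bits[8:5]) is deliberately dropped. */
#define SUPERSECTION_ATTRS 0x000BFE1Cu

/* Map one 16 MB supersection. `vadd` and `padd` are truncated to
 * 16 MB alignment; `table` must point to a 4096-entry first-level table. */
static void map_supersection(uint32_t *table, uint32_t vadd,
                             uint32_t padd, uint32_t attrs)
{
    uint32_t desc  = (padd & 0xFF000000u) |
                     (attrs & SUPERSECTION_ATTRS) |
                     SUPERSECTION_TYPE;
    uint32_t first = (vadd >> 20) & 0xFF0u;   /* first of the 16 slots */

    for (uint32_t i = 0; i < 16; i++) {
        table[first + i] = desc;              /* identical copy in every slot */
    }
}
```

The same repetition rule applies to 64 kB large pages in a second-level table, which also take 16 identical consecutive entries.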

Let's see how it goes.
