deater
Posts: 27
Joined: Fri Mar 11, 2016 3:58 pm
Location: 45N

multi-core shared memory communication

Sat Apr 14, 2018 6:39 pm

I am writing bare-metal code (running on a pi3b but in 32-bit mode).

I can successfully unpark the multiple cores and have them run, but I can't seem to get memory writes on one core to appear on the others.

I enable the MMU and caches on all cores. I also set the SMP bit in the auxiliary control register (ACTLR). Do I need to enable it the ARMv8 way instead, with the SMPEN bit in CPUECTLR? If I do that, though, the machine locks up.

I would expect code like the following to work, but the while() loop on core0 never exits:

Code:

#include <stdint.h>

#define NUMCORES       4
/* one boot flag per core, written by that core and polled by core0 */
static volatile uint32_t core_booted[NUMCORES];

/* running on core 1 */
void secondary_boot_c(int core) {
        /* Set up cache and MMU */
        enable_mmu();
        printk("Booted core %d\n",core);
        /* tell core0 that this core is up */
        core_booted[core]=1;
}

/* running on core0, never exits */

/* wait for core1 to finish booting */
while(core_booted[1]==0) ;
 
I've tried adding memory barriers but it doesn't seem to help.
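
For reference, here is roughly what the flag hand-off looks like with the ordering made explicit. This is only a sketch using C11 atomics; the helper names mark_core_booted and core_has_booted are made up for illustration. Note that barriers only order accesses: they do not by themselves make caches coherent, so this cannot fix a setup in which the cores are not coherent to begin with.

```c
#include <stdatomic.h>
#include <stdint.h>

#define NUMCORES 4

/* One boot flag per core; static storage, so all flags start at zero. */
static atomic_uint core_booted[NUMCORES];

/* Called by a secondary core once its own setup is finished (hypothetical helper). */
static inline void mark_core_booted(unsigned core)
{
    /* Release: earlier writes by this core are ordered before the flag store. */
    atomic_store_explicit(&core_booted[core], 1u, memory_order_release);
}

/* Polled by core0; returns nonzero once the given core has set its flag. */
static inline int core_has_booted(unsigned core)
{
    /* Acquire: later reads on this core are ordered after the flag load. */
    return atomic_load_explicit(&core_booted[core], memory_order_acquire) != 0u;
}
```

Core0 would then spin on `while (!core_has_booted(1)) ;` instead of reading the array directly.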

Ultibo
Posts: 130
Joined: Wed Sep 30, 2015 10:29 am
Location: Australia

Re: multi-core shared memory communication

Sun Apr 15, 2018 1:15 am

deater wrote:
Sat Apr 14, 2018 6:39 pm
I can successfully unpark the multiple cores and have them run, but I can't seem to get memory writes on one core to appear on the others.

I enable the MMU and caches on all cores. I also set the SMP bit in the auxiliary control register (ACTLR). Do I need to enable it the ARMv8 way instead, with the SMPEN bit in CPUECTLR? If I do that, though, the machine locks up.
To support cache coherence between cores, you need to ensure that the page table entries have the shared flag set. You don't show the code you are using, but normally this is done by setting the S bit in each entry.

See the ARM Architecture Reference Manual for more information.

deater
Posts: 27
Joined: Fri Mar 11, 2016 3:58 pm
Location: 45N

Re: multi-core shared memory communication

Sun Apr 15, 2018 3:07 pm

Ultibo wrote:
Sun Apr 15, 2018 1:15 am

To support cache coherence between cores, you need to ensure that the page table entries have the shared flag set. You don't show the code you are using, but normally this is done by setting the S bit in each entry.

See the ARM Architecture Reference Manual for more information.
I'm using 0x9040e for the pagetable bits (I'm doing a 1:1 phys/virt mapping with section type pages).

So as far as I can tell I am setting the shared bit. However it's quite possible I have some of the inner/outer TEX/AP/CB settings wrong.
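
For reference, the value 0x9040e can be unpacked with a small decoder. This is a sketch that assumes the ARMv7 short-descriptor section format with TEX remap disabled; the struct and function names are made up for illustration.

```c
#include <stdint.h>

/* Fields of an ARMv7 short-descriptor section entry (1 MiB mapping). */
struct section_attrs {
    unsigned is_section; /* bits [1:0] == 0b10 and bit 18 == 0 */
    unsigned b;          /* bit 2: bufferable */
    unsigned c;          /* bit 3: cacheable */
    unsigned xn;         /* bit 4: execute never */
    unsigned domain;     /* bits [8:5] */
    unsigned ap;         /* AP[2:0]: bit 15 followed by bits [11:10] */
    unsigned tex;        /* bits [14:12] */
    unsigned s;          /* bit 16: shareable */
    unsigned ng;         /* bit 17: not global */
    unsigned ns;         /* bit 19: non-secure */
};

/* Split a first-level section descriptor into its attribute fields. */
static inline struct section_attrs decode_section(uint32_t d)
{
    struct section_attrs a;
    a.is_section = ((d & 0x3u) == 0x2u) && (((d >> 18) & 1u) == 0u);
    a.b      = (d >> 2) & 1u;
    a.c      = (d >> 3) & 1u;
    a.xn     = (d >> 4) & 1u;
    a.domain = (d >> 5) & 0xFu;
    a.ap     = (((d >> 15) & 1u) << 2) | ((d >> 10) & 0x3u);
    a.tex    = (d >> 12) & 0x7u;
    a.s      = (d >> 16) & 1u;
    a.ng     = (d >> 17) & 1u;
    a.ns     = (d >> 19) & 1u;
    return a;
}
```

Under those assumptions, 0x9040e decodes to a section with S=1, NS=1, TEX=0b000, C=1, B=1 (normal memory, write-back, no write-allocate), domain 0 and AP=0b001 (privileged read/write only), so the shared bit is indeed set.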

Schnoogle
Posts: 40
Joined: Sun Feb 11, 2018 4:47 pm

Re: multi-core shared memory communication

Mon Apr 16, 2018 11:03 am

Hi deater,

Well, even with the memory region set to shared and un-cached, I had the same issue: during boot-up of the different cores, the flag indicating that a core is ready was not read properly by the other core. I solved this by clearing/invalidating the cache on the core that updated the flag as well as on the core waiting for the flag, at least once after the MMU has been set up.

If needed, I can share the code bits and MMU settings that got it to work on my RPi3 in 32-bit mode.

By the way, setting the SMPEN flag in the CPUECTLR register on the RPi3 the ARMv8 way also locked up my Pi, and my question to the ARM community has not received any response so far.

BR Schnoogle

deater
Posts: 27
Joined: Fri Mar 11, 2016 3:58 pm
Location: 45N

Re: multi-core shared memory communication

Tue Apr 17, 2018 4:53 am

Schnoogle wrote:
Mon Apr 16, 2018 11:03 am
Well, even with the memory region set to shared and un-cached, I had the same issue: during boot-up of the different cores, the flag indicating that a core is ready was not read properly by the other core. I solved this by clearing/invalidating the cache on the core that updated the flag as well as on the core waiting for the flag, at least once after the MMU has been set up.
Thanks, I was finally able to get core0 to see the writes by the other cores by invalidating the caches as you describe. I had to invalidate the entire L1 dcache; just invalidating the memory addresses of the shared data structure didn't seem to work.
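
One thing worth checking with by-address maintenance is that the range is widened to whole cache lines. The sketch below assumes 64-byte data cache lines (as on the Cortex-A53) and uses the clean-and-invalidate-by-MVA operation; the function name is made up, and the coprocessor instructions are only compiled when building for 32-bit ARM.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64u  /* assumed data cache line size in bytes */

/* Clean and invalidate every data cache line overlapping [start, start+len).
 * Returns the number of cache lines touched. */
static inline unsigned clean_invalidate_range(uintptr_t start, size_t len)
{
    unsigned lines = 0;
    if (len == 0)
        return 0;

    /* Round both ends of the range down to a line boundary. */
    uintptr_t addr = start & ~(uintptr_t)(CACHE_LINE - 1u);
    uintptr_t last = (start + len - 1u) & ~(uintptr_t)(CACHE_LINE - 1u);

    for (;;) {
#if defined(__arm__)
        /* DCCIMVAC: clean and invalidate data cache line by MVA to PoC */
        __asm__ volatile("mcr p15, 0, %0, c7, c14, 1" :: "r"(addr) : "memory");
#endif
        lines++;
        if (addr == last)
            break;
        addr += CACHE_LINE;
    }
#if defined(__arm__)
    /* Wait for the maintenance operations to complete. */
    __asm__ volatile("dsb" ::: "memory");
#endif
    return lines;
}
```

A call such as `clean_invalidate_range((uintptr_t)core_booted, sizeof(core_booted))` would cover the flag array even if it straddles a line boundary.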

deater
Posts: 27
Joined: Fri Mar 11, 2016 3:58 pm
Location: 45N

Re: multi-core shared memory communication

Thu Apr 19, 2018 5:20 pm

deater wrote:
Tue Apr 17, 2018 4:53 am
Thanks, I was finally able to get core0 to see the writes by the other cores by invalidating the caches as you describe. I had to invalidate the entire L1 dcache; just invalidating the memory addresses of the shared data structure didn't seem to work.
I take it back: flushing the data cache only seems to work *most* of the time. I'm still trying to track this down. It's a shame the ARM documentation isn't clearer on this topic.

I tried to find out what Linux does, but I couldn't even find the place where Linux sets the SMP bit in the ACTLR register (I did find in the header files where it disables it at times).

Also, despite the ARM documentation talking about the SCU (Snoop Control Unit), I get the impression that we shouldn't have to mess with that to get things working?

rst
Posts: 303
Joined: Sat Apr 20, 2013 6:42 pm
Location: Germany

Re: multi-core shared memory communication

Thu Apr 19, 2018 6:09 pm

The SMP bit is already set by the ARM stub, before your code is started, here:

https://github.com/raspberrypi/tools/bl ... tub7.S#L69

You do not need to touch it. There is only one exception: if you use the "kernel_old=1" setting in config.txt.

I'm sure you do not need to do cache maintenance (clean or invalidate) to share data between cores. You can write to a memory location from one core, and the write should be observable from another core immediately. Anything else would be much too slow. The SCU is enabled by the SMP bit. No further action is needed.

I don't know why this is not working for you, but perhaps this helps you find out. Maybe there is something wrong in your translation table (e.g. the S bit must be set) or in some other part of the MMU configuration.

deater
Posts: 27
Joined: Fri Mar 11, 2016 3:58 pm
Location: 45N

Re: multi-core shared memory communication

Fri Apr 20, 2018 11:14 pm

rst wrote:
Thu Apr 19, 2018 6:09 pm
The SMP bit is already set by the ARM stub, before your code is started, here:

https://github.com/raspberrypi/tools/bl ... tub7.S#L69

You do not need to touch it. There is only one exception: if you use the "kernel_old=1" setting in config.txt.
I don't have any kernel_old setting in config.txt

Unless I am missing something, this can't be the stub that's booting on my machine as the stub turns on the I/D caches and on my machine they are definitely disabled at boot. Also I'm pretty sure I read out and test for the SMP bit and it isn't set.

Might this be a secure/non-secure issue? It looks like the stub switches to non-secure mode after setting the cache/SMP settings.

I am running a fairly recent copy of the firmware (though not the most recent as it doesn't have 3b+ support)

deater
Posts: 27
Joined: Fri Mar 11, 2016 3:58 pm
Location: 45N

Re: multi-core shared memory communication

Sat Apr 21, 2018 12:45 am

deater wrote:
Fri Apr 20, 2018 11:14 pm
Unless I am missing something, this can't be the stub that's booting on my machine as the stub turns on the I/D caches and on my machine they are definitely disabled at boot. Also I'm pretty sure I read out and test for the SMP bit and it isn't set.
I think I found the issue. And indeed, the stub you linked to must not be what's on my card. It was a great hint, though.

I think all of the cores are booting in hypervisor mode, and I'm only disabling it on core0.

That might also explain why setting SMPEN locked up the machine: it was probably trapping to the hypervisor.
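
A quick way to confirm which mode a core is in is to look at the low five bits of CPSR, where 0x1A means HYP and 0x13 means SVC. The helpers below are a sketch with made-up names; the register read is only compiled when building for 32-bit ARM.

```c
#include <stdint.h>

#define CPSR_MODE_MASK 0x1Fu
#define CPSR_MODE_SVC  0x13u
#define CPSR_MODE_HYP  0x1Au

/* Extract the processor mode field M[4:0] from a CPSR value. */
static inline unsigned cpsr_mode(uint32_t cpsr)
{
    return cpsr & CPSR_MODE_MASK;
}

/* Nonzero if the CPSR value describes a core running in HYP mode. */
static inline int cpsr_is_hyp(uint32_t cpsr)
{
    return cpsr_mode(cpsr) == CPSR_MODE_HYP;
}

#if defined(__arm__)
/* Read the current program status register of the calling core. */
static inline uint32_t read_cpsr(void)
{
    uint32_t v;
    __asm__ volatile("mrs %0, cpsr" : "=r"(v));
    return v;
}
#endif
```

Printing `cpsr_is_hyp(read_cpsr())` early on every core, not just core0, would show whether the secondaries are still in HYP mode.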

rst
Posts: 303
Joined: Sat Apr 20, 2013 6:42 pm
Location: Germany

Re: multi-core shared memory communication

Sat Apr 21, 2018 4:40 am

It's working now. That's good. Yes, all cores are initially in HYP mode.

I have compared the referenced armstub7.S source code with the stub that is actually loaded on my RPi 3B with firmware from Apr 9. I'm sure it's the same here.

Setting the data cache enable bit in the control register has no effect without enabling the MMU. I read back the CPU Extended Control Register here earlier. The SMP bit was set without my having touched it.

deater
Posts: 27
Joined: Fri Mar 11, 2016 3:58 pm
Location: 45N

Re: multi-core shared memory communication

Sat Apr 21, 2018 1:19 pm

rst wrote:
Sat Apr 21, 2018 4:40 am

I have compared the referenced armstub7.S source code with the stub that is actually loaded on my RPi 3B with firmware from Apr 9. I'm sure it's the same here.

Setting the data cache enable bit in the control register has no effect without enabling the MMU. I read back the CPU Extended Control Register here earlier. The SMP bit was set without my having touched it.
Yes, I was wrong about SMPEN not being set; it does seem to be set.

However I am pretty sure that my SCTLR register is coming up as 0xc50838 which has the caches disabled.
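
The SCTLR value can be checked bit by bit. This is a small sketch with made-up helper names; it only looks at the M (bit 0), C (bit 2) and I (bit 12) enables, and the register read is only compiled when building for 32-bit ARM.

```c
#include <stdint.h>

#define SCTLR_M (1u << 0)   /* MMU enable */
#define SCTLR_C (1u << 2)   /* data/unified cache enable */
#define SCTLR_I (1u << 12)  /* instruction cache enable */

static inline int sctlr_mmu_on(uint32_t s)    { return (s & SCTLR_M) != 0u; }
static inline int sctlr_dcache_on(uint32_t s) { return (s & SCTLR_C) != 0u; }
static inline int sctlr_icache_on(uint32_t s) { return (s & SCTLR_I) != 0u; }

#if defined(__arm__)
/* Read the System Control Register of the calling core. */
static inline uint32_t read_sctlr(void)
{
    uint32_t v;
    __asm__ volatile("mrc p15, 0, %0, c1, c0, 0" : "=r"(v));
    return v;
}
#endif
```

For 0xc50838 all three helpers return 0, i.e. MMU, data cache and instruction cache are off at the point where the value was read.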

Is there a way to see the stub in the firmware by disassembling start.elf or similar? Or is the only way to write a bootloader that dumps it out?

deater
Posts: 27
Joined: Fri Mar 11, 2016 3:58 pm
Location: 45N

Re: multi-core shared memory communication

Sat Apr 21, 2018 1:38 pm

deater wrote:
Sat Apr 21, 2018 1:19 pm

However I am pretty sure that my SCTLR register is coming up as 0xc50838 which has the caches disabled.
Well, it looks like I am wrong again. I forgot that, while trying to get things working, I had implemented the boot ordering described in ARM's "Bare-metal Boot Code for ARMv8-A Processors" document, which has you disable the caches before setting up the page table. And I wasn't printing SCTLR until after that.

rst
Posts: 303
Joined: Sat Apr 20, 2013 6:42 pm
Location: Germany

Re: multi-core shared memory communication

Sun Apr 22, 2018 8:12 am

deater wrote:
Sat Apr 21, 2018 1:19 pm
Is there a way to see the stub in the firmware by disassembling start.elf or similar? Or is the only way to write a bootloader that dumps it out?
I used a program that copies the firmware from address 0x0 and writes it out to the SD card. You can also search for it in the start.elf binary, using some bytes that you know are inside it.
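
The byte search can be done with a few lines of code once start.elf has been read into memory. This is a generic sketch, not the program mentioned above; find_bytes is a made-up name, and loading the file and choosing the byte pattern are left to the caller.

```c
#include <stddef.h>
#include <string.h>

/* Return the offset of the first occurrence of needle in hay, or -1 if
 * the pattern does not occur (or is empty or longer than the buffer). */
static inline long find_bytes(const unsigned char *hay, size_t hay_len,
                              const unsigned char *needle, size_t needle_len)
{
    if (needle_len == 0 || needle_len > hay_len)
        return -1;
    for (size_t i = 0; i + needle_len <= hay_len; i++) {
        if (memcmp(hay + i, needle, needle_len) == 0)
            return (long)i;
    }
    return -1;
}
```

The pattern would be a handful of instruction words taken from the memory dump at 0x0; the returned offset then marks where the stub sits inside the file.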

ChrisYin
Posts: 1
Joined: Tue May 15, 2018 9:03 pm

Re: multi-core shared memory communication

Tue May 15, 2018 9:14 pm

Hi, deater

I'm also working on bare-metal code on the Pi3.

I've successfully booted four cores, but only two of them can enable their MMU successfully (one BSP and one AP).

It's really strange. Why don't the other cores work with the same code?

I'm just curious about how to initialize the MMU on multiple cores. Do they share the same page table?

Schnoogle
Posts: 40
Joined: Sun Feb 11, 2018 4:47 pm

Re: multi-core shared memory communication

Tue Jun 05, 2018 6:53 pm

Hi ChrisYin,

Well, as far as I'm aware, the four cores share the same memory, so they share the same page table. I have all 4 cores running with the MMU active. The most important thing, from my point of view, is to ensure that the cores are brought up one after the other.
So I introduced a core-ready counter, which each core reads before starting its initialization.

The initial activities on each core are:
1. get the current core ID
2. based on the core ID, calculate and set the initial stack pointer (the core is in HYP mode at start-up)
3. disable caches
4. invalidate the MMU TLB, iCache, branch predictor cache and the whole D-cache
5. get the current core-ready counter. If it has the same value as the current core ID, continue; else wait and check again
6. now do the whole initialization: switch to the right mode, set up stack pointers in the new mode, etc.
7. set up the MMU, enable caches, etc.
8. increase the core-ready counter to kick off the next core
9. jump to the C function dedicated to this core, to idle there or pick up a dedicated task
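
Steps 5 and 8 of the list above can be sketched in C as follows. This is not my actual code, just an illustration with made-up names, and it assumes the counter is in memory that all cores see coherently. Only the core whose turn it is ever writes the counter, so a plain store is used rather than an atomic read-modify-write.

```c
#include <stdatomic.h>

/* Hypothetical shared counter: ID of the core currently allowed to initialize. */
static atomic_uint core_ready_count;   /* zero-initialized: core 0 goes first */

/* Step 5: a core may proceed only when the counter equals its own ID. */
static inline int my_turn(unsigned core_id)
{
    return atomic_load_explicit(&core_ready_count,
                                memory_order_acquire) == core_id;
}

/* Step 8: hand over to the next core. Only the core whose turn it is
 * writes the counter, so a plain release store is enough here. */
static inline void release_next_core(unsigned core_id)
{
    atomic_store_explicit(&core_ready_count, core_id + 1u,
                          memory_order_release);
}

/* Steps 5 to 8 for one core; init stands in for the core's own setup. */
static inline void boot_in_turn(unsigned core_id, void (*init)(unsigned))
{
    while (!my_turn(core_id))
        ;                       /* busy-wait until it is this core's turn */
    init(core_id);
    release_next_core(core_id);
}
```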


Hope this helps ...
BR,
Schnoogle

AlfredJingle
Posts: 67
Joined: Thu Mar 03, 2016 10:43 pm

Re: multi-core shared memory communication

Wed Jun 13, 2018 12:14 pm

Hi ChrisYin

The page table location is set per core, so the page table does not have to be the same for all cores. This can be used, for instance, to make sure that one core cannot see or overwrite the code of another core, by having the different page tables point to different parts of physical memory. Obviously, using one shared page table is easier and less work.

It is possible to start all 4 cores at once, but the way Schnoogle describes it is easier. If you want to start all cores at once, you have to meticulously make sure that the resources of the different cores are kept separate at all times, for instance that during exceptions like IRQ, FIQ, MON and SVC the stack pointers are always core-specific and non-overlapping.
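
One simple way to keep the stacks apart is to give every core/mode pair its own fixed slot in one region. The layout below is only a sketch: the slot size, the mode list and the function name are assumptions for illustration.

```c
#include <stdint.h>

#define STACK_SIZE 0x1000u   /* hypothetical: 4 KiB per stack */

/* Modes that each get their own banked stack pointer in this sketch. */
enum stack_mode { STACK_SVC, STACK_IRQ, STACK_FIQ, STACK_MON, NUM_STACK_MODES };

/* Initial stack pointer (top of a full-descending stack) for one core and
 * mode. Each (core, mode) pair owns exactly one STACK_SIZE slot above base. */
static inline uintptr_t stack_top(uintptr_t base, unsigned core,
                                  enum stack_mode mode)
{
    unsigned slot = core * (unsigned)NUM_STACK_MODES + (unsigned)mode;
    return base + (uintptr_t)(slot + 1u) * STACK_SIZE;
}
```

Each core would load `stack_top(base, core_id, mode)` into SP after switching into the corresponding mode, so no two stacks can overlap as long as none grows beyond STACK_SIZE.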
