Sonny05
Posts: 22
Joined: Wed Jun 24, 2015 4:53 pm

Re: Trying Bare Metal on Raspberry Pi 2

Tue Aug 11, 2015 12:29 am

Hi all,

How can I detect whether the SD card has been inserted or removed?

Sonny05
Posts: 22
Joined: Wed Jun 24, 2015 4:53 pm

Re: Trying Bare Metal on Raspberry Pi 2

Wed Aug 12, 2015 10:45 pm

I tried reading GPIO 47, but it does not work: it returns the same value whether the card is inserted or removed.

rpdom
Posts: 15453
Joined: Sun May 06, 2012 5:17 am
Location: Chelmsford, Essex, UK

Re: Trying Bare Metal on Raspberry Pi 2

Thu Aug 13, 2015 6:46 am

There isn't a physical card detect switch on the Pi 2. Only the original Pi design with the full-sized SD card had one of those.

GPIO 47 is connected to the ACT LED.

The easiest way to detect if a card is present is to send it a command and see if it replies. Just a simple status request should do it.

krom
Posts: 61
Joined: Wed Dec 05, 2012 9:12 am

Re: Trying Bare Metal on Raspberry Pi 2

Wed Dec 16, 2015 6:10 am

Hi, I just got all my Raspberry Pi 2 demos working again with the newest firmware using help from this forum post:
viewtopic.php?f=72&t=121993

I needed to convert the returned mailbox framebuffer BUS address to an ARM physical address to get GFX on the screen again in my demos.
https://github.com/PeterLemon/RaspberryPi
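
A minimal sketch of that bus-to-ARM address conversion, assuming the usual VideoCore convention that the top two bits of a bus address only select a cache alias and the remaining 30 bits are the address as the ARM sees it. The helper name and the mask constant are illustrative and are not taken from the demos linked above.

```c
#include <stdint.h>

/* Assumption: bits 31:30 of a VideoCore bus address select a cache alias;
 * clearing them leaves the ARM physical address. */
#define BUS_ALIAS_MASK 0xC0000000u

/* Hypothetical helper: turn the framebuffer bus address returned by the
 * mailbox into an address the ARM can write pixels to. */
static inline uint32_t bus_to_arm_phys(uint32_t bus_addr)
{
    return bus_addr & ~BUS_ALIAS_MASK;
}
```

An address that already has the top two bits clear passes through unchanged, so applying the helper to older firmware results should be harmless.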

As rst states in the forum post linked above, the newest firmware now starts the CPU in HYP (hypervisor) mode.
Even though I can still use SMP & turn on NEON instructions, I cannot seem to set up the CPU to use the L1 cache, branch prediction, or the instruction caches.
I think I need to switch it back to SVC mode to get my demos back up to the same speed as with the pre-October 2nd firmware...

I am going to read some ARM documentation to try to do this, but if anyone knows how to get into SVC mode, I'd love to know =D

Greets to Dex & rst

rst
Posts: 411
Joined: Sat Apr 20, 2013 6:42 pm
Location: Germany

Re: Trying Bare Metal on Raspberry Pi 2

Wed Dec 16, 2015 7:31 am

krom wrote: I am going to read some ARM documentation to try to do this, but if anyone knows how to get into SVC mode, I'd love to know =D
Hi krom,

This macro does it. "__MSR_ELR_HYP(14)" and "__ERET" emit the instructions "msr ELR_hyp, r14" and "eret" respectively, which are not supported in every assembler mode. THUMB() can be ignored.

Greets
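
The mode arithmetic behind such a switch can be sketched in C, assuming the standard ARMv7 CPSR layout (mode field in bits 4:0, 0x1A for HYP, 0x13 for SVC). The helper names are hypothetical. On real hardware the computed value still has to be written to SPSR and applied with "eret" when leaving HYP mode, using the instructions quoted above; this sketch only shows which bits change.

```c
#include <stdbool.h>
#include <stdint.h>

#define PSR_MODE_MASK 0x1Fu
#define PSR_MODE_SVC  0x13u
#define PSR_MODE_HYP  0x1Au
#define PSR_F_BIT     (1u << 6)   /* FIQ masked */
#define PSR_I_BIT     (1u << 7)   /* IRQ masked */
#define PSR_A_BIT     (1u << 8)   /* asynchronous aborts masked */

/* True if the given CPSR value says the core is in HYP mode. */
static bool cpsr_is_hyp(uint32_t cpsr)
{
    return (cpsr & PSR_MODE_MASK) == PSR_MODE_HYP;
}

/* PSR value for SVC mode with IRQ and FIQ masked. When the core is
 * currently in HYP mode, asynchronous aborts are masked as well. */
static uint32_t svc_target_psr(uint32_t cpsr)
{
    uint32_t psr = (cpsr & ~PSR_MODE_MASK)
                 | PSR_I_BIT | PSR_F_BIT | PSR_MODE_SVC;
    if (cpsr_is_hyp(cpsr))
        psr |= PSR_A_BIT;
    return psr;
}
```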

vsiles
Posts: 41
Joined: Wed Feb 04, 2015 10:04 am

Re: Trying Bare Metal on Raspberry Pi 2

Wed Dec 16, 2015 8:12 am

By the way, I recently checked the new firmware and found out that we now start in HYP mode (yes, I'm a bit late on this one...).
Does anyone know if it is the only difference? Is the new "bootcode" (the one we can replace using the kernel_old option) available somewhere?

Best,
V.

krom
Posts: 61
Joined: Wed Dec 05, 2012 9:12 am

Re: Trying Bare Metal on Raspberry Pi 2

Thu Dec 17, 2015 2:58 pm

Hi rst thanks for the help, this'll def help with getting out of HYP mode =D

I have been experimenting with the config.txt options:
kernel_old=1 (To avoid prepending the boot code.)
disable_commandline_tags=1 (To avoid populating the ATAGs.)

I found out the info for this from this post:
viewtopic.php?f=72&t=102261

This has the effect of not putting the Raspberry Pi 2 into HYP mode in the 1st place!
(So all my demos run full speed using this boot process, with the newest firmware files.)
This makes all code start from ORG $0000 instead of ORG $8000.
Also it makes all the SMP cores start from ORG $0000.
(So there is no need to know those weird offsets that the normal boot process uses for the extra SMP cores.)

I may convert all my RPi/RPi2 code to use this way of doing stuff, as it is very neat for RPi2 SMP baremetal coding.
It would save $8000 bytes of mem, & probably make my demos work with any future firmware changes.

Anyway hope you guys are having fun with your Raspberry Pi's =D
I have lots more to do for the V3D/GPU stuff, and once I get the Z-Buffer working, it will be easy to make a simple baremetal 3D Engine for the GPU!

krom
Posts: 61
Joined: Wed Dec 05, 2012 9:12 am

Re: Trying Bare Metal on Raspberry Pi 2

Fri Dec 18, 2015 5:23 am

I have redone all my demos to run full speed on the newest firmware files, fixing the HYP mode problem on RPi2.
https://github.com/PeterLemon/RaspberryPi

I decided to use the config.txt options as I mentioned in the above post:
kernel_old=1 (To avoid prepending the boot code.)
disable_commandline_tags=1 (To avoid populating the ATAGs.)

All my demos start at ORG $0000 now instead of ORG $8000
And as all RPi2 SMP CPU cores start at ORG $0000, I check for their CPU ID, and do whatever work is needed for them.

When using this boot method the CPU never gets set to HYP mode, so all my demos run full speed now.
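
A rough sketch of the CPU ID check described above, assuming the core number is held in the low two bits of MPIDR on the quad-core Cortex-A7. The function names are illustrative and not taken from the demos. The register read is only compiled for 32-bit ARM targets; the bit extraction is plain C.

```c
#include <stdint.h>

/* Assumption: affinity level 0 (bits 1:0 of MPIDR) is the core number 0..3. */
static inline uint32_t core_id_from_mpidr(uint32_t mpidr)
{
    return mpidr & 0x3u;
}

#if defined(__arm__)
/* Read MPIDR (CP15 c0, c0, 5) on the core that executes this. */
static inline uint32_t read_mpidr(void)
{
    uint32_t value;
    __asm__ volatile("mrc p15,0,%0,c0,c0,5" : "=r"(value));
    return value;
}
#endif
```

With every core entering at the same address, the entry code would call core_id_from_mpidr(read_mpidr()) first and branch to per-core work (including a per-core stack) based on the result.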

Sonny05
Posts: 22
Joined: Wed Jun 24, 2015 4:53 pm

Re: Trying Bare Metal on Raspberry Pi 2

Thu Jan 21, 2016 8:25 pm

I use this demo
https://github.com/PeterLemon/Raspberry ... ernel7.asm

I draw two triangles that have the same x and y coordinates but different z coordinates. Only the last triangle is visible when the z coordinates are in the range <0,1>; the previous triangle is painted over. How can I make the z-axis take effect as well?

Thanks for any advice.

krom
Posts: 61
Joined: Wed Dec 05, 2012 9:12 am

Re: Trying Bare Metal on Raspberry Pi 2

Fri Jan 22, 2016 12:19 pm

Hi Sonny05,

I got all my reference material for the Broadcom V3D GPU from this doc:
https://www.broadcom.com/docs/support/v ... G100-R.pdf

Hopefully you can see how I have made all my V3D demos from the information in that doc.

I still have not been able to set up the Z-Buffer correctly on this device, but it is possible...
I will try to make a Z-Buffer demo in the future, but you can try it out for yourself =D

Hope this helps, good luck.

/krom

arjunhary
Posts: 10
Joined: Tue Aug 25, 2015 3:03 am

Re: Trying Bare Metal on Raspberry Pi 2

Wed Sep 14, 2016 3:09 am

Hi,

I am trying to get the MMU working with multiple cores but have a few basic questions. I am following the examples posted by mrvn on his GitHub site. I have the MMU working on a single core, and multi-core boot also works, but I am running into some issues with the MMU on multiple cores.

1) Very basic question: is there an MMU per core? I have looked through all the documents for this answer. It looks like the Cortex-A9 has an MMU per core, but I can't figure this out for the Cortex-A7.
2) In case there is not, which is what mrvn's code looks like, what is the multi-core sequence? I assume this is the high-level sequence. Please correct it if it is wrong:
a) Set up the page table.
b) Turn on the MMU and the I and D caches for core 0.
c) Switch to the other cores and just turn on the MMU and the I and D caches. When I do this I see the IFSR and DFSR printed, which makes me think the page tables are not correct.
3) Also, OpenOCD says an L2 cache is present but not supported. Does turning on the MMU in all cores turn on the L2 cache by default?

I am attaching the output from OpenOCD for reference.

Warn : Using DEPRECATED interface driver 'ft2232'
Info : Consider using the 'ftdi' interface driver, with configuration files in interface/ftdi/...
Info : device: 6 "2232H"
Info : deviceID: 67330064
Info : SerialNumber: FTXRQP01A
Info : Description: FT2232H MiniModule A
Info : max TCK change to: 30000 kHz
Info : clock speed 2000 kHz
Info : JTAG tap: bcmrpi2.dap tap/device found: 0x4ba00477 (mfg: 0x23b, part: 0xba00, ver: 0x4)
Info : bcmrpi2.cpu0: hardware has 6 breakpoints, 4 watchpoints
Info : bcmrpi2.cpu1: hardware has 6 breakpoints, 4 watchpoints
Info : bcmrpi2.cpu2: hardware has 6 breakpoints, 4 watchpoints
Info : bcmrpi2.cpu3: hardware has 6 breakpoints, 4 watchpoints
Info : accepting 'gdb' connection on tcp/3333
Info : ttbcr 0ttbr0 7c675fc3ttbr1 fc97ede8
Info : bcmrpi2.cpu3 rev 5, partnum c07, arch f, variant 0, implementor 41
Info : number of cache level 2
Error: cache l2 present :not supported
Info : bcmrpi2.cpu3 cluster f core 3 multi core
target state: halted
target halted in ARM state due to debug-request, current mode: Supervisor
cpsr: 0x200001d3 pc: 0x00008460
MMU: disabled, D-Cache: disabled, I-Cache: disabled
Info : ttbcr 0ttbr0 1807attbr1 db035a8a
Info : bcmrpi2.cpu0 rev 5, partnum c07, arch f, variant 0, implementor 41
Info : number of cache level 2
Error: cache l2 present :not supported
Info : bcmrpi2.cpu0 cluster f core 0 multi core
target state: halted
target halted in ARM state due to debug-request, current mode: Supervisor
cpsr: 0x800001d3 pc: 0x00008460
MMU: enabled, D-Cache: enabled, I-Cache: enabled
Info : ttbcr 0ttbr0 adfef7ebttbr1 73521fc1
Info : bcmrpi2.cpu1 rev 5, partnum c07, arch f, variant 0, implementor 41
Info : number of cache level 2
Error: cache l2 present :not supported
Info : bcmrpi2.cpu1 cluster f core 1 multi core
target state: halted
target halted in ARM state due to debug-request, current mode: Supervisor
cpsr: 0x800001d3 pc: 0x00008460
MMU: disabled, D-Cache: disabled, I-Cache: disabled
Info : ttbcr 0ttbr0 7d52a632ttbr1 d7c53bd3
Info : bcmrpi2.cpu2 rev 5, partnum c07, arch f, variant 0, implementor 41
Info : number of cache level 2
Error: cache l2 present :not supported
Info : bcmrpi2.cpu2 cluster f core 2 multi core
target state: halted
target halted in ARM state due to debug-request, current mode: Supervisor
cpsr: 0x800001d3 pc: 0x00008460
MMU: disabled, D-Cache: disabled, I-Cache: disabled

vsiles
Posts: 41
Joined: Wed Feb 04, 2015 10:04 am

Re: Trying Bare Metal on Raspberry Pi 2

Wed Sep 14, 2016 5:35 am

Quick reply (it needs more work, but I don't have much time):
- yes you have one MMU per core
- if at some point you need to "share" data between a core with MMU on and a core with MMU off, don't forget to clean the caches first
- don't forget to set the ACTLR.SMP bit to activate cache coherency

arjunhary
Posts: 10
Joined: Tue Aug 25, 2015 3:03 am

Re: Trying Bare Metal on Raspberry Pi 2

Wed Sep 14, 2016 6:33 pm

1) I checked the auxiliary control register and the SMP bit (bit 6) is on by default.
2) Since there is an MMU per core, I assume there is a separate page table per core. So the MMU init sequence has to be called on each core, correct?

vsiles
Posts: 41
Joined: Wed Feb 04, 2015 10:04 am

Re: Trying Bare Metal on Raspberry Pi 2

Thu Sep 15, 2016 5:41 am

The registers SCTLR, TTBR0, TTBR1, TTBCR, and ACTLR are all "per-core registers", so you have to perform the initialization procedure on each core (don't forget to invalidate the I/D caches & TLB before enabling them). However, if you want to, you can point the TTBRx of every core at the same tables, sharing the page tables (be careful with critical sections when different cores are trying to access the same data).
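
As a sketch of the shared-table option, one first-level table can be built once and its address loaded into TTBR0 on every core. The code below assumes the ARMv7 short-descriptor format with 1 MB sections and TTBCR.N = 0 (4096 entries, 16 KB alignment). The function name, the table name, and the idea of passing a separate flag word for a device-memory window are assumptions for illustration; the actual flag values are left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define L1_ENTRIES    4096u   /* 4096 sections of 1 MB cover 4 GB */
#define SECTION_SHIFT 20u

/* Fill a first-level table with an identity map of 1 MB sections.
 * Sections dev_first..dev_last (inclusive) get dev_flags, for example a
 * peripheral window; every other section gets normal_flags. */
static void build_identity_l1(uint32_t *table,
                              uint32_t normal_flags, uint32_t dev_flags,
                              uint32_t dev_first, uint32_t dev_last)
{
    assert(table != NULL);
    for (uint32_t i = 0; i < L1_ENTRIES; i++) {
        uint32_t flags = (i >= dev_first && i <= dev_last) ? dev_flags
                                                           : normal_flags;
        table[i] = (i << SECTION_SHIFT) | flags;
    }
}

/* One table shared by all cores, aligned as TTBR0 requires for N = 0. */
static uint32_t shared_l1_table[L1_ENTRIES] __attribute__((aligned(16384)));
```

The table only needs to be filled once, by one core, before any other core enables its MMU; each core then still runs its own invalidate and enable sequence.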

arjunhary
Posts: 10
Joined: Tue Aug 25, 2015 3:03 am

Re: Trying Bare Metal on Raspberry Pi 2

Fri Sep 16, 2016 4:03 am

1) I got the MMU working on multiple cores. My mistake was not setting up the stack correctly. Once I set the stack correctly for all the cores, I could see that the MMU and caches were enabled. I also learnt that declaring a variable as volatile makes it uncacheable, or at least that is the behavior I see. I have a global variable for the core index; each core keeps checking for this index to become equal to its core ID, then prints something on the UART and increments it. Even with the caches on, the code keeps working. My understanding of global variables was quite the opposite. Am I missing something here?

Also quickly a shoutout to this forum for helping people like me learn a lot about baremetal programming and the HW itself. Now on to testing LDREX and STREX.

vsiles
Posts: 41
Joined: Wed Feb 04, 2015 10:04 am

Re: Trying Bare Metal on Raspberry Pi 2

Fri Sep 16, 2016 5:38 am

If you want to implement synchronization between cores, you should try to find a simple implementation of a spinlock (e.g. Linux) to do things right. Volatile does not mean uncacheable. Volatile tells the compiler that the variable might be modified by an external source it doesn't know about, so it won't make any assumptions about its value. For example:

Code: Select all

#include <stdio.h>

int main(void)
{
  volatile unsigned int x = 0; /* every read of x must really happen */
  unsigned int y = 0;          /* value is known at compile time */
  if (x == 1)
    printf("foo\n");
  if (y == 1)
    printf("bar\n");
  return 0;
}

In this case, the compiler is allowed to remove the second if completely, because it knows that this branch is dead code. However, since x is volatile, something outside the compiler's view (another core, for example) might change its value before the test, so the test is legit and must not be removed.

If you want uncacheable data, you should request it with the correct MMU flags in the page table.
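
For the spinlock suggestion, a minimal sketch using C11 atomics is shown here, assuming a toolchain that provides <stdatomic.h>. The type and function names are illustrative and this is not the Linux implementation. On ARMv7 the compiler typically lowers the test-and-set to an LDREX/STREX loop; whether the exclusive monitors behave as expected depends on the memory attributes of the page holding the lock, which is an assumption to verify on the target.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_flag locked; } core_lock_t;
#define CORE_LOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once. test-and-set returns the previous value, so "false" means
 * the flag was clear and this caller now owns the lock. */
static inline bool core_trylock(core_lock_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked,
                                              memory_order_acquire);
}

/* Spin until the lock is acquired. */
static inline void core_lock(core_lock_t *l)
{
    while (!core_trylock(l)) {
        /* busy-wait */
    }
}

/* Release the lock; earlier writes become visible to the next owner. */
static inline void core_unlock(core_lock_t *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Example instance: serialize UART output between cores. */
static core_lock_t uart_lock = CORE_LOCK_INIT;
```

A core would wrap its UART print in core_lock(&uart_lock) and core_unlock(&uart_lock) instead of polling a shared index variable.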

arjunhary
Posts: 10
Joined: Tue Aug 25, 2015 3:03 am

Re: Trying Bare Metal on Raspberry Pi 2

Fri Sep 16, 2016 2:35 pm

I am sorry, I did not phrase the last question correctly. It was not about core synchronization or volatile. Let me give this another shot. My question was about caching of variables and where the variables go.

1) Previously I had the cache set up in write-through mode (page table flags were set to 0x1540A). I have a global variable. It is NOT declared volatile. When one core updated it, the other cores were able to see the change even though I did not do any cache flush or invalidate. This makes sense, since the cache is write-through and updates go to RAM.
2) I changed the cache to write-back (page table flags were set to 0x1140E). I updated the global variable and the other cores were still able to see the change. At first this does not make sense, since the update goes only to the L1 cache and the other cores should not see it. But reading the MPCore technical document, I see that there is a snoop control unit, and I read this line:

"Cortex-A7 MPCore processor supports between one and four individual processors with L1 data cache coherency maintained by the SCU."

I can see a need to flush caches when using DMA, but for variable consistency, is there a need to flush and invalidate caches?

Also, on the performance improvement with caches: I am doing three 64K memory compares, looped 5 times. I see a huge performance improvement with the caches. From write-through to write-back, the time drops from 48 ms to 1.5 ms (the function has UART prints in between too).
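
The two flag words quoted above can be picked apart with a small decoder. This is a sketch that assumes they are ARMv7 short-descriptor section entries (low bits 0b10) and that TEX remap is disabled (SCTLR.TRE = 0); the struct and function names are made up for illustration.

```c
#include <stdint.h>

/* Memory-attribute fields of a short-descriptor section entry. */
typedef struct {
    unsigned b;    /* bit 2 */
    unsigned c;    /* bit 3 */
    unsigned ap;   /* bits 11:10, AP[1:0] */
    unsigned tex;  /* bits 14:12 */
    unsigned s;    /* bit 16, shareable */
} section_attrs_t;

static section_attrs_t decode_section_flags(uint32_t flags)
{
    section_attrs_t a;
    a.b   = (flags >> 2)  & 0x1u;
    a.c   = (flags >> 3)  & 0x1u;
    a.ap  = (flags >> 10) & 0x3u;
    a.tex = (flags >> 12) & 0x7u;
    a.s   = (flags >> 16) & 0x1u;
    return a;
}
```

Under those assumptions, 0x1540A decodes to TEX=0b101, C=1, B=0 (inner write-through, outer write-back write-allocate) and 0x1140E to TEX=0b001, C=1, B=1 (inner and outer write-back write-allocate), both shareable with AP=0b01.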

vsiles
Posts: 41
Joined: Wed Feb 04, 2015 10:04 am

Re: Trying Bare Metal on Raspberry Pi 2

Fri Sep 16, 2016 3:09 pm

Yes, the A7 has an L2 cache / SCU, so cache coherency between the cores is automatic. Whether you use WB or WT, the data is still cached, so coherency applies either way. You only have to clean/flush when you enable/disable the cache, or when you work with some device outside the cores, like DMA.
If you only want to share data between the cores, the L2 cache / SCU will do it for you automatically.

WB / WT is just a matter of choice for your caching policy; it won't affect that.

madotuki
Posts: 4
Joined: Sat Feb 11, 2017 3:04 am

Re: Trying Bare Metal on Raspberry Pi 2

Sat Feb 11, 2017 3:25 am

Hi,

I'm trying to get the MMU working on multiple cores, but the cores hang when they write 1 to SCTLR.M. :oops:

The primary core changes the mode from HYP to SVC, sets the vectors to 0x00000000, sets sp for each mode (0x00100000 ~ 0x001FFFFC), and jumps to the system initialization process.

The primary core sets up the UART and the translation tables. Then it disables the caches, initializes TTBR, and enables the MMU. This is working.
After that, the primary core wakes up the secondary cores.

The secondary cores change the mode and set the sp correctly. They also set up and enable the MMU, but they hang.
Does anyone know the cause?
Any help would be much appreciated :)

The code is here (boot.s, init.c, mmu.c):
https://github.com/madotuki/rpi2-baremetal

LdB
Posts: 1286
Joined: Wed Dec 07, 2016 2:29 pm

Re: Trying Bare Metal on Raspberry Pi 2

Sat Feb 11, 2017 11:34 am

Are you using the correct instruction to access the register, and is CP15SDISABLE held low, as per the manual?

http://infocenter.arm.com/help/index.js ... JAHDA.html

An unaligned access to the register, or an access that does not correctly deal with CP15SDISABLE, will immediately raise an Undefined Instruction exception as per the manual, and that is what it sounds like you are doing.

madotuki
Posts: 4
Joined: Sat Feb 11, 2017 3:04 am

Re: Trying Bare Metal on Raspberry Pi 2

Sat Feb 11, 2017 2:08 pm

Hi LdB,

Thank you for your reply.

I checked the CP15SDISABLE input, but I can't find a way to change its state to LOW in the ARM ARM or the Cortex-A7 MPCore PRM.

Could you point me to a reference for it?

Then I added the alignment bit to SCTLR, but no alignment fault was raised.
The MMU is working on the primary core, but it is not working on the secondary cores...

Code: Select all

CPUID: 00000000
ACTLR: 00006040
start to take asid
ASID: 00000001
TTBA0: 00030000
SCTLR: 00C5187F
Enabled MMU

CPUID: 00000001
ACTLR: 00006040
NSACR: 00060C00
start to take asid
ASID: 00000001
TTBA0: 00030000

CPUID: 00000002
ACTLR: 00006040
NSACR: 00060C00
start to take asid
ASID: 00000001
TTBA0: 00030000

CPUID: 00000003
ACTLR: 00006040
NSACR: 00060C00
start to take asid
ASID: 00000001
TTBA0: 00030000

madotuki
Posts: 4
Joined: Sat Feb 11, 2017 3:04 am

Re: Trying Bare Metal on Raspberry Pi 2

Mon Feb 13, 2017 12:58 am

I checked that the primary core enables the MMU and that the secondary cores can write to the SCTLR register.
But they hang with no abort.

I think that this isn't caused by CP15SDISABLE, because the secondary cores can write the SCTLR.I and SCTLR.D bits.
I don't know why, but all cores, including the primary core, stop working when the secondary cores turn on the MMU bit in SCTLR.

The MMU functions are here. I start the MMU by executing "init_mmu(); start_mmu(ttba, sctlr_flags, cpuid, pm)". (Please ignore pm.)

Code: Select all

#define invalidate_icache() \
        __asm__ volatile ("mcr p15,0,%0,c7,c5,0" :: "r"(0) : "memory")

#define ISB() __asm__ volatile ("isb" ::: "memory")
#define DSB() __asm__ volatile ("dsb" ::: "memory")
#define DMB() __asm__ volatile ("dmb" ::: "memory")
#define flush_btcache()     \
        __asm__ volatile ("mcr p15,0,%0,c7,c5,6" :: "r"(0) : "memory")

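void stop_mmu(void)
{
        // Read-modify-write SCTLR through a compiler-allocated temporary.
        // Hard-coding r2 could collide with the register holding the mask.
        uint32_t sctlr;
        __asm__ volatile("mrc p15,0,%0,c1,c0,0" : "=r"(sctlr));
        sctlr &= ~(uint32_t)(SCTLR_MMU_OPTION);
        __asm__ volatile("mcr p15,0,%0,c1,c0,0" :: "r"(sctlr) : "memory");
        DSB();
        ISB();
}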

void invalidate_caches(void)
{
        invalidate_data_cache_l10_only();
        invalidate_icache();
        flush_btcache();
        DSB();
        ISB();
}

void invalidate_all_tlbs(void)
{
        __asm__ volatile("mcr p15,0,%0,c8,c7,0" :: "r"(0) : "memory");
}

void init_mmu(void)
{
        //if (is_mmu_enabled())
        stop_mmu(); // Disable MMU, I-cache and D-cache

        invalidate_caches();
        invalidate_all_tlbs();

        // Turn on SMP bit in ACTLR
        uint32_t actlr;
        __asm__ volatile("mrc p15,0,%0,c1,c0,1" : "=r"(actlr));
        actlr |= (1 << 6);
        __asm__ volatile("mcr p15,0,%0,c1,c0,1" :: "r"(actlr) : "memory");

        // set domain (full access)
        __asm__ volatile("mcr p15,0,%0,c3,c0,0" :: "r"(0xFFFFFFFF) : "memory");

        // Always use TTBR0 (TTBCR = 0, so TTBCR.N = 0)
        __asm__ volatile("mcr p15,0,%0,c2,c0,2" :: "r"(0) : "memory");

        DSB();
        ISB();
}

void sync_ttba_change(uint32_t new_ttba, const uint32_t cpuid,
                const uint32_t pm)
{
        DSB();
        uart_puts("dsb finished\n");

        __asm__ volatile("cpsid i");

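        // Take and check ASID
        uart_puts("start to take asid\n");
        // Signed, so that a negative error value from take_kernel_asid()
        // is caught (with uint32_t the test below was always false).
        int32_t asid = (int32_t)take_kernel_asid(cpuid, pm);
        register_value_puts("ASID: ", (uint32_t)asid);
        if (asid < 0) {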
                uart_puts("No ASID is available\n");
                __asm__ volatile("cpsie i");
                return;
        }

        // Set ASID RESERVED_ASID_NUM to change ttba and asid
        set_asid(RESERVED_ASID_NUM);

        // Set new TTBA to TTBA0/1
        __asm__ volatile("mcr p15,0,%0,c2,c0,0\n"
                         :: "r"(0x4a|new_ttba)
                        );
        __asm__ volatile("mcr p15,0,%0,c2,c0,1\n"
                         :: "r"(0x4a|new_ttba)
                        );
        ISB();

        // Set ASID
        set_asid(asid);

        // Enable interrupts
        __asm__ volatile("cpsie i");
}

void start_mmu(uint32_t tlb_base_address, uint32_t sctlr_flags,
                const uint32_t cpuid, const uint32_t pm)
{
        // Set ttbr and enable mmu;
        sync_ttba_change(tlb_base_address, cpuid, pm);
        DSB();

        // Enable MMU
        uint32_t sctlr = _get_sctlr();
        sctlr |= sctlr_flags; // on MMU
        __asm__ volatile("mcr p15,0,%0,c1,c0,0"
                          :: "r"(sctlr) : "memory"
                        );
        DSB();
        ISB();
        uart_puts("Enable MMU\n");
}

timanu90
Posts: 65
Joined: Sat Dec 24, 2016 11:54 am

Re: Trying Bare Metal on Raspberry Pi 2

Mon Feb 13, 2017 9:38 am

madotuki wrote: Secondary cores change the mode and set the sp correctly. They also setup and enable MMU but they hangs.
Secondary cores need to set up TTBR0 too. From your text I think you missed that.

cheers
Tiago

madotuki
Posts: 4
Joined: Sat Feb 11, 2017 3:04 am

Re: Trying Bare Metal on Raspberry Pi 2

Mon Feb 13, 2017 12:01 pm

Hi timanu90,

Code: Select all

CPUID: 00000000
ACTLR: 00006040
start to take asid
ASID: 00000001
TTBA0: 00030000
SCTLR: 00C5187F
Enabled MMU

CPUID: 00000001
ACTLR: 00006040
NSACR: 00060C00
start to take asid
ASID: 00000001
TTBA0: 00030000

CPUID: 00000002
ACTLR: 00006040
NSACR: 00060C00
start to take asid
ASID: 00000001
TTBA0: 00030000

CPUID: 00000003
ACTLR: 00006040
NSACR: 00060C00
start to take asid
ASID: 00000001
TTBA0: 00030000
I posted this log.
It shows that the primary core and the secondary cores all use the same translation table base address, 0x30000.
