LizardLad_1
Posts: 133
Joined: Sat Jan 13, 2018 12:29 am

Raspberry Pi 3 Undefined instruction

Tue Jul 02, 2019 1:16 am

Hello all, I'm back! I've had an issue for a month or so now that I haven't been able to pinpoint, so I was wondering if any of you would be willing to help. While running my kernel in QEMU I've come across a strange error.

Taking exception 1 [Undefined Instruction]
...from EL2 to EL2
...with ESR 0x0/0x2000000
...with ELR 0x200
...to EL2 PC 0x200 PSTATE 0x3c9

It repeats again and again. I don't know how to make QEMU pause after the first exception, so I don't know the origin. I haven't been able to track down the code change that caused this either. Here is a link to my GitHub repo in the hope that someone can find it: https://github.com/OllieLollie1/Raspi3-Kernel

I think I got the EDID read working correctly using the mailbox, but I'm not sure, so could someone also take a quick look at that? It uses LdB's mailbox functions, so he might be able to help with that.

Thanks to anyone that can help! :)

lpoulain
Posts: 24
Joined: Mon May 20, 2019 12:35 am

Re: Raspberry Pi 3 Undefined instruction

Tue Jul 02, 2019 1:34 am

The ELR is the address where it crashes, and 0x200 seems too low for your code. That would explain the "undefined instruction" if the CPU's program counter is somehow being redirected to an invalid address.

One possible root cause would be stack corruption caused by a wild write (these are a pain in the neck to troubleshoot).

One thing you can try is printing the stack when you encounter the error. Reading the stack should at least give you some idea of where in your code the crash happens.
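
As a rough aid for reading the log in the first post, here is a minimal C sketch that splits a syndrome value into its architectural fields (EC in bits 31:26, IL in bit 25, ISS in bits 24:0). The names esr_fields, esr_decode and esr_ec_name are made up for illustration, and only a handful of exception classes are named.

```c
#include <stdint.h>

/* Fields of an AArch64 ESR_ELx (exception syndrome) value. */
struct esr_fields {
    unsigned ec;   /* bits [31:26]: exception class */
    unsigned il;   /* bit 25: 1 = 32-bit instruction length */
    uint32_t iss;  /* bits [24:0]: instruction specific syndrome */
};

/* Split a raw syndrome value into its three fields. */
static struct esr_fields esr_decode(uint32_t esr)
{
    struct esr_fields f;
    f.ec  = (esr >> 26) & 0x3Fu;
    f.il  = (esr >> 25) & 0x1u;
    f.iss = esr & 0x1FFFFFFu;
    return f;
}

/* Name a few common exception classes; everything else is "other". */
static const char *esr_ec_name(unsigned ec)
{
    switch (ec) {
    case 0x00: return "unknown reason (covers undefined instructions)";
    case 0x15: return "SVC from AArch64";
    case 0x20: return "instruction abort, lower EL";
    case 0x21: return "instruction abort, same EL";
    case 0x24: return "data abort, lower EL";
    case 0x25: return "data abort, same EL";
    default:   return "other";
    }
}
```

Calling esr_decode(0x2000000) gives EC 0x00, IL 1 and ISS 0, which matches the "0x0" QEMU prints before the slash. In other words, the CPU fetched something at 0x200 that it could not decode, rather than taking a data or instruction abort.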

DavidS
Posts: 4334
Joined: Thu Dec 15, 2011 6:39 am
Location: USA

Re: Raspberry Pi 3 Undefined instruction

Tue Jul 02, 2019 3:09 am

Since you are using C I cannot be of much help, but:

You may wish to set up a handler for undefined instruction exceptions (it runs in UND mode) that does not use the stack. This is set up by having the second instruction in your vector table load the address of your undefined instruction handler (that is, the instruction at the vector table base address plus 4; if you have left the vector table at its default address of 0, that is address 4).

To keep from using the stack you are going to have to write that handler in assembly. Dumping a few frames of the stack to an area of memory you can examine, then halting the CPU (WFE instruction), is a good place to start. Given that you are using C, it is very likely that something clobbering the stack is the problem.
RPi = The best ARM based RISC OS computer around
More than 95% of posts made from RISC OS on RPi 1B/1B+ computers. Most of the rest from RISC OS on RPi 2B/3B/3B+ computers
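
To make the "dump a few frames" idea concrete, here is a small C sketch of a frame-record walk. It assumes the kernel is built with frame pointers, so that x29 points at a pair {caller's x29, saved x30}. To keep it testable off-target it reads from a byte image of the stack instead of live memory. The names read_le64 and walk_frames are hypothetical and are not from the repo.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 64-bit value from a byte buffer. */
static uint64_t read_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/*
 * Follow the chain of frame records starting at fp.
 * mem is a copy of the stack region that begins at address base and is
 * size bytes long. Each saved link register is stored in lr_out.
 * Returns the number of frames recorded (at most max_frames).
 */
static size_t walk_frames(const uint8_t *mem, uint64_t base, uint64_t size,
                          uint64_t fp, uint64_t *lr_out, size_t max_frames)
{
    size_t n = 0;

    while (n < max_frames) {
        /* Stop on anything that cannot be a frame record in this region. */
        if (fp < base || (fp & 0x7u) != 0)
            break;
        uint64_t off = fp - base;
        if (off > size || size - off < 16)
            break;

        uint64_t prev = read_le64(mem + off);      /* caller's x29 */
        lr_out[n++]   = read_le64(mem + off + 8);  /* saved x30 */

        /* The stack grows down, so callers must sit at higher addresses. */
        if (prev <= fp)
            break;
        fp = prev;
    }
    return n;
}
```

The range and ordering checks matter here because the stack is the thing suspected of being corrupted. Without them a clobbered frame pointer could send the walk in circles or off into unrelated memory.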

LizardLad_1
Posts: 133
Joined: Sat Jan 13, 2018 12:29 am

Re: Raspberry Pi 3 Undefined instruction

Wed Jul 03, 2019 12:32 am

Thanks for the info. I put in a branch to what will become a handler but it doesn't seem to work.
Here is the handler:

Code:

.global _UND_handler
_UND_handler:
        //Dump the stack from here somehow
        wfe
        b _UND_handler
Here is the vector table:

Code:

// important, code has to be properly aligned
.balign 0x800
_vectors:
        //As DavidS said use address _vectors+4
        b _hang
        b _UND_handler

        /* from current EL with sp_el0 */
        // synchronous
        .balign 0x80
        b _hang

        // IRQ
        .balign 0x80
        b _hang

        // FIQ
        .balign 0x80
        b _hang

        // SError
        .balign 0x80
        b _hang

        /* from current EL with sp_elx, x != 0 */
        // synchronous
        .balign 0x80
        stp     x29, x30, [sp, #-16]!    // Save x30 link register and x29 just so we dont waste space
        bl      register_save            // Save corruptible registers .. it assumes x29,x30 saved
        //bl      dbg_saveregs
        mov     x0, #0
        bl      dbg_decodeexc
        bl      dbg_main
        bl      register_restore        // restore corruptible registers .. does all but x29,x30
        ldp     x29, x30, [sp], #16             // restore x29,x30 pulling stack back up 16
        eret

        // IRQ
        .balign  0x80
        stp     x29, x30, [sp, #-16]!    // Save x30 link register and x29
        bl      register_save
        //bl      dbg_saveregs
        //mov     x0, #1
        //bl      dbg_decodeexc
        bl      c_irq_handler           //dbg_main is now only called if the IRQ didn't come from something that was set to deliver interrupts
        bl      register_restore        // restore corruptible registers .. does all but x29,x30
        ldp     x29, x30, [sp], #16             // restore x29,x30 pulling stack back up 16
        eret

        // FIQ
        .balign  0x80
        stp     x29, x30, [sp, #-16]!    // Save x30 link register and x29 just so we dont waste space
        bl      register_save            // Save corruptible registers .. it assumes x29,x30 saved
        //bl      dbg_saveregs
        mov     x0, #2
        bl      dbg_decodeexc
        bl      dbg_main
        bl      register_restore        // restore corruptible registers .. does all but x29,x30
        ldp     x29, x30, [sp], #16             // restore x29,x30 pulling stack back up 16
        eret

        // SError
        .balign  0x80
        stp     x29, x30, [sp, #-16]!    // Save x30 link register and x29 just so we dont waste space
        bl      register_save            // Save corruptible registers .. it assumes x29,x30 saved
        //bl      dbg_saveregs
        mov     x0, #3
        bl      dbg_decodeexc
        bl      dbg_main
        bl      register_restore        // restore corruptible registers .. does all but x29,x30
        ldp     x29, x30, [sp], #16             // restore x29,x30 pulling stack back up 16
        eret

        /* from lower EL, target minus 1 is AARCH64 */
        // synchronous
        .balign 0x80
        b _hang

        // IRQ
        .balign 0x80
        b _hang

        // FIQ
        .balign 0x80
        b _hang

        // SError
        .balign 0x80
        b _hang

        /* from lower EL, target minus 1 is AARCH32 */
        // synchronous
        .balign 0x80
        b _hang

        // IRQ
        .balign 0x80
        b _hang

        // FIQ
        .balign 0x80
        b _hang

        // SError
        .balign 0x80
        b _hang
And finally here is how I load the vector table:

Code:

//"================================================================"
//  Set up exception handlers
//"================================================================"
        ldr     x0, =_vectors
        msr     vbar_el1, x0


Surely that would have halted the CPU so that it could be examined, right?

lpoulain
Posts: 24
Joined: Mon May 20, 2019 12:35 am

Re: Raspberry Pi 3 Undefined instruction

Wed Jul 03, 2019 12:36 am

You might want to test your handler by deliberately causing an exception (for example, by accessing a non-mapped address).

Otherwise you can find examples of handlers at https://github.com/bztsrc/raspi3-tutori ... exceptions or https://github.com/LdB-ECM/Raspberry-Pi ... 3Interrupt

LizardLad_1
Posts: 133
Joined: Sat Jan 13, 2018 12:29 am

Re: Raspberry Pi 3 Undefined instruction

Wed Jul 03, 2019 5:30 am

I'm sorry I wasn't clear in my last update. I forgot to mention that the same error still occurs, even though I thought this would surely have worked. So clearly something in the table is broken; if anyone can see what it is, please point it out.

LdB
Posts: 1169
Joined: Wed Dec 07, 2016 2:29 pm

Re: Raspberry Pi 3 Undefined instruction

Wed Jul 03, 2019 7:25 am

I think 0x200 is simply the vector table offset, which gets added to the VBAR ... let me have a look.
You could try moving where you set the VBAR down to after you have switched to EL1, rather than up in EL2, just in case QEMU has a bug. That is just a random thought; I will take a look tonight.

VBAR_ELn
+ 0x000 Synchronous Current EL with SP0
+ 0x080 IRQ/vIRQ
+ 0x100 FIQ/vFIQ
+ 0x180 SError/vSError

+ 0x200 Synchronous Current EL with SPx
+ 0x280 IRQ/vIRQ
+ 0x300 FIQ/vFIQ
+ 0x380 SError/vSError

+ 0x400 Synchronous lower EL using AArch64
+ 0x480 IRQ/vIRQ
+ 0x500 FIQ/vFIQ
+ 0x580 SError/vSError

+ 0x600 Synchronous Lower EL using AArch32
+ 0x680 IRQ/vIRQ
+ 0x700 FIQ/vFIQ
+ 0x780 SError/vSError
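
Reading the original log against this table: PSTATE 0x3c9 has M[3:0] = 0b1001, which is EL2 using SP_EL2, so a synchronous exception taken from EL2 to EL2 uses the "Current EL with SPx" entry at offset 0x200. The sketch below just does that lookup in C. The names exc_kind, vector_offset and vector_address are invented for illustration, and the 0x600 AArch32 group is not modelled.

```c
#include <stdint.h>

/* Exception kinds, in the order they appear within each group of four. */
enum exc_kind { EXC_SYNC = 0, EXC_IRQ = 1, EXC_FIQ = 2, EXC_SERROR = 3 };

/* Exception level held in PSTATE.M[3:2] (AArch64 state only). */
static unsigned pstate_el(uint32_t pstate)
{
    return (pstate >> 2) & 0x3u;
}

/* PSTATE.M[0]: 0 = running on SP_EL0, 1 = running on SP_ELx. */
static unsigned pstate_uses_spx(uint32_t pstate)
{
    return pstate & 0x1u;
}

/*
 * Offset into the vector table for an exception taken from the state
 * described by pstate to target_el.
 * Assumption: a lower EL is always AArch64 (group 0x400).
 */
static uint32_t vector_offset(uint32_t pstate, unsigned target_el,
                              enum exc_kind kind)
{
    uint32_t group;

    if (pstate_el(pstate) == target_el)
        group = pstate_uses_spx(pstate) ? 0x200u : 0x000u;
    else
        group = 0x400u;

    return group + 0x80u * (uint32_t)kind;
}

/* Address the CPU branches to: VBAR of the *target* EL plus the offset. */
static uint64_t vector_address(uint64_t vbar, uint32_t pstate,
                               unsigned target_el, enum exc_kind kind)
{
    return vbar + vector_offset(pstate, target_el, kind);
}
```

vector_address(0, 0x3c9, 2, EXC_SYNC) comes out as 0x200, the same as the "to EL2 PC 0x200" line in the log. That is consistent with VBAR_EL2 holding 0 at that point, and the start-up code posted above only writes vbar_el1. If so, the repeating exception would be the CPU executing whatever happens to be at address 0x200 instead of _vectors.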

DavidS
Posts: 4334
Joined: Thu Dec 15, 2011 6:39 am
Location: USA

Re: Raspberry Pi 3 Undefined instruction

Wed Jul 03, 2019 11:08 am

Also I note you are using AArch64, something you had not previously indicated. I believe the setup for the vectors is a bit more involved in AArch64.

Though I would use a relative branch to code that loads the address of the handler into a register and then does an absolute branch to that address. Then copy your vector table down to address 0x00000000.

Also, how do you set up your stacks?

LizardLad_1
Posts: 133
Joined: Sat Jan 13, 2018 12:29 am

Re: Raspberry Pi 3 Undefined instruction

Thu Jul 04, 2019 2:58 am

Thanks to both of you for helping.

DavidS, my stacks are set up almost the same as LdB's, as follows:

Code:

//"================================================================"
//  Set up stack pointers
//"================================================================"
	mrs 	x1, mpidr_el1	// Read core id on AARCH64
	and 	x1, x1, #0x3	// Make core 2 bit bitmask in x1
	cmp	x1, #0
	beq	Core0StackPointers
	cmp	x1, #1
	beq	Core1StackPointers
	cmp	x1, #2
	beq	Core2StackPointers
	cmp	x1, #3
	beq	Core3StackPointers
	b	_hang

//Core 1 stack pointers
Core1StackPointers:
	ldr	x1, =__EL0_stack_core1
	ldr     x2, =__EL1_stack_core1
	ldr	x3, =__EL2_stack_core1
	msr	sp_el0, x1
	msr     sp_el1, x2
	mov	sp, x3
	b 	EL2_ret

//Core 2 stack pointers
Core2StackPointers:
	ldr	x1, =__EL0_stack_core2
	ldr	x2, =__EL1_stack_core2
	ldr	x3, =__EL2_stack_core2
	msr	sp_el0, x1
	msr     sp_el1, x2
	mov	sp, x3
	b 	EL2_ret

//Core 3 stack pointers
Core3StackPointers:
	ldr	x1, =__EL0_stack_core3
	ldr	x2, =__EL1_stack_core3
	ldr	x3, =__EL2_stack_core3
	msr	sp_el0, x1
	msr     sp_el1, x2
	mov	sp, x3
	b	EL2_ret

//Core 0 stack pointers
Core0StackPointers:
	ldr	x1, =__EL0_stack_core0
	ldr     x2, =__EL1_stack_core0
	ldr	x3, =__EL2_stack_core0
	msr	sp_el0, x1
	msr     sp_el1, x2
	mov	sp, x3
	b	EL2_ret
EDIT 1:
Forgot to include the linker file, sorry!!

Code:

SECTIONS
{
	. = 0x80000;
	_start = .;
	
	.text : 
	{
		KEEP(*(.text.boot)) *(.text .text.* .gnu.linkonce.t*) 
	}
	
	.rodata : 
	{ 
		*(.rodata .rodata.* .gnu.linkonce.r*) 
	}
	
	PROVIDE(_data = .);
	.data :
	{ 
		*(.data .data.* .gnu.linkonce.d*) 
	}
	
	.bss (NOLOAD) : 
	{
		. = ALIGN(16);
		__bss_start = .;
		*(.bss .bss.*)
		*(COMMON)
		__bss_end = .;
	}

	.stack_core0 :
	{
		. = ALIGN(16);  /* AAPCS64 requires the stack pointer to be 16-byte aligned */
		__stack_start_core0__ = .;
		. = . + 512;    /* EL0 stack size */
		__EL0_stack_core0 = .;
		. = . + 32768;  /* EL1 stack size */
		__EL1_stack_core0 = .;
		. = . + 4096;  /* EL2 stack size (start-up) */
		__EL2_stack_core0 = .;
		. = ALIGN(16);
		__stack_end_core0__ = .;
	}

	.stack_core1 :
	{
		. = ALIGN(16);  /* AAPCS64 requires the stack pointer to be 16-byte aligned */
		__stack_start_core1__ = .;
		. = . + 512;    /* EL0 stack size */
		__EL0_stack_core1 = .;
		. = . + 32768;    /* EL1 stack size */
		__EL1_stack_core1 = .;
		. = . + 4096;  /* EL2 stack size (start-up) */
		__EL2_stack_core1 = .;
		. = ALIGN(16);
		__stack_end_core1__ = .;
	}

	.stack_core2 :
	{
		. = ALIGN(16);  /* AAPCS64 requires the stack pointer to be 16-byte aligned */
		__stack_start_core2__ = .;
		. = . + 512;    /* EL0 stack size */
		__EL0_stack_core2 = .;
		. = . + 32768;    /* EL1 stack size */
		__EL1_stack_core2 = .;
		. = . + 4096;  /* EL2 stack size (start-up) */
		__EL2_stack_core2 = .;
		. = ALIGN(16);
		__stack_end_core2__ = .;
	}

	.stack_core3 :
	{
		. = ALIGN(16);  /* AAPCS64 requires the stack pointer to be 16-byte aligned */
		__stack_start_core3__ = .;
		. = . + 512;    /* EL0 stack size */
		__EL0_stack_core3 = .;
		. = . + 32768;    /* EL1 stack size */
		__EL1_stack_core3 = .;
		. = . + 4096;  /* EL2 stack size (start-up) */
		__EL2_stack_core3 = .;
		. = ALIGN(16);
		__stack_end_core3__ = .;
	}

	_end = .;

	/DISCARD/ : { *(.comment) *(.gnu*) *(.note*) *(.eh_frame*) }
}
__bss_size = (__bss_end - __bss_start)>>3;
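
For anyone checking the stack layout by hand, here is a short C sketch that repeats the location-counter arithmetic of one .stack_coreN section. The sizes are copied from the script above. The start address used in the note below is purely hypothetical, and the helper names (align_up, layout_core_stacks) are made up. It mirrors the arithmetic only, not everything the linker does.

```c
#include <stddef.h>

/* Sizes copied from the linker script. */
enum {
    EL0_STACK_SIZE = 512,
    EL1_STACK_SIZE = 32768,
    EL2_STACK_SIZE = 4096,
    STACK_ALIGN    = 16
};

/* Initial stack pointer values for one core, plus the end of its block. */
struct core_stacks {
    size_t el0_top;
    size_t el1_top;
    size_t el2_top;
    size_t end;
};

/* Round value up to the next multiple of align (a power of two). */
static size_t align_up(size_t value, size_t align)
{
    return (value + align - 1) & ~(align - 1);
}

/* Replay ". = ALIGN(16); . = . + size; symbol = .;" for one core. */
static struct core_stacks layout_core_stacks(size_t start)
{
    struct core_stacks s;
    size_t dot = align_up(start, STACK_ALIGN);

    dot += EL0_STACK_SIZE;
    s.el0_top = dot;
    dot += EL1_STACK_SIZE;
    s.el1_top = dot;
    dot += EL2_STACK_SIZE;
    s.el2_top = dot;
    s.end = align_up(dot, STACK_ALIGN);
    return s;
}
```

With a made-up start of 0x90000 this gives tops of 0x90200, 0x98200 and 0x99200, all 16-byte aligned, and 37376 bytes per core. Stacks grow downward, so the EL2 stack sits directly above the EL1 stack, which in turn sits above the 512-byte EL0 stack. An overflow in one would run into the region below it with nothing in between to catch it.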

LdB
Posts: 1169
Joined: Wed Dec 07, 2016 2:29 pm

Re: Raspberry Pi 3 Undefined instruction

Thu Jul 04, 2019 3:50 am

Code doesn't seem to do anything wrong on real Pi3, I get the message "made it this far" etc

What exactly goes wrong?

DavidS
Posts: 4334
Joined: Thu Dec 15, 2011 6:39 am
Location: USA

Re: Raspberry Pi 3 Undefined instruction

Thu Jul 04, 2019 3:58 am

Have you tried it on real HW? Emulators do not always behave correctly; that is just part of life in bare-metal programming.

bzt
Posts: 374
Joined: Sat Oct 14, 2017 9:57 pm

Re: Raspberry Pi 3 Undefined instruction

Thu Jul 04, 2019 9:26 am

Hi,

Welcome back!
LizardLad_1 wrote:
Tue Jul 02, 2019 1:16 am
It repeats again and again. I don't know how to make QEMU pause after the first exception so I don't know the origin.
Yeah, very annoying, right? That's why I've created this qemu patch. Try it. The first step to debug your issue is to locate the faulting code that triggers the very first exception.

Cheers,
bzt
