LdB
Posts: 903
Joined: Wed Dec 07, 2016 2:29 pm

Re: bzt's debugger

Mon Aug 20, 2018 2:06 pm

Okay, I am working on your code now. There are a number of bugs in it, especially in the stepper unit (wrong names etc.).
It is just taking me a minute to sort them out.

Update: Ok I see you have fixed them all.

Okay 4 core with MMU online done
https://github.com/LdB-ECM/Raspi3-Kernel
main.c, core1,core2,core3, multicore changed to use the new MMU unit.

The semaphore works: the quick test I did flashes up in between all the other text. (My precompiled img is there.)

You can now semaphore the printf function :-)
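
A minimal sketch of what guarding a print call with a lock could look like, assuming a C11 toolchain with <stdatomic.h>. The names print_lock, print_lock_try, print_lock_release and locked_puts are hypothetical and are not the semaphore functions from the repo.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical lock shared by all cores; not the semaphore unit from the repo. */
static atomic_flag print_lock = ATOMIC_FLAG_INIT;

/* Try once to take the lock; returns true if this caller now owns it. */
static bool print_lock_try(void)
{
    return !atomic_flag_test_and_set_explicit(&print_lock, memory_order_acquire);
}

/* Release the lock so another core can print. */
static void print_lock_release(void)
{
    atomic_flag_clear_explicit(&print_lock, memory_order_release);
}

/* Print one line while holding the lock so output from different
   cores does not interleave. Returns whatever puts() returns. */
static int locked_puts(const char *s)
{
    while (!print_lock_try()) {
        /* spin until the current owner releases the lock */
    }
    int rc = puts(s);
    print_lock_release();
    return rc;
}
```

Each core would call locked_puts (or a printf wrapper built the same way) instead of printing directly, so only one core is inside the print routine at a time.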

I will also give you a warning: you won't be able to hit the Pi mailbox once the MMU is on without either (1) allocating a virtual page or (2) allocating VC memory via tag 0x0003000c prior to turning the MMU on. If you are just doing small stuff before you turn on the MMU, allocate a 4K block from the VC (the smallest possible), then build your mailbox structures in that memory area (remember it is a VC address, i.e. it has 0xC0000000 added). You won't have cache coherency issues with that memory and you will be able to exchange data with the VC. If you try to pass a pointer to normal RAM like you would normally do... well, let's just say good luck with the VC understanding anything in your message structure memory :-)
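
The "0xC0000000 added" remark can be sketched as a pair of helpers. This is only an illustration: VC_BUS_ALIAS, arm_to_vc_bus and vc_bus_to_arm are hypothetical names, and the constant is simply the value quoted in the post.

```c
#include <stdint.h>

/* Assumption: the VC bus alias mentioned in the post, 0xC0000000. */
#define VC_BUS_ALIAS 0xC0000000u

/* ARM physical address -> address to hand to the VC. */
static inline uint32_t arm_to_vc_bus(uint32_t arm_phys)
{
    return arm_phys | VC_BUS_ALIAS;
}

/* Address returned by the VC -> ARM physical address. */
static inline uint32_t vc_bus_to_arm(uint32_t bus)
{
    return bus & ~VC_BUS_ALIAS;
}
```

An address coming back from the VC (for example from a memory allocation) would go through vc_bus_to_arm before the ARM side dereferences it, and anything the ARM side passes to the VC would go through arm_to_vc_bus.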

Now did you want me to give you the small piece of code to get the VC pipeline to clear the screen in a few microseconds?
I might be able to have it scroll the screen as well, as it should be just a simple render shift onto the GL pipe ... I haven't tried, but I am assuming that I am allowed to use the FB as both source and destination with a second virtual page. It supports virtual screens, so I can't imagine it won't; I would imagine the restriction is no overlap of source and destination.

LizardLad_1
Posts: 126
Joined: Sat Jan 13, 2018 12:29 am

Tue Aug 21, 2018 12:33 am

Thank you LdB for this. Yes please, if you have the time, would you add the code for a blank OpenGL scene to blank the screen? Also, why can't the mailbox be accessed while the MMU is on?

LdB

Tue Aug 21, 2018 4:34 am

You need to read up on virtual memory and the MMU. I am going to try and dumb the answer down, so I am going to take some liberties which you will work out later on.
A good place to start reading is the links below:
https://static.docs.arm.com/100940/0100 ... 100_en.pdf
https://en.wikipedia.org/wiki/Cache_coherence

The memory the ARM core is seeing is now cached and virtualized. Currently you create a message struct with the CPU that goes something like

Code: Select all

uint32_t msg[8] = { blah blah };
Then you pass a pointer to that memory structure to the VC in your mail_write function.

That won't pass muster anymore: that data is only immediately valid to the core executing, in its cache (there are simplifications being taken here). You now punch a message to the VC pointing to a physical address. When the VC accesses the physical memory at that address, the value it reads may be different because the cache hasn't pushed the data from the core cache to the actual physical memory the VC sees.
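
The effect can be shown with a deliberately tiny host-side model. It is not how the hardware is implemented; toy_line_t, cpu_write, vc_read and cache_clean are hypothetical names used only to illustrate a write-back cache sitting between the core and RAM.

```c
#include <stdint.h>
#include <stdbool.h>

/* Toy model: one word of physical RAM plus one core's cached copy of it. */
typedef struct {
    uint32_t ram;     /* what the VC (or any other bus master) reads */
    uint32_t cached;  /* what the ARM core reads and writes */
    bool     dirty;   /* cached copy differs from RAM */
} toy_line_t;

/* CPU store: lands in the cache only (write-back behaviour). */
static void cpu_write(toy_line_t *l, uint32_t v)
{
    l->cached = v;
    l->dirty = true;
}

/* Device read: goes straight to physical RAM and never sees the cache. */
static uint32_t vc_read(const toy_line_t *l)
{
    return l->ram;
}

/* Clean: push the dirty cached value out to RAM (the job of a cache clean). */
static void cache_clean(toy_line_t *l)
{
    if (l->dirty) {
        l->ram = l->cached;
        l->dirty = false;
    }
}
```

In this model, a cpu_write followed directly by vc_read still returns the old RAM value; only after cache_clean does vc_read return what the core wrote.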

So you now need to do one of the following:
1.) Learn the cache instructions, which must be executed before you tell the VC to look.
2.) Use an area where the cache is turned off. In the MMU setup the cache is off over the VC memory.
3.) Use the "virtualmap" function I provided to create a virtual block without cache and set the struct up in that block.

Point 3 requires connection up to your memory allocator scheme and point 1 requires understanding of MMU. You will probably take one of those paths later on.

So for where you are at in your learning, option 2 is the easiest. An extension of 2 is that you could also turn the cache off for the last 4K block of physical RAM just under the VC memory in the MMU table setup, and use that area when exchanging data with the VC.

So if you look at the current code .. I have altered to drop out one block of memory from cache

Code: Select all

	/* RAM from 0 to VC base addr */
	for (base = 0; base < msg[5] - 1; base++)  // ** NOTE the minus 1 now
	{
		// Each block descriptor (2 MB)
		Stage2map1to1[base] = (VMSAv8_64_STAGE2_BLOCK_DESCRIPTOR)
		{
			.Address = (uintptr_t)base << (21-12),
			.AF = 1,
			.SH = STAGE2_SH_INNER_SHAREABLE,
			.MemAttr = MT_NORMAL,
			.EntryType = 1,
		};
	}
So (GPU mem start - 4K) to (GPU mem start) would then be uncached and available to use for exchange with the VC, like you do in your current code. So you would set up your structs specifically in that area. The CPU core would take its sorry time to write/read the values because that memory isn't cached, and when you call the VC the data it sees would be correct.
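
To see which addresses the "minus 1" leaves out of the cached mapping, here is a small sketch of the arithmetic. It assumes the loop bound counts 2 MB blocks below the VC base, which the loop suggests but the post does not state outright. The helper names and the example base 0x3B400000 are illustrative only.

```c
#include <stdint.h>

#define BLOCK_SHIFT 21u                           /* 2 MB per block descriptor */
#define BLOCK_SIZE  (UINT32_C(1) << BLOCK_SHIFT)

/* Number of whole 2 MB blocks below the VC base address. */
static uint32_t blocks_below_vc(uint32_t vc_base)
{
    return vc_base >> BLOCK_SHIFT;
}

/* Start address of the last block below the VC base,
   i.e. the block the loop no longer maps as cached. */
static uint32_t exchange_block_start(uint32_t vc_base)
{
    return (blocks_below_vc(vc_base) - 1u) << BLOCK_SHIFT;
}
```

With an example VC base of 0x3B400000 there are 474 blocks below it, and the skipped block starts at 0x3B200000, so the 4K just under the VC base sits at the top of that block.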

Later on down the track you will adjust your graphics routines to be cache safe and you will turn the cache on over the framebuffer/GPU memory to speed up those routines. For now I have to leave it off or it will kill all your graphics.

Does any of that make sense to you?

LizardLad_1

Tue Aug 21, 2018 10:42 am

Thanks, it works almost perfectly, and yes, with your explanation I understand. I know the console has bugs, but the other issue is that the kernel hangs at some point. Not on QEMU. I believe this means there may be an unaligned access somewhere that needs to be fixed; I'm investigating now.

UPDATE:
OK, so I know it occurs while trying to start the other 3 cores, for an unknown reason.

UPDATE 2:
I should also just tell everyone I need this all operational by 3pm Friday GMT+10. So far I have narrowed it down slightly. Since it is multicore now I'm not sure how to debug this completely. I have attached an image of the output of my program including the exception message.
Attachment: Error.jpg (225.86 KiB)
UPDATE 3:
I have pinpointed the source of the exception: it occurs at 0x919B4 in kernel.elf, and that is an ldr in the mailbox_call function (lines 44-50 of mbox.c).

LdB

Tue Aug 21, 2018 2:09 pm

I am doubting that is correct ... but simply dump the assembler code and look at it for that function

aarch64-elf-objdump -d kernel.elf > somefilename.txt

What worries me is that it's a mailbox message .... is the MMU on (AKA my warnings) .... if so I need to look at your code.
You will also need memory barriers in place on the peripheral access if you have the MMU on .. you may like to try adding the dmb instructions below. You might also want to consider writing the mailbox routines in assembler to take the compiler out of play.

Code: Select all

bool mailbox_tag_write(uint32_t message)
{
	uint32_t value;	// Temporary read value
	message &= ~(0xF); // Make sure 4 low channel bits are clear
	message |= 0x8; // OR the channel bits to the value
	do
	{
		asm("dmb sy");
		value = MAILBOX_FOR_READ_WRITES->status_1; // Read mailbox1 status from GPU
	}
	while ((value & MAIL_FULL) != 0); // Make sure arm mailbox is not full
	asm("dmb sy");
	MAILBOX_FOR_READ_WRITES->write_1 = message; // Write value to mailbox
	return true; // Write success
}

uint32_t mailbox_tag_read ()
{
	uint32_t value;	// Temporary read value
	do
	{
		do
		{
			asm("dmb sy");
			value = MAILBOX_FOR_READ_WRITES->status_0; // Read mailbox0 status
		}
		while ((value & MAIL_EMPTY) != 0); // Wait for data in mailbox
		asm("dmb sy");
		value = MAILBOX_FOR_READ_WRITES->read_0; // Read the mailbox
	}
	while ((value & 0xF) != 0x8); // We have response back
	value &= ~(0xF); // Lower 4 low channel bits are not part of message
	asm("dmb sy");
	return value; // Return the value
}
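
The channel handling in the two routines above boils down to a little bit packing, which can be pulled out and checked on a host machine. This is a sketch only; mbox_pack, mbox_channel, mbox_payload and mbox_addr_ok are hypothetical helper names, and 0x8 is the channel value used in the code above.

```c
#include <stdint.h>
#include <stdbool.h>

#define MBOX_CHANNEL_MASK 0xFu
#define MBOX_CHANNEL_TAGS 0x8u   /* channel value used in the code above */

/* Combine a buffer address with a channel number in the low 4 bits. */
static uint32_t mbox_pack(uint32_t buffer_addr, uint32_t channel)
{
    return (buffer_addr & ~MBOX_CHANNEL_MASK) | (channel & MBOX_CHANNEL_MASK);
}

/* Channel part of a mailbox word. */
static uint32_t mbox_channel(uint32_t word)
{
    return word & MBOX_CHANNEL_MASK;
}

/* Address part of a mailbox word. */
static uint32_t mbox_payload(uint32_t word)
{
    return word & ~MBOX_CHANNEL_MASK;
}

/* The low 4 bits get overwritten, so the buffer has to be 16-byte aligned. */
static bool mbox_addr_ok(uint32_t buffer_addr)
{
    return (buffer_addr & MBOX_CHANNEL_MASK) == 0;
}
```

A check like mbox_addr_ok before the write would catch a message buffer that is not 16-byte aligned, since its low address bits would otherwise be silently replaced by the channel number.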

LizardLad_1

Tue Aug 21, 2018 10:42 pm

Yeah, it is still occurring in the same place. I have added the "dmb sy" commands where there are reads and writes. I have updated all the code on GitHub if you have a chance to look at it.

Thanks in advance.

EDIT: Doesn't the dmb command flush the data cache? If so, the error isn't caused by that but by an error in the level 2 translation tables.

LdB

Wed Aug 22, 2018 1:27 am

I need you to recompile the library with the strict alignment flag ( -mstrict-align )

I am hitting a build error on your library when I use the flag

Code: Select all

build/fat.o: In function `fat_getcluster':
D:\Leon\Programming\Pi\Raspi3-Kernel2/src/fat.c:100: undefined reference to `memcmp'
So something in the library isn't aligned, and your makefile doesn't give me the option to recompile the library.

LizardLad_1

Wed Aug 22, 2018 1:27 am

Ok I will do that now

UPDATE:
Done. I have implemented memcmp instead of relying on the builtins. It now compiles correctly with that flag.

UPDATE 2:
The error occurs while trying to call the mailbox to turn the ACT LED on.

LdB

Wed Aug 22, 2018 3:01 am

Correct, if you comment the activity LED call out, it works fine.
I did warn you that accessing the mailbox needs care with the MMU on.
I will do an allocation fix for you and stop messing around .. you will need to work through what I do and why.

If you are pressed for time, use a GPIO port with an LED; the mailbox is strictly the only issue :-)

LizardLad_1

Wed Aug 22, 2018 3:07 am

If the dmb command flushes the data cache then wouldn't it be possible to have it use the mailbox call as is? I know I can answer my own question there and say no but why?

Oh and thank you so much for your help and support.

LdB

Wed Aug 22, 2018 6:39 am

It's not just the cache that is the problem: you have security and virtualization once you turn the MMU on.
You are passing an address in the MMU-controlled range for the VC to read and write in. Not only must it be cache coherent, it must have permission to be read and written, and the address translation must be valid. A failure in any of those parts and it will throw an exception.

Have you updated that library issue?

LizardLad_1

Wed Aug 22, 2018 6:49 am

Yes, the code now compiles correctly with the -mstrict-align flag. Instead of using builtins I just implemented memcmp myself to fix it.

LdB

Wed Aug 22, 2018 6:51 am

No worries, let's see if I can work out what the problem is.

Update: Nope, it seems to be the cache; I can get a flash now.
Modified led.c .. see where I invalidated the cache.
Eventually it will hang .. need to resolve the proper cache addr update not this brutal rubbish
My other concern is you seem to be using the same mailbox array over and over, you don't use it from two different cores at same time do you?

Code: Select all

extern void invalidate_dcache(void);   // Added to start.S

void set_ACT_LED (bool on)
{
	mailbox[0] = 8*4;
	mailbox[7] = 0;
	mailbox[1] = 0;
	mailbox[2] = 0x00038041;
	mailbox[3] = 8;
	mailbox[4] = 8;
	mailbox[5] = 130;
	mailbox[6] = (uint32_t)on;
	invalidate_dcache(); // ****** Make sure the VC can see data
	mailbox_tag_write((uint32_t)(uintptr_t)&mailbox[0]);
	mailbox_tag_read();
}
The code for invalidate_dcache goes into start.S

Code: Select all

.section .text.invalidate_dcache, "ax", %progbits
.balign	4
.globl invalidate_dcache;
.type invalidate_dcache, %function
invalidate_dcache:
	dmb     ISH
	mrs     x0, CLIDR_EL1          //; x0 = CLIDR
	ubfx    w2, w0, #24, #3        //; w2 = CLIDR.LoC
	cmp     w2, #0                 //; LoC is 0?
	b.eq    invalidateCaches_end   //; Nothing to clean, just return
	mov     w1, #0                 //; w1 = level iterator

invalidateCaches_flush_level:
	add     w3, w1, w1, lsl #1     //; w3 = w1 * 3 (right-shift for cache type)
	lsr     w3, w0, w3             //; w3 = w0 >> w3
	ubfx    w3, w3, #0, #3         //; w3 = cache type of this level
	cmp     w3, #2                 //; No cache at this level?
	b.lt    invalidateCaches_next_level

	lsl     w4, w1, #1
	msr     CSSELR_EL1, x4         //; Select current cache level in CSSELR
	isb                            //; ISB required to reflect new CSIDR
	mrs     x4, CCSIDR_EL1         //; w4 = CSIDR

	ubfx    w3, w4, #0, #3
	add    	w3, w3, #4             //; w3 = log2(line size in bytes); CCSIDR.LineSize holds log2(bytes) - 4
	ubfx    w5, w4, #13, #15       //; w5 = number of sets - 1
	ubfx    w4, w4, #3, #10        //; w4 = number of ways - 1
	clz     w6, w4                 //; w6 = 32 - log2(number of ways)

invalidateCaches_flush_set:
	mov     w8, w4                 //; w8 = Way number
invalidateCaches_flush_way:
	lsl     w7, w1, #1             //; Fill level field
	lsl     w9, w5, w3
	orr     w7, w7, w9             //; Fill index field
	lsl     w9, w8, w6
	orr     w7, w7, w9             //; Fill way field
	dc      CISW, x7               //; Clean and invalidate by set/way
	subs    w8, w8, #1             //; Decrement way
	b.ge    invalidateCaches_flush_way
	subs    w5, w5, #1             //; Decrement set
	b.ge    invalidateCaches_flush_set

invalidateCaches_next_level:
	add     w1, w1, #1             //; Next level
	cmp     w2, w1
	b.gt    invalidateCaches_flush_level

invalidateCaches_end:
	ret
.balign	4
.ltorg                             // Tell assembler ltorg data for this code can go here
.size	invalidate_dcache, .-invalidate_dcache
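
The value built in w7 inside the loop follows the set/way operand layout: cache level in bits [3:1], the set index shifted left by log2 of the line size in bytes, and the way index left-aligned at the top of the 32-bit word. A host-side sketch of that packing is below; dc_sw_operand is a hypothetical name and __builtin_clz assumes GCC or Clang.

```c
#include <stdint.h>

/* Build the 32-bit operand for a set/way cache maintenance operation.
   level     : 0 for L1, 1 for L2 (placed in bits [3:1])
   set, way  : indices to operate on
   log2_line : log2 of the line size in bytes (6 for 64-byte lines)
   ways      : associativity of the cache (for example 4) */
static uint32_t dc_sw_operand(uint32_t level, uint32_t set, uint32_t way,
                              uint32_t log2_line, uint32_t ways)
{
    uint32_t op = (level << 1) | (set << log2_line);
    if (ways > 1u) {
        /* The way field is left-aligned, so its shift is clz(ways - 1),
           the same value the assembler keeps in w6. */
        op |= way << __builtin_clz(ways - 1u);
    }
    return op;
}
```

For a 4-way cache with 64-byte lines, way 1 of set 1 at level 0 comes out as 0x40000040: the way lands in bits [31:30] and the set starts at bit 6.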

LizardLad_1

Wed Aug 22, 2018 10:20 am

Thank you so much for your help.
LdB wrote: My other concern is you seem to be using the same mailbox array over and over, you don't use it from two different cores at same time do you?
No it is only used by one core. What do you mean by
LdB wrote: Eventually it will hang .. need to resolve the proper cache addr update not this brutal rubbish

LdB

Wed Aug 22, 2018 10:42 am

try

Code: Select all

#define DATA_CACHE_LINE_LENGTH_MIN	64	// min(L1_DATA_CACHE_LINE_LENGTH, L2_CACHE_LINE_LENGTH)
void InvalidateDataCacheRange(uint32_t nAddress, uint32_t nLength)
{
	nLength += DATA_CACHE_LINE_LENGTH_MIN;
	while (1)
	{
		asm volatile ("dc ivac, %0" : : "r" (nAddress) : "memory");
		if (nLength < DATA_CACHE_LINE_LENGTH_MIN)
		{
			break;
		}
		nAddress += DATA_CACHE_LINE_LENGTH_MIN;
		nLength -= DATA_CACHE_LINE_LENGTH_MIN;
	}
}

void set_ACT_LED (bool on)
{
	mailbox[0] = 8*4;
	mailbox[7] = 0;
	mailbox[1] = 0;
	mailbox[2] = 0x00038041;
	mailbox[3] = 8;
	mailbox[4] = 8;
	mailbox[5] = 130;
	mailbox[6] = (uint32_t)on;
	uint32_t addr = (uint32_t)(uintptr_t)&mailbox[0];
	InvalidateDataCacheRange(addr, sizeof(mailbox)/ DATA_CACHE_LINE_LENGTH_MIN);
	mailbox_tag_write(addr);
	mailbox_tag_read();
}
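
Note that the call above passes sizeof(mailbox) divided by the line length, which is a line count, while the loop treats nLength as a byte count. A hedged sketch of working out which cache lines a byte range actually touches is below; CACHE_LINE, line_start and lines_covering are hypothetical names and the code only does the arithmetic, not the cache maintenance itself.

```c
#include <stdint.h>
#include <stddef.h>

#define CACHE_LINE 64u   /* same value as DATA_CACHE_LINE_LENGTH_MIN above */

/* First line-aligned address at or below addr. */
static uintptr_t line_start(uintptr_t addr)
{
    return addr & ~(uintptr_t)(CACHE_LINE - 1u);
}

/* Number of cache lines touched by the byte range [addr, addr + len). */
static size_t lines_covering(uintptr_t addr, size_t len)
{
    if (len == 0) {
        return 0;
    }
    uintptr_t first = line_start(addr);
    uintptr_t last  = line_start(addr + len - 1u);
    return (size_t)((last - first) / CACHE_LINE) + 1u;
}
```

A maintenance loop would then start at line_start(addr) and issue one operation per line for lines_covering(addr, len) iterations, which also handles a buffer that straddles a line boundary.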

LizardLad_1

Wed Aug 22, 2018 10:58 am

Are you able to run it on real hardware? I currently get no output, and yes, I am using your C version. I will try your asm version now.
UPDATE:
I think I found the issue: the printf function is being used on core 0 before the MMU is on. I'm fixing it now.
UPDATE 2:
Yep that was the issue. Thank you LdB for your help.
Last edited by LizardLad_1 on Wed Aug 22, 2018 11:09 am, edited 1 time in total.

LdB

Wed Aug 22, 2018 11:08 am

Got the damn thing .. it just needs a single assembler line .. I will put it up on the repo

Code: Select all

void set_ACT_LED (bool on)
{
	mailbox[0] = 8*4;
	mailbox[7] = 0;
	mailbox[1] = 0;
	mailbox[2] = 0x00038041;
	mailbox[3] = 8;
	mailbox[4] = 0;
	mailbox[5] = 130;
	mailbox[6] = (uint32_t)on;
	uint32_t addr = (uint32_t)(uintptr_t)&mailbox[0];
	// Clean and invalidate the line holding the message; the cast gives the asm a full 64-bit register value
	asm volatile ("dc civac, %0" : : "r" ((uintptr_t)addr) : "memory");
	mailbox_tag_write(addr | 0xc0000000);
	mailbox_tag_read();
}
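
The eight words written above can be read as one property message: total size, request code, tag id, value buffer size, request length field, two values and an end tag. The sketch below fills the same values into a caller-supplied buffer so the layout can be checked on a host machine. build_act_led_msg and the two macro names are hypothetical; the numbers are the ones from the post.

```c
#include <stdint.h>
#include <stdbool.h>

#define TAG_SET_GPIO_STATE 0x00038041u  /* tag value used in the post above */
#define ACT_LED_PIN        130u         /* pin number used in the post above */

/* Fill an 8-word property message that switches the activity LED.
   On real hardware buf should be 16-byte aligned. */
static void build_act_led_msg(uint32_t buf[8], bool on)
{
    buf[0] = 8u * (uint32_t)sizeof(uint32_t); /* total message size in bytes */
    buf[1] = 0;                               /* request code */
    buf[2] = TAG_SET_GPIO_STATE;              /* tag id */
    buf[3] = 8;                               /* value buffer size in bytes */
    buf[4] = 0;                               /* request length field, 0 as in the final version above */
    buf[5] = ACT_LED_PIN;                     /* value 1: pin number */
    buf[6] = on ? 1u : 0u;                    /* value 2: requested state */
    buf[7] = 0;                               /* end tag */
}
```

Building the message in a helper like this keeps the cache maintenance and the mailbox write as the only hardware-specific steps left in set_ACT_LED.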

LdB

Wed Aug 22, 2018 11:43 am

Yes, there are some nasty bugs in your pre-compiled library that I am not going to deal with.
I don't understand why you don't just use the ones that come with GCC, and I am not interested in working through it.

So here is a clean proof that the code works flawlessly .. it runs all the semaphore checks,
then gets core 1 to write to the screen and turn the activity LED on in a loop.

Please confirm the image file works on your hardware.
https://github.com/LdB-ECM/Exchange/tre ... r/MMU_test

LizardLad_1

Wed Aug 22, 2018 11:52 am

LdB you are a legend! Thank you so much for helping me get this working! I will endeavour to learn more about the MMU and the caches. The code in my repo now works on real hardware. What do you mean by my pre-compiled library? I know there are bugs in my console code; there is an infinite loop and I will have to fix it.

LdB

Wed Aug 22, 2018 12:03 pm

Okay I knew there was a bug somewhere in the code, you hit it and all hell broke out :-)

Isn't libopenlibm.a a precompiled library?

On the bright side, the best part of having problems is that you learn a fair bit. Your next challenge is to work out how to get all the cores to work together, either as a cluster or via tasks :-)

LizardLad_1

Wed Aug 22, 2018 6:13 pm

Yeah, libopenlibm.a is a precompiled library; it is a clone of https://github.com/JuliaMath/openlibm. The reason I didn't use GCC builtins is that I couldn't find a precompiled version of libm in the repositories.

It is a great learning experience having an opportunity like this.
