Something very strange. I'm using Circle OS, doing bare metal programming. Very usefully, it produces a stack trace when the whole thing goes wrong. At first the stack trace was a mystery, but it started making a little sense when I looked at the disassembly, even though I'm rubbish at reading disassembly.
I have a routine called process_token() which calls, or should call, embed_literal(). Except, when I examined the disassembly, embed_literal() apparently wasn't called. So I thought, well, hmm, maybe the compiler optimised it out. When I added a puts() after the line that calls embed_literal(), my program didn't crash.
This is extraordinarily odd for 2 reasons:
1. I can't see how puts() could in any way interact with other parts of memory and make faulty logic work
2. I tested the code on a Linux box, compiling to x86 Linux executable, and it worked.
My program does fiddle around with memory a lot, so it's entirely plausible that the thing could crash. Converting from Linux x86 to bare metal ARM obviously opens up potential problems, but like I say, there shouldn't be any glaring errors, because the code worked.
So, I decided to recompile my bare metal code, this time turning off optimisations. Upon disassembly, I saw embed_literal() being called under process_token().
And, when I booted my Pi to this new image, everything worked as expected.
Bizarre!
Comments?
I was using arm-none-eabi-g++ (15:7-2018-q2-6) 7.3.1 20180622 (release) [ARM/embedded-7-branch revision 261907] under Debian Stable.
Re: Optimisation causes crashes (??)
What are your optimisation settings when it does and doesn't crash? There are some bits of Circle which have to be built with -O2, though they seem to be getting fewer and fewer.
Re: Optimisation causes crashes (??)
blippy wrote: ↑Mon Nov 16, 2020 9:50 pm
This is extraordinarily odd for 2 reasons:
1. I can't see how puts() could in any way interact with other parts of memory and make faulty logic work
2. I tested the code on a Linux box, compiling to x86 Linux executable, and it worked.
...
So, I decided to recompile my bare metal code, this time turning off optimisations. Upon disassembly, I saw embed_literal() being called under process_token().
And, when I booted my Pi to this new image, everything worked as expected.
Bizarre!
Comments?
I was using arm-none-eabi-g++ (15:7-2018-q2-6) 7.3.1 20180622 (release) [ARM/embedded-7-branch revision 261907] under Debian Stable.

I appreciate all the clues you provided; that was helpful. Based on the behavior (adding a function call makes it work, as does turning optimization off, etc.) I would suspect an issue with the stack. A local variable that is exceeding its size allocation, such as storing a string in an array that is not large enough? These issues can be difficult to pinpoint, as you typically only see symptoms, not the actual problem. Good luck.
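To make the stack suspicion concrete, here is a minimal sketch of the kind of overrun meant: a token copied into a fixed-size local array. The helper name copy_token() and the buffer size are made up for illustration; nothing here is taken from the actual Forth source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy a token into a fixed-size buffer without ever
// writing past its end. Returns false if the token had to be truncated.
bool copy_token(char *dst, std::size_t dst_size, const char *src)
{
    assert(dst != nullptr && src != nullptr && dst_size > 0);
    const std::size_t len = std::strlen(src);
    const bool fits = len < dst_size;              // room for the terminator?
    const std::size_t n = fits ? len : dst_size - 1;
    std::memcpy(dst, src, n);
    dst[n] = '\0';
    return fits;
}

// The risky pattern, for comparison (do not use):
//   char name[8];
//   std::strcpy(name, token);   // overruns name[] if token has 8+ characters
// Whatever sits next to name[] on the stack then gets overwritten, and what
// sits there can change with the optimisation level or an extra call.
```

If an overrun like this is the cause, it would also explain why adding a puts() changes things: the stack layout shifts, so the stray bytes land somewhere less harmful.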
Re: Optimisation causes crashes (??)
Formerly, I didn't touch any optimisation parameters. I just did a compile using Circle defaults.
Then I changed Rules.mk to add -O0 to the CFLAGS. Scanning through it, I do see that it has an OPTIMIZE variable, though, which might have been a more logical place to put it.
Maybe there's a mixture of optimisation levels between the libraries and my own code now.
Re: Optimisation causes crashes (??)
sean.lawless wrote: ↑Tue Nov 17, 2020 1:56 am
Based on the behavior (adding a function call makes it work, as does turning optimization off, etc.) I would suspect an issue with the stack. A local variable that is exceeding its size allocation, such as storing a string in an array that is not large enough?

Sounds fun.
My program implements my own home-brewed Forth. Consequently, there's plenty of peeking and poking of memory, with ample scope for things to go horribly wrong. Counter to that, though, the code does also compile to a native x86 executable with address sanitisation enabled. There's also a small test suite. So the code should have had a pretty reasonable shake-down.
It's still possible to crash the Forth - it's ALWAYS possible to crash Forth - but a lot of the lurking bugs I would expect to have been exposed by now.
It's probably going to take some kind of minor miracle to sort it out, by the looks. I guess the good news is that it now works.
One thing I didn't mention is that I switched the compiler to -std=c++17 from C++14. I do use an inline variable in a header file. I don't see that it should be a problem though, as it's part of the C++17 (not C++14) standard.
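For reference, this is roughly what a C++17 inline variable in a header looks like. The header name, kDictSize, g_here and allot() are hypothetical stand-ins, not the real code.

```cpp
// forth_globals.h: hypothetical header, included from several .cpp files
#ifndef FORTH_GLOBALS_H
#define FORTH_GLOBALS_H

#include <cassert>
#include <cstddef>

constexpr std::size_t kDictSize = 1024;   // assumed dictionary size in cells

// C++17 inline variable: defined in the header, yet every translation unit
// that includes it shares the same single object. Because the initialiser
// is a constant, no run-time constructor call is needed to set it up.
inline std::size_t g_here = 0;

// Reserve n cells and return the index of the first one.
inline std::size_t allot(std::size_t n)
{
    assert(n <= kDictSize - g_here);
    const std::size_t start = g_here;
    g_here += n;
    return start;
}

#endif
```

An inline variable with a non-constant initialiser would be a different matter, since it depends on static initialisation code running at start-up.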
Re: Optimisation causes crashes (??)
A common approach is to build the Circle libraries with -O2, which is the default, and the application code with a different setting (e.g. -O3). You only have to add the following to the application's Makefile:
...
OPTIMIZE = -O3
include $(CIRCLEHOME)/Rules.mk
...

Re: Optimisation causes crashes (??)
rst wrote: ↑Tue Nov 17, 2020 2:48 pm
You only have to add the following to the application's Makefile:
...
OPTIMIZE = -O3
include $(CIRCLEHOME)/Rules.mk
...

Ah, OK. Good tip. Thanks for that.
Re: Optimisation causes crashes (??)
It's a good sign you have things working in one configuration. Persistence is helpful here. Try doing things differently: for example, compile with -O1 and see if the behavior changes (a clue). Or comment out large (or small) chunks of code... does the failure still occur? In the same place? Use GDB (-ggdb flag with -O2) and step through the problem area, checking the stack backtrace (bt). At what point does the stack show corruption? You can often see a small problem in the stack with GDB before the whole thing blows up. The key is to make debugging fun and enjoyable.
Re: Optimisation causes crashes (??)
You definitely have UB (undefined behaviour) somewhere. It is very typical for code with UB to work without optimisation but break at higher optimisation levels.
blippy wrote:
Except, when I examined the disassembly, embed_literal() wasn't apparently called.

Could have been inlined.

blippy wrote:
I tested the code on a Linux box, compiling to x86 Linux executable, and it worked.

If you can do that, then compile with the "-g" flag and run your code through valgrind (should be a standard package on your distro). I'm sure your code is full of memory leaks and out-of-bounds indexing. Fix those until valgrind reports no errors.
There are some other minor differences between x86 and ARM that could cause UB, though not many (divide by zero is one). Try compiling your code and running it under valgrind on RaspiOS as well (not on bare metal, but under Linux). If valgrind reports errors on ARM, fix those too.
Finally try to use it with Circle. Following these three steps should solve your issue.
Cheers,
bzt
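As an illustration of UB that x86 tends to forgive and ARM may not, here is a minimal sketch of cell access for a Forth-style memory array. The names cell_t, fetch_cell() and store_cell() are assumptions for the example, not the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

using cell_t = std::int32_t;   // assumed Forth cell type

// Read a cell from any byte offset. memcpy is well defined for unaligned
// addresses, and the compiler is free to turn it into a plain load where
// that is safe on the target.
inline cell_t fetch_cell(const std::uint8_t *mem, std::size_t size, std::size_t addr)
{
    assert(addr <= size && size - addr >= sizeof(cell_t));   // bounds check
    cell_t value;
    std::memcpy(&value, mem + addr, sizeof value);
    return value;
}

// Write a cell to any byte offset, with the same bounds check.
inline void store_cell(std::uint8_t *mem, std::size_t size, std::size_t addr, cell_t value)
{
    assert(addr <= size && size - addr >= sizeof(cell_t));
    std::memcpy(mem + addr, &value, sizeof value);
}

// The pattern to avoid:
//   cell_t v = *reinterpret_cast<const cell_t *>(mem + addr);
// This is undefined behaviour when addr is not suitably aligned for cell_t
// (and it also breaks the aliasing rules), so the optimiser may assume it
// never happens and generate code that only fails at -O2.
```

Whether this is the actual bug is a guess; it is simply one of the usual suspects in code that peeks and pokes memory.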
Re: Optimisation causes crashes (??)
Can I have a guess at the issue: you aren't using the type uint32_t for the hardware access, or assembler-crafted code.
The half word access is technically faster, and if you use int or unsigned int to access the hardware, the optimizer is within its rights to replace a 32-bit write with two half-word writes because it's faster. The two half-word writes are still legally aligned, because they are half-word writes on a half-word alignment, and the optimizer doesn't realize it will cause a problem. In memory, what the optimizer does is not an issue, but on the peripheral AXI bus, which must be accessed 32 bits at a time, it becomes a problem. Depending on how things pan out, the optimizer will use WORD or HALF WORD access based on what it sees, and your problem becomes random, coming and going at the whim of the optimizer as you change code.
There are only two ways around the problem: manually craft a PUT32 and GET32 like David Welch does in all his code, or use uint32_t, which has tighter alignment control than the basic types.
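A C++ analogue of PUT32/GET32 might look like the sketch below. David Welch's versions are written in assembler; put32() and get32() here are illustrative names, and the important ingredient is the volatile qualifier, which tells the compiler not to drop or merge the access. That GCC emits a single 32-bit load or store for an aligned volatile uint32_t is an assumption about the toolchain, not something the C++ standard promises.

```cpp
#include <cassert>
#include <cstdint>

// Write exactly one 32-bit word to a (peripheral) address.
inline void put32(std::uintptr_t addr, std::uint32_t value)
{
    assert((addr & 3u) == 0);   // must be word aligned
    *reinterpret_cast<volatile std::uint32_t *>(addr) = value;
}

// Read exactly one 32-bit word from a (peripheral) address.
inline std::uint32_t get32(std::uintptr_t addr)
{
    assert((addr & 3u) == 0);   // must be word aligned
    return *reinterpret_cast<volatile std::uint32_t *>(addr);
}
```

Routing every register access through two small functions like these also makes it easy to check the disassembly in one place.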
Re: Optimisation causes crashes (??)
Great advice from both LdB and bzt.
Re: Optimisation causes crashes (??)
Thanks guys. In the end I decided to implement things a different way.
So I've got a "mostly working" version of Forth on Circle, which is pretty cool. It fits in with the idea of a retro computing environment where you have a control language.
Not sure what the best choice is, though: it's a bit of a tossup between Basic, Jim (a version of Tcl), and Lua. They're all languages with which I am unfamiliar, except Basic.
Re: Optimisation causes crashes (??)
Taking a different tack, I was able to incorporate bzt's SD card work in with a fork of Sean's tutorials. I'm pretty excited about this because I now have a bare metal system that has a screen, keyboard, and storage.
Admittedly there's no filesystem but meh, Charles Moore doesn't need a filesystem, so it's not a hard requirement. I was thinking of, maybe, writing a very very simple, very naive one.
It's difficult to ascertain how mature the FAT32 filesystem libraries out there are. SD card interfacing is another one of those complex areas, and I'm not sure if a lot of code fails because of faulty interfacing with the cards or because of faulty filesystem drivers.
My results with using SD cards with MCUs have been unreliable, to say the least.