turboscrew
Posts: 174
Joined: Sat Jan 18, 2014 1:50 pm
Location: Nokia (town), Finland

Weird behaviour of the "official" gcc

Wed Nov 25, 2015 7:28 pm

I came across some weird behaviour of the "official" gcc:
arm-linux-gnueabihf-gcc (crosstool-NG linaro-1.13.1-4.8-2014.01 - Linaro GCC 2013.11) 4.8.3 20140106 (prerelease)
It doesn't detect the same errors at different optimization levels, and the section placement changes too.
With -O0 I get errors about:

Code: Select all

inline instr_next_addr_t set_addr_lin(void)
{
	instr_next_addr_t retval = {
			.flag = INSTR_ADDR_ARM,
			.address = 0xffffffff
	};
	return retval;
}
from all places that it's called:
undefined reference to `set_addr_lin'
With -O1 it compiles fine - and even works.
(Missing 'static' in front of 'inline'. The code is in a .c-file.)

Also, my program didn't start if optimization was below O2:
It copies code from one place to another, and with -O2, gcc places 'main' in its own section:

Code: Select all

*(.text.startup)
 .text.startup  0x000083d0      0x3fc ./loader.o
                0x000083d0                main
*loader.o(.text)
 .text          0x1f000000      0x360 ./loader.o
                0x1f000000                loader_main
*rpi2.o(.text)

but with optimization level below 2 it doesn't.

Code: Select all

*loader.o(.text)
 .text          0x1f000000      0x71c ./loader.o
                0x1f000000                loader_main
                0x1f000334                main
That change makes the code copying fail.

Are those known issues or should I report them somewhere?
De-bugging is for sissies - real men do de-monstrations.

Paeryn
Posts: 2677
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: Weird behaviour of the "official" gcc

Wed Nov 25, 2015 10:05 pm

When you use -O2 and above you enable function reordering, I think this is when it starts using subsections. The .text.startup section is used for static constructors and main() so the linker can put all the stuff that only gets called once at startup together. You'll need to modify your linker script so that all subsections of .text get put together.

There's also .text.exit for destructors, .text.hot for functions called often and .text.unlikely for functions called rarely (the last two are determined when using profiling).
She who travels light — forgot something.

turboscrew
Posts: 174
Joined: Sat Jan 18, 2014 1:50 pm
Location: Nokia (town), Finland

Re: Weird behaviour of the "official" gcc

Thu Nov 26, 2015 12:33 am

It's bad for code-copy if the code to copy moves around depending on the optimization. :?

I guess that the other issue (non-static inline) doesn't have such an explanation?
With -O0 the compiler gives an error, with -O1 it compiles and runs fine.

Paeryn
Posts: 2677
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: Weird behaviour of the "official" gcc

Thu Nov 26, 2015 9:33 am

You could try -fno-reorder-functions to see if that stops it changing sections, or use

Code: Select all

int main(int argc, char *argv[]) __attribute__ ((section (".text")));
to force main into .text (edit: this doesn't work)

Not sure about the inline stuff, I thought inline was practically ignored under -O0, will look into it tonight.
Last edited by Paeryn on Thu Nov 26, 2015 5:29 pm, edited 1 time in total.

Paeryn
Posts: 2677
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: Weird behaviour of the "official" gcc

Thu Nov 26, 2015 5:28 pm

Update:
Forcing the section name with __attribute__ doesn't seem to work on main().
Passing -fno-reorder-functions (I forgot the hyphen after no in my previous post) when using -O2 did stop it putting main() into .text.startup but you'll still have problems with constructors, destructors and inline functions getting forced into subsections.

As for the function going missing when using inline and -O0, the function won't have been inlined, but all inlined functions get put into their own subsections, e.g.

Code: Select all

inline int var(int x)
{
  return x + 2;
}
gets put into a section named .text._Z3vari (_Z3vari is the C++ mangled function name), therefore if you didn't include all the .text subsections in your linker script then the function won't have been included.

The default linker script collects all the .text subsections together so you'll need something along similar lines :-

Code: Select all

  .text           :
  {
    *(.text.unlikely .text.*_unlikely .text.unlikely.*)
    *(.text.exit .text.exit.*)
    *(.text.startup .text.startup.*)
    *(.text.hot .text.hot.*)
    *(.text .stub .text.* .gnu.linkonce.t.*)
    /* .gnu.warning sections are handled specially by elf32.em.  */
    *(.gnu.warning)
  }
Update (again), gcc shouldn't put the inline functions in their own sections, it's g++ that does that. They both put main() into .text.startup under -O2 though.

turboscrew
Posts: 174
Joined: Sat Jan 18, 2014 1:50 pm
Location: Nokia (town), Finland

Re: Weird behaviour of the "official" gcc

Fri Nov 27, 2015 12:11 am

Well, I'm using C (and asm) not C++, so I don't care about constructors or destructors.

What I wondered about the inline is that with -O0 it gave an error (and didn't produce a .elf), but with -O2 it didn't give any errors or warnings, and compiled and ran well.

My code placement looks like this:

Code: Select all

    .text :
    {
        *(.init)
        *start1.o(.text)
        *start1.o(.data)
        *start1.o(.bss)
        *(.text.startup)
    } >LOAD
    
    /* .text2 ALIGN(0x1000):  - the ">EXEC AT>LOAD" didn't like "ALIGN(0x1000)" */
    .text2 :
    {
    	. = ALIGN(.,0x8);
        *loader.o(.text)
        *rpi2.o(.text)
        *serial.o(.text)
        *util.o(EXCLUDE_FILE(*instr_util.o).text)
        *gdb.o(.text)
        *(.text)
    } >EXEC AT>LOAD
    __text_end = .;
    __load_start = LOADADDR(.text2);
    __code_begin = ADDR(.text2);
... (.data & other stuff)

    __hivec_load = LOADADDR(.hivec);
    __load_end = LOADADDR(.hivec) + SIZEOF(.hivec);
and I copy the code from __load_start .. __load_end to __code_begin.
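
The copy itself amounts to a word loop over that range. A minimal C sketch of the idea, assuming the linker symbols are word-aligned; `copy_words` is a hypothetical helper for illustration, not the routine actually used in the project:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy 32-bit words from [src_start, src_end) to dst.
 * Returns the number of words copied.
 *
 * With the linker script above, the (hypothetical) call would look like:
 *   extern uint32_t __load_start[], __load_end[], __code_begin[];
 *   copy_words(__code_begin, __load_start, __load_end);
 */
size_t copy_words(uint32_t *dst, const uint32_t *src_start,
                  const uint32_t *src_end)
{
    size_t count = 0;

    while (src_start < src_end) {
        *dst++ = *src_start++;
        count++;
    }
    return count;
}
```

Anything the linker places outside the __load_start .. __load_end range (or inside it at an unexpected offset) changes what this loop moves, which is why the section a function lands in matters here.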

I started with -O2 and expected the 'main' to be in .text.startup, but I guess I'd better re-organize the stuff so that the placement is roughly the same with or without 'main' being in the .text.startup

I think it's enough to swap the order of 'main' and another function in the .c file, and put .text.startup in the .text2 as well. Because 'main' only sets up things, I thought it would have been a good idea to discard it with the other initialization code.

I still wonder why the program didn't boot with 'main' going into .text2 when it worked fine with 'main' in .text.startup (and put in .text).

Paeryn
Posts: 2677
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: Weird behaviour of the "official" gcc

Fri Nov 27, 2015 1:11 am

Sorry, I thought you didn't want main() going into .text.startup

The order of the functions in the C file shouldn't matter as the linker should be taking care of all the address translations. The only time it should matter is when you are using the resultant object file as the first item in the link.

Where are you doing the code move? If it's in the .init or start1.o part then the linker should have set the correct addresses for calling main() regardless of whether it's in the initial .text or the relocated .text2 section.

Edit:
If you want main() to always be put into .text.startup even when using -O0 you can use the function attribute to do that (it's preventing it that doesn't work, i.e. gcc seems to force the section name when optimizing). You need to declare main with it (written after the parameter list like this, the attribute is only accepted on a declaration, not on a definition), with something like :-

Code: Select all

int main(int argc, char *argv[]) __attribute__ ((section (".text.startup")));
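
As a rough illustration of the two places the attribute can go, here is a sketch using hypothetical functions `early_setup` and `early_setup2` (neither is from the project). It assumes GCC on an ELF target; other toolchains or object formats may want a different section name syntax:

```c
/* Trailing attribute: only accepted on a declaration. */
int early_setup(int x) __attribute__((section(".text.startup")));

/* The definition then picks up the section from the declaration. */
int early_setup(int x)
{
    return x + 1;
}

/* GCC also accepts the attribute in front of a definition. */
__attribute__((section(".text.startup"))) int early_setup2(int x)
{
    return x + 2;
}
```

The linker map file should then show both functions under .text.startup regardless of the optimization level.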

turboscrew
Posts: 174
Joined: Sat Jan 18, 2014 1:50 pm
Location: Nokia (town), Finland

Re: Weird behaviour of the "official" gcc

Fri Nov 27, 2015 7:09 am

The code copy happens in .init - right after dealing with the hyp-mode.

Could it be because the MMU and caches are set up in the .text2 (high-address) section in the function 'loader_main' that is called by 'main'? The compiler puts 'loader_main' before 'main'.
The whole code is here: https://github.com/turboscrew/rpi_stub

Paeryn
Posts: 2677
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: Weird behaviour of the "official" gcc

Fri Nov 27, 2015 4:32 pm

I can't see that it should make a difference where you set the MMU as long as you don't have pre-MMU addresses live (but I think you're keeping the addresses the same anyway so that shouldn't affect it).

The only other thing I can think of off the top of my head is if it's moved a function too far for a normal branch, but I'm sure the linker is supposed to add extra instructions to compensate, maybe it is and that is messing up your code move? At what point does it fail, in main, loader_main or before/after either?

I'm a bit limited on internet access and no PC/RPi until Monday.

turboscrew
Posts: 174
Joined: Sat Jan 18, 2014 1:50 pm
Location: Nokia (town), Finland

Re: Weird behaviour of the "official" gcc

Fri Nov 27, 2015 11:32 pm

The long jump from 0x8000-area to 0x1f000000-area seems to end up in reset.
Well maybe I'll try to figure it out later. That's not critical, because with -O2 it works fine and
I can probably make it work with all optimization levels with not too much effort.
I'm curious - I must admit - about the cause though...

The rpi_stub doesn't even need to be compiled - there is a ready-made binary in the repo.

Paeryn
Posts: 2677
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: Weird behaviour of the "official" gcc

Sat Nov 28, 2015 12:24 am

I'd check the disassembly around that call to main() from start1_fun() in your final file (i.e. post-linker). When main is in the high address range then the normal relative branch that the compiler emitted will have to be extended by the linker to accommodate the absolute branch required. This is probably done by the linker creating a short helper function that the branch jumps to, it may be that somewhere this is going wrong (not sure why as it has to do it from main to loader_main when main is in the low address and you say it works then).

If you've not sorted it by Monday I'll have a more detailed look at the code when I get home, it's a bit difficult to do it on my phone.

turboscrew
Posts: 174
Joined: Sat Jan 18, 2014 1:50 pm
Location: Nokia (town), Finland

Re: Weird behaviour of the "official" gcc

Sun Nov 29, 2015 12:59 pm

Putting 'main' in the .text.startup didn't help. It still crashes if -O0 is used. With -O2 it works fine.
Another weird thing is that 'main' calls some other functions (like LED blinking) in the 0x1f000000 area before trying to enter loader_main, and those calls seem to work. At least the LED blinks with the pattern indicating that 'main' is entered, and the LED blinking routines are far away.

Maybe I should check between the other function calls too to see if the program actually crashes before attempting to enter 'loader_main'.

So at least 'rpi2_led_blink' and 'rpi2_delay_loop' seem to work.

Here's the assembly (-O0):

Code: Select all

000083fc <main>:
    83fc:	e92d4810 	push	{r4, fp, lr}
    8400:	e28db008 	add	fp, sp, #8
    8404:	e24dd01c 	sub	sp, sp, #28
    8408:	e50b0018 	str	r0, [fp, #-24]
    840c:	e50b101c 	str	r1, [fp, #-28]
    8410:	e50b2020 	str	r2, [fp, #-32]
    8414:	e3a03000 	mov	r3, #0
    8418:	e50b3010 	str	r3, [fp, #-16]
    841c:	ea00000e 	b	845c <main+0x60>
    8420:	e3a00f7d 	mov	r0, #500	; 0x1f4
    8424:	e3a01f7d 	mov	r1, #500	; 0x1f4
    8428:	e3a02003 	mov	r2, #3
    842c:	eb0001ab 	bl	8ae0 <__rpi2_led_blink_veneer>
    8430:	e3a00ffa 	mov	r0, #1000	; 0x3e8
    8434:	eb0001a3 	bl	8ac8 <__rpi2_delay_loop_veneer>
    8438:	e3a00ffa 	mov	r0, #1000	; 0x3e8
    843c:	e3a01ffa 	mov	r1, #1000	; 0x3e8
    8440:	e3a02003 	mov	r2, #3
    8444:	eb0001a5 	bl	8ae0 <__rpi2_led_blink_veneer>
    8448:	e3010388 	movw	r0, #5000	; 0x1388
    844c:	eb00019d 	bl	8ac8 <__rpi2_delay_loop_veneer>
    8450:	e51b3010 	ldr	r3, [fp, #-16]
    8454:	e2833001 	add	r3, r3, #1
    8458:	e50b3010 	str	r3, [fp, #-16]
    845c:	e51b3010 	ldr	r3, [fp, #-16]
    8460:	e3530004 	cmp	r3, #4
    8464:	daffffed 	ble	8420 <main+0x24>
    8468:	e30c306c 	movw	r3, #49260	; 0xc06c
    846c:	e3413f02 	movt	r3, #7938	; 0x1f02
    8470:	e3a02000 	mov	r2, #0
    8474:	e5832000 	str	r2, [r3]
    8478:	e30c3118 	movw	r3, #49432	; 0xc118
    847c:	e3413f02 	movt	r3, #7938	; 0x1f02
    8480:	e3a02000 	mov	r2, #0
    8484:	e5832000 	str	r2, [r3]
    8488:	e30c3068 	movw	r3, #49256	; 0xc068
    848c:	e3413f02 	movt	r3, #7938	; 0x1f02
    8490:	e3a02000 	mov	r2, #0
    8494:	e5832000 	str	r2, [r3]
    8498:	e30c3070 	movw	r3, #49264	; 0xc070
    849c:	e3413f02 	movt	r3, #7938	; 0x1f02
    84a0:	e3a02cc2 	mov	r2, #49664	; 0xc200
    84a4:	e3402001 	movt	r2, #1
    84a8:	e5832000 	str	r2, [r3]
    84ac:	e30436cc 	movw	r3, #18124	; 0x46cc
    84b0:	e3413f03 	movt	r3, #7939	; 0x1f03
    84b4:	e3a02001 	mov	r2, #1
    84b8:	e5832000 	str	r2, [r3]
    84bc:	e30436e0 	movw	r3, #18144	; 0x46e0
    84c0:	e3413f03 	movt	r3, #7939	; 0x1f03
    84c4:	e3a02000 	mov	r2, #0
    84c8:	e5832000 	str	r2, [r3]
    84cc:	e30c3110 	movw	r3, #49424	; 0xc110
    84d0:	e3413f02 	movt	r3, #7938	; 0x1f02
    84d4:	e3a02000 	mov	r2, #0
    84d8:	e5832000 	str	r2, [r3]
    84dc:	e3043680 	movw	r3, #18048	; 0x4680
    84e0:	e3413f03 	movt	r3, #7939	; 0x1f03
    84e4:	e3a02000 	mov	r2, #0
    84e8:	e5832000 	str	r2, [r3]
    84ec:	e30803c0 	movw	r0, #33728	; 0x83c0
    84f0:	e3410f02 	movt	r0, #7938	; 0x1f02
    84f4:	eb000177 	bl	8ad8 <__rpi2_get_cmdline_veneer>
    84f8:	e3a03000 	mov	r3, #0
    84fc:	e50b3010 	str	r3, [fp, #-16]
    8500:	ea00016a 	b	8ab0 <main+0x6b4>
    8504:	e30833c0 	movw	r3, #33728	; 0x83c0
    8508:	e3413f02 	movt	r3, #7938	; 0x1f02
    850c:	e51b2010 	ldr	r2, [fp, #-16]
    8510:	e0833002 	add	r3, r3, r2
    8514:	e5d33000 	ldrb	r3, [r3]
    8518:	e3530072 	cmp	r3, #114	; 0x72
    851c:	1a000160 	bne	8aa4 <main+0x6a8>
    8520:	e51b2010 	ldr	r2, [fp, #-16]
    8524:	e30833c0 	movw	r3, #33728	; 0x83c0
    8528:	e3413f02 	movt	r3, #7938	; 0x1f02
    852c:	e0823003 	add	r3, r2, r3
    8530:	e30a0d7c 	movw	r0, #44412	; 0xad7c
    8534:	e3410f01 	movt	r0, #7937	; 0x1f01
    8538:	e1a01003 	mov	r1, r3
    853c:	eb000169 	bl	8ae8 <__util_cmp_substr_veneer>
    8540:	e1a04000 	mov	r4, r0
    8544:	e30a0d7c 	movw	r0, #44412	; 0xad7c
    8548:	e3410f01 	movt	r0, #7937	; 0x1f01
    854c:	eb000169 	bl	8af8 <__util_str_len_veneer>
    8550:	e1a03000 	mov	r3, r0
    8554:	e1540003 	cmp	r4, r3

turboscrew
Posts: 174
Joined: Sat Jan 18, 2014 1:50 pm
Location: Nokia (town), Finland

Re: Weird behaviour of the "official" gcc

Sun Nov 29, 2015 1:13 pm

I compared the asm generated with -O0 and -O2 (the veneers) and they looked similar - of course addresses were changed, but looked believable, so I think the problem must be in some of those other calls... I'll keep digging...
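
For anyone checking branches like these by hand: an ARM-state B/BL holds a signed 24-bit word offset relative to the instruction address plus 8, so it reaches roughly +/-32 MB. That is why calls from the 0x8000 area to the 0x1f000000 area have to go through veneers. A small sketch of the arithmetic (the helper names are made up for illustration; the encodings are taken from the -O0 listing in the previous post):

```c
#include <stdint.h>

/* Decode the target of an ARM-state B/BL instruction located at addr. */
uint32_t arm_branch_target(uint32_t addr, uint32_t insn)
{
    uint32_t imm24 = insn & 0x00ffffffu;
    int32_t offset = (int32_t)(imm24 << 2);      /* word offset -> bytes */

    if (imm24 & 0x00800000u)
        offset -= (int32_t)0x04000000;           /* sign-extend 26 bits */

    return addr + 8u + (uint32_t)offset;         /* PC reads as addr + 8 */
}

/* Non-zero if a direct B/BL at 'from' can reach 'to' without a veneer. */
int arm_branch_in_range(uint32_t from, uint32_t to)
{
    int64_t delta = (int64_t)to - ((int64_t)from + 8);

    return delta >= -33554432LL && delta <= 33554428LL;
}
```

For example, `arm_branch_target(0x842c, 0xeb0001ab)` gives 0x8ae0, the `__rpi2_led_blink_veneer` address in the listing, while `arm_branch_in_range(0x842c, 0x1f000000)` is 0.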

Yes, it seems to crash in the first string function. Funny that it does survive the command line fetching and continues on. The command line fetching is also far.

Code: Select all

void main(uint32_t r0, uint32_t r1, uint32_t r2)
{
	/* device tree parameters - for future use? */
	(void) r0;
	(void) r1;
	(void) r2;
	
	int i;
	int tmp;
		
#if 1
	for(i=0; i<3; i++) // This was seen
	{
		rpi2_led_blink(500, 500, 3);
		rpi2_delay_loop(1000);
		rpi2_led_blink(1000, 1000, 3);
		rpi2_delay_loop(5000);
	}
#endif
	rpi2_uart0_excmode = RPI2_UART0_POLL; // default
	rpi2_use_mmu = 0; // default - no mmu
	rpi2_keep_ctrlc = 0; // no forced ctrl-c enabling
	rpi2_uart0_baud = 115200;
	rpi2_use_hw_debug = 1;
	rpi2_print_dbg_info = 0;
	rpi2_neon_used = 0;
	rpi2_neon_enable = 0;
	
	rpi2_get_cmdline(cmdline); // 1f00323c T rpi2_get_cmdline
#if 1
	for(i=0; i<3; i++) // This was seen
	{
		rpi2_led_blink(250, 250, 3);
		rpi2_delay_loop(1000);
		rpi2_led_blink(500, 500, 2);
		rpi2_delay_loop(3000);
	}
#endif
	
	for (i=0; i< 1024; i++)
	{
		if (cmdline[i] == '\0') break;
		if (cmdline[i] == 'r')
		{
// This probably crashes it: 1f005c34 T util_cmp_substr, 1f005b74 T util_str_len
			if (util_cmp_substr("rpi_stub_", cmdline + i) >= util_str_len("rpi_stub_")) 
			{
#if 1
		rpi2_led_blink(250, 250, 3); // this was never seen
		rpi2_delay_loop(1000);
		rpi2_led_blink(100, 500, 3);
		rpi2_delay_loop(1000);
#endif
				i += util_str_len("rpi_stub_");
				if (util_cmp_substr("mmu", cmdline + i) >= util_str_len("mmu"))
				{
					i += util_str_len("mmu");
					// set flag for mmu setup
					// rpi_stub_mmu
					rpi2_use_mmu = 1;
				}
				else if (util_cmp_substr("interrupt=", cmdline + i) >= util_str_len("interrupt="))
				{
#if 1
	for(i=0; i<3; i++)
	{
		rpi2_led_blink(500, 500, 4);
		rpi2_delay_loop(3000);
	}
#endif
					i += util_str_len("interrupt=");
					if (util_cmp_substr("irq", cmdline + i) >= util_str_len("irq"))
					{
						i += util_str_len("irq");
						// set flag for UART0 using IRQ
						rpi2_uart0_excmode = RPI2_UART0_IRQ;
					}
					else if (util_cmp_substr("fiq", cmdline + i) >= util_str_len("fiq"))
					{
						i += util_str_len("fiq");
						// set flag for UART0 using FIQ
						rpi2_uart0_excmode = RPI2_UART0_FIQ;
					}
					else if (util_cmp_substr("poll", cmdline + i) >= util_str_len("poll"))
					{
						i += util_str_len("poll");
						// set flag for UART0 using poll
						// rpi_stub_interrupt=poll
						rpi2_uart0_excmode = RPI2_UART0_POLL;
					}
				}
				else if (util_cmp_substr("keep_ctrlc", cmdline + i) >= util_str_len("keep_ctrlc"))
				{
#if 1
	for(i=0; i<3; i++)
	{
		rpi2_led_blink(500, 500, 5);
		rpi2_delay_loop(3000);
	}
#endif
					// rpi_stub_keep_ctrlc
					i += util_str_len("keep_ctrlc");
					rpi2_keep_ctrlc = 1;
				}
				else if (util_cmp_substr("baud=", cmdline + i) >= util_str_len("baud="))
				{
#if 1
	for(i=0; i<3; i++)
	{
		rpi2_led_blink(500, 500, 6);
		rpi2_delay_loop(3000);
	}
#endif
					// rpi_stub_baud=115200
					i += util_str_len("baud=");
					// get baud for serial
					i += util_read_dec(cmdline + i, &tmp);
					rpi2_uart0_baud = tmp;
				}
				else if (util_cmp_substr("hw_dbg=", cmdline + i) >= util_str_len("hw_dbg="))
				{
#if 1
	for(i=0; i<3; i++)
	{
		rpi2_led_blink(500, 500, 7);
		rpi2_delay_loop(3000);
	}
#endif
					// rpi_stub_hw_dbg=1
					i += util_str_len("hw_dbg=");
					// get the hw_dbg value (0 or 1)
					i += util_read_dec(cmdline + i, &tmp);
					if (tmp == 1) rpi2_use_hw_debug = 1;
					else if (tmp == 0) rpi2_use_hw_debug = 0;
					// else ignore
				}
				else if (util_cmp_substr("dbg_info", cmdline + i) >= util_str_len("dbg_info"))
				{
					// rpi_stub_dbg_info
					i += util_str_len("dbg_info");
					rpi2_print_dbg_info = 1;
					// else ignore
				}
				else if (util_cmp_substr("use_neon", cmdline + i) >= util_str_len("use_neon"))
				{
					// rpi_stub_use_neon
					i += util_str_len("use_neon");
					rpi2_neon_used = 1;
					// else ignore
				}
				else if (util_cmp_substr("enable_neon", cmdline + i) >= util_str_len("enable_neon"))
				{
					// rpi_stub_enable_neon
					i += util_str_len("enable_neon");
					rpi2_neon_enable = 1;
					rpi2_neon_used = 1;
					// else ignore
				}
				
			}
		}
	}
	loader_main();
}

Paeryn
Posts: 2677
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: Weird behaviour of the "official" gcc

Mon Nov 30, 2015 6:19 pm

I've looked through the generated code but I can't see anything jumping out at me, the only thought is if the stack is too small and clobbering something (without optimizations gcc stuffs loads of values onto the stack at each function). Unable to test as I don't have an RPi2 (must treat myself for xmas).

turboscrew
Posts: 174
Joined: Sat Jan 18, 2014 1:50 pm
Location: Nokia (town), Finland

Re: Weird behaviour of the "official" gcc

Mon Nov 30, 2015 11:17 pm

That came to my mind too - as I didn't find anything suspicious from the assembly.
I already doubled the stack but haven't tried it out yet. I wanted to get the Neon support
finished first. I'll continue digging...

turboscrew
Posts: 174
Joined: Sat Jan 18, 2014 1:50 pm
Location: Nokia (town), Finland

Re: Weird behaviour of the "official" gcc

Tue Dec 01, 2015 10:37 am

It looks like the problem is that when compiled with '-mfpu=neon-vfpv4' the compiler uses Neon for long longs. And the Neon wasn't enabled yet.

Code: Select all

382:../util.c     **** 	tmp2 = 0LL;
 1582              		.loc 1 382 0
 1583 0e18 1000C0F2 		vmov.i32	d16, #0  @ di
 1584 0e1c 050B4BED 		fstd	d16, [fp, #-20]	@ int
 383:../util.c     **** 	while (*str != '\0')
 1585              		.loc 1 383 0
 1586 0e20 2E0000EA 		b	.L98
with '-O2' it doesn't use Neon.

I think I can manage without long longs here.

Paeryn
Posts: 2677
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: Weird behaviour of the "official" gcc

Tue Dec 01, 2015 10:03 pm

turboscrew wrote:It looks like the problem is that when compiled with '-mfpu=neon-vfpv4' the compiler uses Neon for long longs. And the Neon wasn't enabled yet.

Ahh gcc's wonderful NEON handling, the only thing it uses it for in that function is to load the initial 0 value onto the stack. If you wanted a compiler workaround then you could compile util.c with -mfpu=vfpv4, that doesn't allow the compiler to emit NEON code but would keep floating-point compatibility of the object files.

turboscrew
Posts: 174
Joined: Sat Jan 18, 2014 1:50 pm
Location: Nokia (town), Finland

Re: Weird behaviour of the "official" gcc

Wed Dec 02, 2015 2:51 pm

Well, I don't really need all the 32 bits for the baud rate as a decimal number. Now I just can't use the highest bit, but it's really not a problem.
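
Roughly what a 32-bit-only decimal reader looks like, as a sketch: `read_dec_u32` is a hypothetical stand-in, not the actual util_read_dec from the repo, and it assumes overflow simply wraps. Whether a given gcc version still emits NEON instructions for some other construct depends on the version and flags, so the generated assembly is worth checking.

```c
#include <stdint.h>

/*
 * Parse an unsigned decimal number from str using only 32-bit arithmetic.
 * Stores the value in *result and returns the number of characters consumed.
 */
int read_dec_u32(const char *str, uint32_t *result)
{
    uint32_t value = 0;
    int consumed = 0;

    while (str[consumed] >= '0' && str[consumed] <= '9') {
        value = value * 10u + (uint32_t)(str[consumed] - '0');
        consumed++;
    }
    *result = value;
    return consumed;
}
```

Called on "115200 rest" it stores 115200 and returns 6, so the caller can advance its index the same way the command line parser above does.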

There were some other stupid things as well, like inline assembly that was sensitive to LR getting messed up.
Changed them to naked and things started working.

Thanks for support.

turboscrew
Posts: 174
Joined: Sat Jan 18, 2014 1:50 pm
Location: Nokia (town), Finland

Re: Weird behaviour of the "official" gcc

Fri Dec 04, 2015 8:12 am

The difference of error checking with different optimization levels is still odd.
Should that be reported somewhere?

JacobL
Posts: 76
Joined: Sun Apr 15, 2012 2:23 pm

Re: Weird behaviour of the "official" gcc

Fri Dec 04, 2015 8:31 pm

turboscrew wrote:The difference of error checking with different optimization levels is still odd.
Should that be reported somewhere?
Actually, this is pretty standard behaviour for most compilers. Many warnings/errors are generated as side effects of the analysis done for optimisations. And it runs more analysis at higher optimisation levels. If you want the code checked thoroughly no matter the optimisation level, you probably want to run some kind of lint on it. Or use the Clang analyser.

Paeryn
Posts: 2677
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: Weird behaviour of the "official" gcc

Sat Dec 05, 2015 12:34 am

Just looked a bit more into the inline thing. You specified -std=gnu11 which brings in C99 rules for inline. According to that, if you specify a function with inline without specifying static or extern then the compiler assumes that there will be a definition available in an external file that it can call when it can't inline the provided inline version and so doesn't generate code for the function (since you can't have two copies of the actual function). When you specify -O0 then no inlining will happen so the compiler has just generated a function call, but since no other file provides the non-inlined version of the function you ended up with the undefined function problem. When you use -O2 it has managed to inline the code, no function calls needed and so no link errors.

When you (correctly) use static inline then it knows that the function only exists in the current file and so generates the function code. Similarly if you had used extern inline it would also generate the code as then you are saying that this is the externally visible definition.
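
A small sketch of the two working variants under C99/gnu11 rules. The enum values and `is_end_marker` are made up for illustration; only `set_addr_lin` and the struct field names come from the first post:

```c
#include <stdint.h>

/* Hypothetical stand-ins for the project's types. */
typedef enum { INSTR_ADDR_UNDEF, INSTR_ADDR_ARM } instr_addr_flag_t;

typedef struct {
    instr_addr_flag_t flag;
    uint32_t address;
} instr_next_addr_t;

/*
 * Variant 1: static inline. The function has internal linkage, so the
 * compiler emits a local out-of-line copy whenever a call is not
 * inlined (for example at -O0), and the link succeeds.
 */
static inline instr_next_addr_t set_addr_lin(void)
{
    instr_next_addr_t retval = {
        .flag = INSTR_ADDR_ARM,
        .address = 0xffffffff
    };
    return retval;
}

/*
 * Variant 2: a plain inline definition plus an extern declaration in
 * the same .c file. The extern declaration turns this into the
 * externally visible definition, so an out-of-line copy is emitted here.
 */
inline int is_end_marker(instr_next_addr_t a)
{
    return a.address == 0xffffffffu;
}
extern inline int is_end_marker(instr_next_addr_t a);
```

Dropping both the `static` in variant 1 and the `extern` line in variant 2 reproduces the original situation: it links at -O2 if every call gets inlined, and fails with "undefined reference" at -O0.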

turboscrew
Posts: 174
Joined: Sat Jan 18, 2014 1:50 pm
Location: Nokia (town), Finland

Re: Weird behaviour of the "official" gcc

Sat Dec 05, 2015 1:16 pm

But I think something doesn't add up: optimization level shouldn't affect the language.
@JacobL: the problem is the inverse: the errors are found with the lower optimization level.

I think a compiler shouldn't accept erroneous code even if it can compile it (making some assumptions), independent of the optimization level. Funny that with -O2 there weren't even warnings...

Paeryn
Posts: 2677
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: Weird behaviour of the "official" gcc

Sat Dec 05, 2015 2:06 pm

The problem lies in your understanding of how inline is handled under the three possibilities. If you don't explicitly state linkage then it behaves as a purely inline version, it will only ever generate inline code and under -O0 the compiler won't generate inline code. The compiler compiles the code perfectly well without errors, it is the linker that complains that it cannot find the non-inlined function, and that is because you never defined a non-inline version of the function. The compiler and linker are working exactly as they should under both optimised and unoptimised cases, it is your code (without static or extern) that is in error in that you were supposed to have supplied a non-inline version that is externally linkable in another file/library to use.

There is no way the compiler can warn under -O2. It cannot know that a non-inlined version isn't defined in another compilation unit. The linker cannot complain if nothing asks for the non-existent function.