TheGrumps
Posts: 41
Joined: Thu Sep 10, 2020 10:10 pm

Optimization (and UART access)

Thu Sep 17, 2020 9:50 pm

So, I've been trying to get reliable UART transmit working on a Pi Zero. I've worked through the registers in the peripherals guide, and compared my results with others that I've found on the net.
My UART transmit function:

void uart_tx(unsigned char ch)
{
  unsigned int ra;
  do {
    ra=*AUX_MU_LSR_REG;
    ra&=0x20;
  } while(ra==0);
  *AUX_MU_IO_REG=ch;
}
It only works if I put a delay (flashing an LED, for example) inside the while loop.
That got me thinking: if I change the compiler optimization level from -O2 to -O0, it works without the delay.
Execution is obviously much slower with -O0, though. I'm doing a lot of line drawing to the screen, and it must be about 10 times slower with -O0.
How do I overcome this issue with polling the UART TX empty status bit?
Ta for any pointers.

trejan
Posts: 2521
Joined: Tue Jul 02, 2019 2:28 pm

Re: Optimization (and UART access)

Thu Sep 17, 2020 11:10 pm

You should be using volatile for ra.

Paeryn
Posts: 3046
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: Optimization (and UART access)

Fri Sep 18, 2020 12:52 am

trejan wrote:
Thu Sep 17, 2020 11:10 pm
You should be using volatile for ra.
There's no reason for ra to be volatile; it might not make any difference. The type of AUX_MU_LSR_REG, however, needs to be a pointer to a volatile int.

You need to stop the compiler from optimizing the read of the hardware register out of the loop. Just making ra volatile doesn't guarantee that the compiler won't move the evaluation of *AUX_MU_LSR_REG to before the loop.
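
A minimal sketch of the point above, in which the volatile qualifier sits on the pointed-to type so that every dereference is a real read or write. The names `reg_ptr` and `poll_and_write`, and the `max_polls` timeout, are illustrative assumptions and not from the thread. The registers are passed in as parameters so the function can be exercised off-target; on a Pi Zero the status pointer would presumably be `(volatile uint32_t *)0x20215054` for AUX_MU_LSR_REG and the data pointer `(volatile uint32_t *)0x20215040` for AUX_MU_IO_REG.

```c
#include <stdint.h>

/* Pointer to a volatile 32-bit register. The qualifier applies to the
 * object being pointed at, so the compiler must perform each access
 * and may not cache the value or hoist the read out of a loop. */
typedef volatile uint32_t *reg_ptr;   /* hypothetical helper typedef */

/* Poll *status up to max_polls times. As soon as any bit in mask is
 * set, write ch to *data and return 1. Return 0 if the bits never
 * appeared (the timeout is an addition for this sketch). */
static inline int poll_and_write(reg_ptr status, reg_ptr data,
                                 uint32_t mask, unsigned char ch,
                                 unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        if (*status & mask) {   /* a fresh read on every iteration */
            *data = ch;
            return 1;
        }
    }
    return 0;
}
```

Making only the local variable volatile would leave `*status` as an ordinary load, which an optimizing compiler is free to perform once before the loop.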

cleverca22
Posts: 1334
Joined: Sat Aug 18, 2012 2:33 pm

Re: Optimization (and UART access)

Fri Sep 18, 2020 1:37 am

Paeryn wrote:
Fri Sep 18, 2020 12:52 am
There's no reason for ra to be volatile; it might not make any difference. The type of AUX_MU_LSR_REG, however, needs to be a pointer to a volatile int.

You need to stop the compiler from optimizing the read of the hardware register out of the loop. Just making ra volatile doesn't guarantee that the compiler won't move the evaluation of *AUX_MU_LSR_REG to before the loop.
yep, that sounds like the most likely cause of the problem

for reference, here is some code I've been using without issues:

#include <stdint.h>

#define VC4_PERIPH_BASE 0x7E000000
#define ARM_PERIPH_BASE 0x3F000000
#define VC4_TO_ARM_PERIPH(addr) (((addr) - VC4_PERIPH_BASE) + ARM_PERIPH_BASE)
#define HW_REGISTER_RW(addr) (*(volatile uint32_t *)(VC4_TO_ARM_PERIPH(addr)))
#define UART_RBRTHRDLL HW_REGISTER_RW(0x7e201000) /* PL011 data register */
#define UART_MSR       HW_REGISTER_RW(0x7e201018) /* PL011 flag register */

static void pl011_putchar(unsigned char c) {
  while (UART_MSR & 0x20);  /* spin while bit 5 (TX FIFO full) is set */
  UART_RBRTHRDLL = c;
}
note however, that this is for the PL011 uart, not the AUX uart, so the addresses would have to be adapted
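
As a rough illustration of adapting the addresses, the bus-to-ARM translation in the macro above can be written as a function that takes the ARM peripheral base as a parameter. The function name `bus_to_arm` is hypothetical. The 0x3F000000 base is the one used in the code above; 0x20000000 is the ARM physical peripheral base on the BCM2835 used in the Pi Zero, so that is presumably the value to substitute there.

```c
#include <stdint.h>

#define VC4_PERIPH_BASE 0x7E000000u  /* bus address used in the datasheet */

/* Translate a VC4 bus address (0x7Exxxxxx) into the ARM physical
 * address for a given ARM-side peripheral base. */
static inline uint32_t bus_to_arm(uint32_t bus_addr, uint32_t arm_periph_base)
{
    return (bus_addr - VC4_PERIPH_BASE) + arm_periph_base;
}
```

For example, the mini UART status register AUX_MU_LSR_REG at bus address 0x7E215054 comes out as 0x20215054 with a base of 0x20000000.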

TheGrumps
Posts: 41
Joined: Thu Sep 10, 2020 10:10 pm

Re: Optimization (and UART access)

Fri Sep 18, 2020 9:13 am

Thanks for the replies.

Indeed, making ra volatile was not the solution (I had actually tried that).
But making AUX_MU_LSR_REG a pointer to a volatile int was the cure.

After reading a PL011 UART doc, it seems the operation of that bit 5 flag is opposite between these two UARTs.
The AUX one I'm using has bit 5 set when you can write more data, whereas (I believe) the PL011 bit is set when the TX FIFO is full.
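
The opposite polarity of bit 5 can be captured in two small predicates, sketched here on plain register values so they can be checked without hardware. The macro and function names are illustrative assumptions; the bit meanings follow the description above (mini UART LSR bit 5 set means the transmitter can accept a byte, PL011 flag register bit 5 set means the TX FIFO is full).

```c
#include <stdbool.h>
#include <stdint.h>

#define MU_LSR_TX_EMPTY (1u << 5)  /* mini UART: set when a byte can be written */
#define PL011_FR_TXFF   (1u << 5)  /* PL011: set when the TX FIFO is full */

/* Mini UART: ready to transmit when bit 5 of AUX_MU_LSR_REG is SET. */
static inline bool mini_uart_can_tx(uint32_t lsr)
{
    return (lsr & MU_LSR_TX_EMPTY) != 0;
}

/* PL011: ready to transmit when bit 5 of the flag register is CLEAR. */
static inline bool pl011_can_tx(uint32_t fr)
{
    return (fr & PL011_FR_TXFF) == 0;
}
```

A polling loop therefore spins on `!mini_uart_can_tx(...)` for the mini UART but on `!pl011_can_tx(...)` for the PL011, which is why the two loops in this thread test the same bit with opposite sense.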

So this is what I have now:

void uart_tx(unsigned char ch)
{
  while(!(*AUX_MU_LSR_REG & 0x20));
  *AUX_MU_IO_REG=ch;
}
Of course the next question I must ask myself is why am I using the mini UART and not the PL011?!

cleverca22
Posts: 1334
Joined: Sat Aug 18, 2012 2:33 pm

Re: Optimization (and UART access)

Fri Sep 18, 2020 10:36 pm

TheGrumps wrote:
Fri Sep 18, 2020 9:13 am
Of course the next question I must ask myself is why am I using the mini UART and not the PL011?!
the AUX/mini uart baud rate is tied to the VPU clock speed, which normally goes up and down with VPU load
if you enable the aux uart in config.txt, it also disables the VPU freq scaling, which may lead to higher power usage

the PL011 has its own independent clock and also better flow control and a larger FIFO

the PL011 is better in pretty much every way, which is why it's normally reserved for the bluetooth controller!

`dtoverlay=miniuart-bt` will swap things, so then bluetooth gets the aux uart, and gpio14/15 get the PL011 uart, but you still have the elevated power usage due to aux being in use

`dtoverlay=disable-bt` + `enable_uart=1` will map PL011 to 14/15, and just disable the aux uart, allowing the VPU to underclock itself when idle

TheGrumps
Posts: 41
Joined: Thu Sep 10, 2020 10:10 pm

Re: Optimization (and UART access)

Sat Sep 19, 2020 8:09 am

Thanks very much for that explanation.
Does the VPU automatically adjust its clock after power-on, or is it something that needs to be enabled? That is, if I do nothing in addition to enabling the mini-UART and setting its baud rate, will the baud rate fluctuate?

cleverca22
Posts: 1334
Joined: Sat Aug 18, 2012 2:33 pm

Re: Optimization (and UART access)

Sat Sep 19, 2020 8:13 am

TheGrumps wrote:
Sat Sep 19, 2020 8:09 am
Thanks very much for that explanation.
Does the VPU automatically adjust its clock after power-on, or is it something that needs to be enabled? That is, if I do nothing in addition to enabling the mini-UART and setting its baud rate, will the baud rate fluctuate?
if you enable the aux uart properly with config.txt, then the firmware will turn the freq scaling off on its own

but if you didn't use `enable_uart=1` and just forcibly set the baud rate and altmode functions, you may find the baud rate randomly changing any time the VPU firmware gets busy
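
To see why a changing VPU clock moves the baud rate, here is a small sketch based on the mini UART formula in the BCM2835 peripherals guide, baud = clock / (8 * (AUX_MU_BAUD_REG + 1)). The function names are hypothetical, and the clock figures in the note below are example values chosen for illustration, not measurements from any particular board.

```c
#include <stdint.h>

/* Value to program into AUX_MU_BAUD_REG for a wanted baud rate,
 * using integer (truncating) division. */
static inline uint32_t mini_uart_baud_reg(uint32_t clock_hz, uint32_t baud)
{
    return clock_hz / (8u * baud) - 1u;
}

/* Baud rate actually produced by a given divisor at a given clock. */
static inline uint32_t mini_uart_actual_baud(uint32_t clock_hz, uint32_t reg)
{
    return clock_hz / (8u * (reg + 1u));
}
```

With an assumed 250 MHz clock, a target of 115200 gives a divisor of 270 and an actual rate of about 115313 baud. If the clock then rose to 400 MHz with the divisor unchanged, the line would run at about 184501 baud, far outside what a 115200 receiver tolerates.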

TheGrumps
Posts: 41
Joined: Thu Sep 10, 2020 10:10 pm

Re: Optimization (and UART access)

Sat Sep 19, 2020 8:58 am

Thanks again.
I'll use the PL011 on GPIO14/15 as it seems safer.
Although I'm not using the VPU for anything. Only outputting debug info to HDMI.
