OneMadGypsy
Posts: 267
Joined: Wed Apr 28, 2021 1:57 am
Location: New Orleans, Louisiana
Contact: Website

Very Fast ST7789 w/o CS Pure Python Driver

Wed May 12, 2021 6:40 am

Do you have a crummy 240x240 ST7789 that doesn't have a CS pin? Well, I wrote a very simple pure Python driver for it that runs at 64 Mbaud. I'm not ready to turn this into its own repo yet, but you can play with it. It's oriented as if it was in a breadboard (cause I can't figure out how to orient it any other way :D). I think it's because I'm not writing x and y according to addresses, so no matter what I do it doesn't rotate the screen. Maybe I'm wrong, but by observation that seems to be the issue. Messing with MADCTL doesn't seem to be doing what the datasheet claims it does.
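
For a feel of what the bus speed buys, here is a rough back-of-the-envelope sketch in plain Python. It assumes one full 240x240 frame at 16 bits per pixel per update and ignores every overhead except clocking the bits out; `frame_ms` is just an illustrative helper name.

```python
WIDTH, HEIGHT = 240, 240
BYTES_PER_PIXEL = 2            # COLMOD 0x05 selects 16 bits per pixel

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL

def frame_ms(spi_hz: int) -> float:
    """Milliseconds needed to clock one full frame out, ignoring all overhead."""
    return frame_bytes * 8 / spi_hz * 1000

print(frame_bytes)                        # 115200, the size of _BUFF
print(round(frame_ms(67_108_864), 2))     # 13.73 at the requested baudrate
print(round(frame_ms(62_500_000), 2))     # 14.75 at 62.5 MHz
```

So even before any drawing happens, a full-frame update costs roughly 14 ms of pure transfer time.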

st7789_ncs.py

from machine import Pin, SPI
from time import sleep_ms

_SWRST   = b'\x01'
_SLPOUT  = b'\x11'
_COLMOD  = b'\x3A'
_MADCTL  = b'\x36'
_INVON   = b'\x21'
_DISPON  = b'\x29'
_CASET   = b'\x2A'
_RASET   = b'\x2B'
_RAMWR   = b'\x2C'

_BUFF   = memoryview(bytearray(115200))
_DATCOM = const(17)
_CLOCK  = const(18)
_MOSI   = const(19)
_BACKLT = const(20)
_RESET  = const(21)
_BAUD   = const(67108864)

class ST7789(object):
    def __init__(self) -> None:
        self._spi = SPI(0, sck=Pin(_CLOCK, Pin.OUT), mosi=Pin(_MOSI, Pin.OUT)) 
        self._spi.init(baudrate=_BAUD, phase=0, polarity=1)
        self._dc  = Pin(_DATCOM, Pin.OUT)
        self._rst = Pin(_RESET, Pin.OUT)
        Pin(_BACKLT, Pin.OUT)(1)
        
        #setup
        self._rst(1), sleep_ms(10)
        self._rst(0), sleep_ms(10)
        self._rst(1), sleep_ms(10)

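        self.command(_SWRST )
        sleep_ms(150)                   #give the controller time to finish the software reset
        self.command(_COLMOD, b'\x05')  #16 bits per pixel
        self.command(_SLPOUT)
        sleep_ms(10)                    #short pause after sleep out
        self.command(_INVON )           #inversion on
        self.command(_DISPON)           #display on
        self.command(_MADCTL, b'\x00')
        self.command(_CASET , b'\x00\x00\x00\xEF')  #columns 0..239 (0xF0 would be a 241st column)
        self.command(_RASET , b'\x00\x00\x00\xEF')  #rows 0..239
        self.command(_RAMWR )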
        
    def command(self, cmd, data=None):
        self._dc(0), self._spi.write(cmd)
        if data:
            self._dc(1), self._spi.write(data)
        
    @micropython.viper
    def clear(self): #cheat like you mean it
        b, L = ptr32(_BUFF), int(len(_BUFF))//4
        for n in range(L):
            b[n] = 0x00000000
        
    def update(self):
        self.command(_RAMWR, _BUFF)

Here's a messy test script I wrote that bounces a bunch of random sized squares at random speeds around the screen. I got it up to 50 before it started really lagging. If you try my script and it doesn't work for you, I really don't know what to tell you. In all honesty, I don't know how I got it to work for me. The datasheet is 320 pages and very confusing. I'm not even positive that I'm reading the right datasheet. My display was like $4 and there aren't any useful numbers on it.

main.py (or whatever you want to call it)

from st7789_ncs import ST7789, _BUFF  #rect() below writes straight into the driver's buffer
from random import randint

s = ST7789()

@micropython.viper
def rect(x:int, y:int, w:int, h:int, c:int):
    b, L = ptr16(_BUFF), int(w*h)
    sx = int(x+(y*240))
    for i in range(L):
        b[sx+(240*(i//w))+i%w] = c


class Thing:
    def __init__(self, x:int, y:int, w:int, h:int, c:int, sx:int, sy:int):
        self.x, self.y   = x, y
        self.w, self.h   = w, h
        self.sx, self.sy = sx, sy
        self.c = c
        self.xr = range(240-w+1)
        self.yr = range(240-h+1)
        
    def update(self):
        self.sx = self.sx if self.sx+self.x in self.xr else -self.sx
        self.sy = self.sy if self.sy+self.y in self.yr else -self.sy
        self.x += self.sx
        self.y += self.sy
        rect(self.x, self.y, self.w, self.h, self.c)
 
        
def make_things(cnt:int = 5):
    things = [0]*cnt
    for n in range(cnt):
        a = randint(20, 40)
        things[n] = Thing(randint(0, 240-a), randint(0, 240-a), a, a, randint(0xF000, 0xFFFF), randint(4, 8), randint(4, 8))
    return things      
        
things = make_things(30)

while True:
    s.clear()
    
    for t in things:
        t.update()
    
    s.update()
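
The index arithmetic in rect() can be checked off the device with a plain-Python model. This is only a sketch: it uses a `memoryview` cast in place of viper's `ptr16`, adds a bounds check that the viper version does not have, and does not model the byte order the display expects.

```python
WIDTH = HEIGHT = 240
buf = memoryview(bytearray(WIDTH * HEIGHT * 2))   # same size as _BUFF
pixels = buf.cast('H')                            # 16-bit view, standing in for ptr16

def rect(x: int, y: int, w: int, h: int, c: int) -> None:
    """Fill a w*h rectangle whose top-left corner is (x, y) with colour c."""
    if x < 0 or y < 0 or x + w > WIDTH or y + h > HEIGHT:
        raise ValueError("rectangle leaves the buffer")   # viper would not check this
    start = x + y * WIDTH
    for i in range(w * h):
        # i // w picks the row inside the rectangle, i % w the column
        pixels[start + WIDTH * (i // w) + i % w] = c

rect(1, 2, 3, 2, 0xF800)
print(sum(1 for p in pixels if p))   # 6 pixels written
```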

At one point I had a lot more going on with the driver. I was using all of the below. Then I just kept deleting things til it broke. I whittled it down to the few commands you see in the top script. I was basically on my way to implementing the entire datasheet or at least as much of it as I could with the pins that I have. This was also before I merged the `data` and `command` methods into 1.

self.command(_PORCTRL)              #porch control
self.data(b'\x0C\x0C\x00\x33\x33') #power-on / sw reset/ hw reset sequence

self.command(_GCTRL) 
self.data(b'\x35')  

self.command(_VCOMS)
self.data(b'\x37')

self.command(_LCMCTRL)
self.data(b'\x2C')

self.command(_VDVVRHEN)
self.data(b'\x01')

self.command(_VRHS)
self.data(b'\x12')   

self.command(_VDVS)
self.data(b'\x20')  

self.command(_FRCTRL2) 
self.data(b'\x0F')    

self.command(_PWRCTRL1) 
self.data(b'\xA4\xA1')

self.command(_GMCTRP1)  #positive voltage gamma control
self.data(b'\xD0\x04\x0D\x11\x13\x2B\x3F\x54\x4C\x18\x0D\x0B\x1F\x23')

self.command(_GMCTRN1)  #negative voltage gamma control
self.data(b'\xD0\x04\x0C\x11\x13\x2C\x3F\x44\x51\x2F\x1F\x1F\x20\x23')
"Focus is a matter of deciding what things you're not going to do." ~ John Carmack

horuable
Posts: 121
Joined: Sat Mar 06, 2021 12:35 am

Re: Very Fast ST7789 w/o CS Pure Python Driver

Wed May 12, 2021 11:03 am

Looks nice, but why use constants for pins and create the SPI object inside the class? Just let users instantiate whatever SPI/pins they wish and pass them as parameters. That'll give you more flexibility in connecting the display to the Pico.
Also, due to how the SPI clock divisor works, the actual frequency is 62.5 MHz and it's the highest it'll go.

One more thing to consider is that not all screens will work with SPI clocks this high, depending on the actual hardware. I know for a fact that my 320x240 LCD has some problems above 31.25 MHz. Maybe forcing baudrate is also not a good idea?
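
A minimal sketch of what that suggestion could look like, assuming the caller builds the SPI bus and pins and hands them in. The class below is a cut-down illustration, not the driver from the first post; the only things it expects are an object with `write()` and a pin that can be called with 0 or 1.

```python
class ST7789:
    """Display wrapper that receives its bus and pins instead of creating them."""

    def __init__(self, spi, dc, rst=None):
        self._spi = spi    # caller already chose bus, pins and baudrate
        self._dc = dc      # data/command pin, callable like machine.Pin
        self._rst = rst    # optional reset pin

    def command(self, cmd, data=None):
        self._dc(0)                 # command phase
        self._spi.write(cmd)
        if data:
            self._dc(1)             # data phase
            self._spi.write(data)

# On a Pico this might be wired up as (pin numbers taken from the first post):
#   from machine import Pin, SPI
#   spi = SPI(0, baudrate=31_250_000, polarity=1, phase=0, sck=Pin(18), mosi=Pin(19))
#   lcd = ST7789(spi, dc=Pin(17, Pin.OUT), rst=Pin(21, Pin.OUT))
```

With this shape the baudrate is whatever the caller put on the SPI object, so a module that tops out at 31.25 MHz needs no change to the driver.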

OneMadGypsy

Re: Very Fast ST7789 w/o CS Pure Python Driver

Wed May 12, 2021 7:44 pm

@horuable

Thanks for your insight and interest in my little script. The answer to every question you asked is:

Because it's a completely bare-bones, unfinished draft. It's not meant to be production code in any way. It isn't sloe as gin (heh) even though it's just python, and I thought other people might want to play with it while it's still very simple. I also thought that I created a couple of interesting examples of how viper can be used to manipulate a buffer very efficiently, and I wanted to share those, as well. I'm pretty sure that it's those viper functions that are keeping the screen running smoothly, and it actually has nothing to do with the rest of the driver. As far as the constants go, this is actually all one script on my pico. I just split it up for my post so it would be obvious which part is the display and which part is testing. I'll give it a proper interface before any official release. I'm still tempted to write out the entire documentation, learn every iota, and possibly make it "the one ring to rule them all", so to speak.

You said that due to clock division the SPI is actually running at 62.5 MHz. I have another Pico that is overclocked and the script does not run any faster even if I increase the baud. Is this because the screen is maxed out and at that point the SPI speed is irrelevant? If so, is this something I should cap? Or to be clearer, is "overdriving" the display something that can potentially break it?

Can you explain why MADCTL doesn't seem to do anything? Was my first assumption correct or is there something I am completely missing? Thank you for your time and knowledge.

horuable

Re: Very Fast ST7789 w/o CS Pure Python Driver

Thu May 13, 2021 3:59 pm

OneMadGypsy wrote: You said that due to clock division the SPI is actually running at 62.5 MHz. I have another Pico that is overclocked and the script does not run any faster even if I increase the baud. Is this because the screen is maxed out and at that point the SPI speed is irrelevant? If so, is this something I should cap? Or to be clearer, is "overdriving" the display something that can potentially break it?
Overclocking the Pico is actually detrimental to the SPI speed. After changing the Pico's clock speed using machine.freq(), the SPI clock is set to 24 MHz, and it doesn't matter if the Pico is overclocked or underclocked. You can see it when printing out the spi object, and I've also confirmed it using an oscilloscope. What's even more interesting, issuing machine.freq(125_000_000) on a Pico that wasn't previously over- or underclocked still locks the SPI clock to 24 MHz.
ST7789 can work with SPI up to 62.5 MHz for writes and about 6.67 MHz for reads, so as long as you're only sending data to the screen you will not be able to "overdrive" it with a Pico. Even if you'd somehow manage it, the worst thing that can happen is the screen won't be able to keep up with the data being sent and it'd just display garbage or simply not show anything at all. If it happened while modifying registers, ST7789 might end up in some weird state, but it's nothing a simple power cycle cannot solve. All in all, you should avoid using a higher SPI speed than the device can handle.
As for my screen, I suspect the culprit is the level shifter present on the LCD module that is simply not able to handle such high frequencies. And that's kinda the point of my comment. There are so many different modules using ST7789 that assuming every single one of them will work at the absolute highest speed is not a good idea. I think giving the user the ability to adjust it is the way to go.
OneMadGypsy wrote: Can you explain why MADCTL doesn't seem to do anything?
I'm afraid I can not because for me it's working just fine. I can rotate the screen and/or flip it by changing MY, MX, and MV bits (haven't tried changing the rest). I'm not using your driver though, but the one I've "written" (mostly copy-pasted) some time ago.

PS. I've run your script on my screen and can assure you changing MADCTL works as expected. Your demo is just not really well suited to show it since it's random and you simply don't notice any change. Try displaying some asymmetrical image, then fiddling with this register and you should see the change.
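
For anyone wanting to try the MY, MX and MV bits mentioned above, here is a small sketch that builds the MADCTL parameter byte. The bit positions are the usual ST7789 ones; `ROTATIONS` and `madctl_byte` are illustrative names, and which physical direction counts as "90 degrees" depends on how the panel is mounted.

```python
MADCTL_MY = 0x80   # row address order (vertical flip)
MADCTL_MX = 0x40   # column address order (horizontal flip)
MADCTL_MV = 0x20   # row/column exchange

# Common combinations; the angle labels are a convention, not a guarantee.
ROTATIONS = {
    0:   0x00,
    90:  MADCTL_MV | MADCTL_MX,
    180: MADCTL_MX | MADCTL_MY,
    270: MADCTL_MV | MADCTL_MY,
}

def madctl_byte(rotation: int) -> bytes:
    """Return the one-byte MADCTL parameter for a rotation in degrees."""
    return bytes([ROTATIONS[rotation]])

print(madctl_byte(90))   # b'`', i.e. 0x60
```

With the driver from the first post this would be sent as `s.command(_MADCTL, madctl_byte(90))`. The controller addresses more rows than a 240x240 panel shows, so flipped orientations may also need an offset in CASET/RASET; that is an assumption to check against the module, not something this thread establishes.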

OneMadGypsy

Re: Very Fast ST7789 w/o CS Pure Python Driver

Thu May 13, 2021 5:08 pm

Thank you for all of this info. Regarding using a different test for MADCTL: Of course I didn't use 30 random bouncing squares to test it :D. I used the most simple test of all. I put a square at 0,0 and adjusted MADCTL. It should have put the square in a different corner ... but it didn't. At best I got it to turn the square into 2 triangles at opposite ends of the display, which isn't correct, at all.

EDIT: I can confirm what you said about SPI speed, except on my overclocked Pico it's only 12 MHz. Well, I know what I'm doing today. I'm going to fix SPI in firmware.

OneMadGypsy

Re: Very Fast ST7789 w/o CS Pure Python Driver

Thu May 13, 2021 6:15 pm

Well, here is part 1 of the puzzle. I'm going to hack this function to print out all the values at every step so I can peek into how value x affects value y (so to speak) with actual numbers, as opposed to alluded ones. I will figure out how to get SPI back up to 62.5 MHz. I saw a post yesterday on a different site that was doing some interesting things with clocks. I think I need to revisit that post, as well.

rp2_common/hardware_spi/spi.c

uint spi_set_baudrate(spi_inst_t *spi, uint baudrate) {
    uint freq_in = clock_get_hz(clk_peri);
    uint prescale, postdiv;
    invalid_params_if(SPI, baudrate > freq_in);

    // Find smallest prescale value which puts output frequency in range of
    // post-divide. Prescale is an even number from 2 to 254 inclusive.
    for (prescale = 2; prescale <= 254; prescale += 2) {
        if (freq_in < (prescale + 2) * 256 * (uint64_t) baudrate)
            break;
    }
    invalid_params_if(SPI, prescale > 254); // Frequency too low

    // Find largest post-divide which makes output <= baudrate. Post-divide is
    // an integer in the range 1 to 256 inclusive.
    for (postdiv = 256; postdiv > 1; --postdiv) {
        if (freq_in / (prescale * (postdiv - 1)) > baudrate)
            break;
    }

    spi_get_hw(spi)->cpsr = prescale;
    hw_write_masked(&spi_get_hw(spi)->cr0, (postdiv - 1) << SPI_SSPCR0_SCR_LSB, SPI_SSPCR0_SCR_BITS);

    // Return the frequency we were able to achieve
    return freq_in / (prescale * postdiv);
}


I think the loop below is specifically the "problem". It is looking for the largest possible number and then using that number in `return freq_in / (prescale * postdiv)`. I'm no mathemagician, but I would assume we want freq_in divided by the smallest number we can get, not the largest one. I'm sure there are other things to consider that I am not aware of yet. I refuse to believe that this is correct though. If calling machine.freq with the default clock speed reduces SPI baudrate, something is definitely not right. I'm going to reverse this loop and just see what happens. I see they have `hw_write_masked(&spi_get_hw(spi)->cr0, (postdiv - 1) << SPI_SSPCR0_SCR_LSB, SPI_SSPCR0_SCR_BITS);` going on, so I'm sure changing this postdiv is going to send me on a new search. Whatever, breaking code is free.

    // Find largest post-divide which makes output <= baudrate. Post-divide is
    // an integer in the range 1 to 256 inclusive.
    for (postdiv = 256; postdiv > 1; --postdiv) {
        if (freq_in / (prescale * (postdiv - 1)) > baudrate)
            break;
    }
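
One way to see what the two loops do with actual numbers is to port the quoted function to desktop Python and feed it clock values. This is a sketch of the arithmetic only: it takes the input clock as a parameter, leaves out the SDK's invalid_params_if checks, and touches no registers.

```python
def spi_set_baudrate(freq_in: int, baudrate: int):
    """Return (prescale, postdiv, achieved_hz) using the same loops as spi.c."""
    prescale = 2
    while prescale <= 254:
        if freq_in < (prescale + 2) * 256 * baudrate:
            break
        prescale += 2
    if prescale > 254:
        raise ValueError("frequency too low")

    postdiv = 256
    while postdiv > 1:
        # stop as soon as the next smaller postdiv would overshoot the request
        if freq_in // (prescale * (postdiv - 1)) > baudrate:
            break
        postdiv -= 1

    return prescale, postdiv, freq_in // (prescale * postdiv)

print(spi_set_baudrate(125_000_000, 67_108_864))   # (2, 1, 62500000)
print(spi_set_baudrate(48_000_000, 62_500_000))    # (2, 1, 24000000)
print(spi_set_baudrate(125_000_000, 31_250_000))   # (2, 2, 31250000)
```

In this port the post-divide loop counts down and stops at the smallest post-divide that keeps the output at or below the request, so it already lands on the fastest rate it can. The 24 MHz figure falls out as soon as the input clock is 48 MHz.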

OneMadGypsy

Re: Very Fast ST7789 w/o CS Pure Python Driver

Thu May 13, 2021 6:56 pm

Well, it sort of worked. Where I was getting 12 MHz before, I'm now getting 24 MHz. However, calling machine.freq(125_000_000) did not bring it back up to 62.5 MHz. I'm kind of stuck at this point. There isn't really anything left in the chain to search. The only other thing I can think of is fundamentally changing the entire logic for that method, but I don't know what to change it to.

I'm going to print this clock_get_hz(clk_peri). Maybe this number is wrong. I mean I doubt it, but I won't know if I don't check.

horuable

Re: Very Fast ST7789 w/o CS Pure Python Driver

Thu May 13, 2021 7:02 pm

I have a sneaking suspicion that calling machine.freq() to change system clock is also changing peripheral clock source to USB oscillator, hence max SPI frequency is 24 MHz. I cannot check it since I don't really have the time now, nor do I know where to look.

OneMadGypsy

Re: Very Fast ST7789 w/o CS Pure Python Driver

Thu May 13, 2021 9:34 pm

I made a new module called Trouble (ie troubleshoot). In it I made the below method.


I know ~ why did I make this a method if I'm not even using self. Well, putting it directly in the module kept telling me it can't call an int. IDKWTF is up with that, so I linked it to a class so it would stop lying to me and I can stop playing "figure it out".

STATIC mp_obj_t Trouble_clocks(mp_obj_t self_in) {
    (void)self_in;

    uint f = 0;
    f = frequency_count_khz(CLOCKS_FC0_SRC_VALUE_PLL_SYS_CLKSRC_PRIMARY);
    mp_printf(MP_PYTHON_PRINTER, "pllsys: %d\n", f);
    f = frequency_count_khz(CLOCKS_FC0_SRC_VALUE_PLL_USB_CLKSRC_PRIMARY);
    mp_printf(MP_PYTHON_PRINTER, "pllusb: %d\n", f);
    f = frequency_count_khz(CLOCKS_FC0_SRC_VALUE_ROSC_CLKSRC);
    mp_printf(MP_PYTHON_PRINTER, "rosc: %d\n"  , f);
    f = frequency_count_khz(CLOCKS_FC0_SRC_VALUE_CLK_SYS);
    mp_printf(MP_PYTHON_PRINTER, "sys: %d\n"   , f);
    f = frequency_count_khz(CLOCKS_FC0_SRC_VALUE_CLK_PERI);
    mp_printf(MP_PYTHON_PRINTER, "peri: %d\n"  , f);
    f = frequency_count_khz(CLOCKS_FC0_SRC_VALUE_CLK_USB);
    mp_printf(MP_PYTHON_PRINTER, "usb: %d\n"   , f);
    f = frequency_count_khz(CLOCKS_FC0_SRC_VALUE_CLK_ADC);
    mp_printf(MP_PYTHON_PRINTER, "adc: %d\n"   , f);
    f = frequency_count_khz(CLOCKS_FC0_SRC_VALUE_CLK_RTC);
    mp_printf(MP_PYTHON_PRINTER, "rtc: %d\n"   , f);
    
    return mp_const_none;
}

I then ran this simple python script:

from trouble import Trouble
from machine import SPI, Pin, freq

t = Trouble()

spi = SPI(1, sck=Pin(10, Pin.OUT), mosi=Pin(11, Pin.OUT), miso=Pin(8, Pin.OUT))
spi.init(baudrate=62500000)

print("before frequency change")
t.clocks()
print(spi)
freq(125_000_000)

print("\nafter frequency change")
t.clocks()
print(spi)

The results were strange, indeed. Let's say you hardware reset the Pico first (ie GND to RUN) then these are the results:

before frequency change
pllsys: 125000
pllusb: 48000
rosc: 5004
sys: 125000
peri: 125000
usb: 48000
adc: 48000
rtc: 47
SPI(1, baudrate=62500000, polarity=0, phase=0, bits=8, sck=10, mosi=11, miso=8)

after frequency change
pllsys: 125000
pllusb: 48000
rosc: 5006
sys: 125000
peri: 48000
usb: 48000
adc: 48000
rtc: 47
SPI(1, baudrate=62500000, polarity=0, phase=0, bits=8, sck=10, mosi=11, miso=8)

But if I just close the console and run it again without resetting the Pico, I get these results:

before frequency change
pllsys: 125000
pllusb: 48000
rosc: 5006
sys: 125000
peri: 48000
usb: 48000
adc: 48000
rtc: 47
SPI(1, baudrate=24000000, polarity=0, phase=0, bits=8, sck=10, mosi=11, miso=8)

after frequency change
pllsys: 125000
pllusb: 48000
rosc: 5007
sys: 125000
peri: 48000
usb: 48000
adc: 48000
rtc: 47
SPI(1, baudrate=24000000, polarity=0, phase=0, bits=8, sck=10, mosi=11, miso=8)

Note that my firmware still has the reversed postdiv loop. I will put it back and see what happens. I actually expect to get worse results. Before I flipped that loop my SPI baudrate after overclock was 12 MHz. I'll be back in a minute with results from unmodified firmware.
Last edited by OneMadGypsy on Thu May 13, 2021 10:17 pm, edited 1 time in total.

OneMadGypsy

Re: Very Fast ST7789 w/o CS Pure Python Driver

Thu May 13, 2021 9:51 pm

Ok, I put the firmware back how it was, completely deleted my build folder and rebuilt the firmware. The results were identical to the above post. This made me realize something. My Pico was not overclocked via machine.freq. I was overclocking it in main.c right before it set up the UART. That being said: I cannot say with any confidence that I ever got 12 MHz from machine.freq. I also can't say with any confidence that reversing postdiv did anything about my original 12 MHz, because I stripped out my method and started using machine.freq before testing. I can say with confidence that reversing postdiv doesn't seem to do anything, at all. My current results are identical to the ones in my last post.

We do have one piece of information though. If you run your pico from a hard reset it does not change the SPI baud. It is only when you change the clock speed and then reinit the pico without reset that it is changed. Here is my proof ~ even doubling the clock.

before frequency change
pllsys: 125000
pllusb: 48000
rosc: 5005
sys: 125000
peri: 125000
usb: 48000
adc: 48000
rtc: 47
SPI(1, baudrate=62500000, polarity=0, phase=0, bits=8, sck=10, mosi=11, miso=8)

after frequency change
pllsys: 250000
pllusb: 48000
rosc: 4995
sys: 250000
peri: 48000
usb: 48000
adc: 48000
rtc: 47
SPI(1, baudrate=62500000, polarity=0, phase=0, bits=8, sck=10, mosi=11, miso=8)

We also know that the SPI baud would need a fractional divisor to get 24 MHz out of the pll. This is not possible, so the number it is comparing itself against cannot be the pll. With this information we can figure this out. Apparently this is a boot issue. When we set the clock to a new speed and then reinit the Pico, some math somewhere is getting jacked up. I will find this.
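
The divisor argument is quick to check. A tiny sketch, using the clock values from the printouts above and the 24 MHz SPI rate reported earlier in the thread; `divides` is an illustrative helper.

```python
SPI_HZ = 24_000_000    # the rate the SPI bus dropped to

def divides(source_hz: int, spi_hz: int = SPI_HZ) -> bool:
    """True if spi_hz is the source clock divided by a whole number."""
    return source_hz % spi_hz == 0

for name, hz in (("pllsys", 125_000_000), ("pllusb", 48_000_000)):
    print(name, hz / SPI_HZ, divides(hz))
```

Only the 48 MHz source divides down to 24 MHz by a whole number (2); 125 MHz would need about 5.2.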


Aside:
For anyone that knows what they are doing and wants to use my clocks() method, here is the entire trouble.c. It's just some junk to get the job done, for now. I fixed the int cannot be called error and got it on a module level. I don't understand why it's not throwing the error anymore, cause this is the same thing I already had which was throwing the error. Maybe it just needed a fresh build, but I'm pretty sure I had tried that.

#include "py/runtime.h"
#include "py/obj.h"
#include "py/objstr.h"
#include "hardware/clocks.h"


STATIC mp_obj_t trouble_clocks(void) {
    mp_printf(MP_PYTHON_PRINTER, "pllsys: %d\n", frequency_count_khz(CLOCKS_FC0_SRC_VALUE_PLL_SYS_CLKSRC_PRIMARY));
    mp_printf(MP_PYTHON_PRINTER, "pllusb: %d\n", frequency_count_khz(CLOCKS_FC0_SRC_VALUE_PLL_USB_CLKSRC_PRIMARY));
    mp_printf(MP_PYTHON_PRINTER, "rosc: %d\n"  , frequency_count_khz(CLOCKS_FC0_SRC_VALUE_ROSC_CLKSRC));
    mp_printf(MP_PYTHON_PRINTER, "sys: %d\n"   , frequency_count_khz(CLOCKS_FC0_SRC_VALUE_CLK_SYS));
    mp_printf(MP_PYTHON_PRINTER, "peri: %d\n"  , frequency_count_khz(CLOCKS_FC0_SRC_VALUE_CLK_PERI));
    mp_printf(MP_PYTHON_PRINTER, "usb: %d\n"   , frequency_count_khz(CLOCKS_FC0_SRC_VALUE_CLK_USB));
    mp_printf(MP_PYTHON_PRINTER, "adc: %d\n"   , frequency_count_khz(CLOCKS_FC0_SRC_VALUE_CLK_ADC));
    mp_printf(MP_PYTHON_PRINTER, "rtc: %d\n"   , frequency_count_khz(CLOCKS_FC0_SRC_VALUE_CLK_RTC));
    return mp_const_none;
}

MP_DEFINE_CONST_FUN_OBJ_0(trouble_clocks_obj, trouble_clocks);

//__> MODULE

STATIC const mp_map_elem_t trouble_globals_table[] = {
    { MP_OBJ_NEW_QSTR(MP_QSTR___name__), MP_OBJ_NEW_QSTR(MP_QSTR_trouble) },
    { MP_OBJ_NEW_QSTR(MP_QSTR_clocks), (mp_obj_t)&trouble_clocks_obj},
};

STATIC MP_DEFINE_CONST_DICT (mp_module_trouble_globals, trouble_globals_table);

const mp_obj_module_t trouble_user_cmodule = {
    .base    = { &mp_type_module },
    .globals = (mp_obj_dict_t*)&mp_module_trouble_globals,
};

MP_REGISTER_MODULE(MP_QSTR_trouble, trouble_user_cmodule, MODULE_TROUBLE_ENABLED);

horuable

Re: Very Fast ST7789 w/o CS Pure Python Driver

Thu May 13, 2021 10:52 pm

Good work! All of your findings convince me that my hunch was right. Notice that only after hardware reset the peri clock is shown to be 125 MHz, in any other case it is 48 MHz which indicates it's actually taken from the pllusb rather than pllsys.
OneMadGypsy wrote: We do have one piece of information though. If you run your pico from a hard reset it does not change the SPI baud. It is only when you change the clock speed and then reinit the pico without reset that it is changed.
Not really. It does mean that, after changing the clock, the baudrate reported by printing the SPI object doesn't get updated, but the actual SPI clock frequency still goes to 24 MHz. I've checked it with a scope, and even though SPI tells me it runs at 62.5 MHz the trace shows only 24 MHz. Reinitializing the Pico probably just updates the SPI object with the correct value.

UPDATE:
After a bit of peeking around the CLK_PERI_CTRL I can confirm that using machine.freq() to change the system clock sets the peripheral clock source to clksrc_pll_usb. Using the same register I was able to change it back to clk_sys and it was all that was needed to run SPI with half the overclocked Pico clock, though with SPI running at 125 MHz clock output looks horrible, and is probably totally unusable.
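
A sketch of how that register poke might look. The base address, register offset, and field layout below are assumptions taken from the RP2040 datasheet's clocks chapter and should be checked against it before use; the helper names are made up for illustration. The helpers are pure functions so they can be tried on a desktop first.

```python
CLOCKS_BASE   = 0x4000_8000             # assumed base of the clocks block
CLK_PERI_CTRL = CLOCKS_BASE + 0x48      # assumed offset of CLK_PERI_CTRL
AUXSRC_LSB    = 5
AUXSRC_MASK   = 0x7 << AUXSRC_LSB       # assumed: bits 7:5 select the source
AUXSRC = {"clk_sys": 0, "clksrc_pll_sys": 1, "clksrc_pll_usb": 2}

def decode_auxsrc(ctrl: int) -> int:
    """Extract the AUXSRC field from a CLK_PERI_CTRL value."""
    return (ctrl & AUXSRC_MASK) >> AUXSRC_LSB

def with_auxsrc(ctrl: int, source: str) -> int:
    """Return ctrl with only the AUXSRC field replaced."""
    return (ctrl & ~AUXSRC_MASK) | (AUXSRC[source] << AUXSRC_LSB)

# On the Pico itself (MicroPython), something along these lines:
#   from machine import mem32
#   mem32[CLK_PERI_CTRL] = with_auxsrc(mem32[CLK_PERI_CTRL], "clk_sys")

print(hex(with_auxsrc(0x840, "clk_sys")))   # 0x800
```

This skips any care about switching the source cleanly while the clock is running, so treat it as a debugging aid rather than a fix.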

OneMadGypsy

Re: Very Fast ST7789 w/o CS Pure Python Driver

Thu May 13, 2021 11:31 pm

horuable wrote:
Thu May 13, 2021 10:52 pm
...though with SPI running at 125 MHz clock output looks horrible, and is probably totally unusable.
Right, but that's an easy fix. The goal wasn't to run SPI at half the system clock. The goal was to be able to run the SPI at 62.5 MHz regardless of the system clock ~ especially if the system clock is the default clock speed. All we need now is a proper divisor, and voila we have full speed SPI capability, no matter what.

It's probably something as simple as ceil(pllclk/62500000) to get the divisor.
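
A sketch of that divisor calculation using integers only. One thing worth knowing when carrying this over to C: with integer operands, `freq_in/62500000` truncates before `ceil()` ever sees it, so the rounding up has to be done in the division itself. `ceil_div` and `MAXSPI` are illustrative names here.

```python
MAXSPI = 62_500_000

def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up, without going through floats."""
    return (a + b - 1) // b

for clk in (125_000_000, 180_000_000, 250_000_000):
    print(clk, clk // MAXSPI, ceil_div(clk, MAXSPI))
# 125000000 2 2
# 180000000 2 3   <- truncating division loses the round-up
# 250000000 4 4
```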

OneMadGypsy

Re: Very Fast ST7789 w/o CS Pure Python Driver

Fri May 14, 2021 12:18 am

It's not elegant, at all, but my current solution, which works, is the below. I intend to think about this more and create a real solution.

pico-sdk/rp2_common/hardware_spi/spi.c

//this probably exists in math.h or something. I just wrote it out so I would definitely only have to build once if it wasn't
#define min(a,b) \
   ({ __typeof__ (a) _a = (a); \
       __typeof__ (b) _b = (b); \
     _a < _b ? _a : _b; })

uint spi_set_baudrate(spi_inst_t *spi, uint baudrate) {
    uint freq_in = clock_get_hz(clk_sys); //used to be clk_peri
    baudrate = min(baudrate, 62500000);
    //...
}

OneMadGypsy

Re: Very Fast ST7789 w/o CS Pure Python Driver

Fri May 14, 2021 12:32 am

Actually, I found the real answer (I'm pretty sure). I noticed that there was a bunch of "2" action going on here. And "2" is the number that you divide things by to make them half (I really did look at this that rudimentary).

    // Find smallest prescale value which puts output frequency in range of
    // post-divide. Prescale is an even number from 2 to 254 inclusive.
    for (prescale = 2; prescale <= 254; prescale += 2) {
        if (freq_in < (prescale + 2) * 256 * (uint64_t) baudrate)
            break;
    }

So, I created a new number exactly how I stated 2 posts up and replaced the 2 with it. Running the below on a 250 MHz clock gave me a 62.5 MHz baudrate, even though I specifically set the baudrate to 125 MHz. For now, I am going to call this solved. I have a bunch of SPI related work to do and I will monitor the baudrate throughout my work to see if strange things happen.

uint spi_set_baudrate(spi_inst_t *spi, uint baudrate) {
    uint freq_in = clock_get_hz(clk_sys); //used to be clk_peri
    uint prescale, postdiv;
    invalid_params_if(SPI, baudrate > freq_in);
    
    int n = ceil(freq_in/62500000);

    // Find smallest prescale value which puts output frequency in range of
    // post-divide. Prescale is an even number from 2 to 254 inclusive.
    for (prescale = n; prescale <= 254; prescale += n) {
        if (freq_in < (prescale + n) * 256 * (uint64_t) baudrate)
            break;
    }
    //...
}

Edit:
I just set sysclock to 125 MHz and left baudrate at 125 MHz too. The baudrate printed as 62.5 MHz.

Edit2:
I can also confirm that setting baudrate to values less than 62.5 MHz returns the proper division of the clock. For instance: I set the clock to 250 MHz and set baudrate to 34 MHz. The final baudrate was 31.25 MHz. Every test I do is confirming that my simple fix is right. With the clock set at 125 MHz and baudrate 62.5 MHz I was able to render twice as many (60) "Things" on my ST7789 without issue.

Edit 5000:
I finally found an issue, but I know how to solve it. Using a static 62500000 to obtain the divisor is wrong. Let's say you set sys clock to 180 MHz and set baudrate to 62 MHz. You should get 60 MHz but you will actually get 45 MHz. This is because using a static 62500000 only works in multiples of 62500000. A new formula needs to be created which is smarter. I'll get right on it.
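
To see which rates are reachable at all, here is a brute-force sketch. It assumes only the limits stated in the SDK comments quoted earlier (prescale is an even number from 2 to 254, post-divide is 1 to 256) and searches for the smallest total divisor that keeps the output at or below the request. `best_baud` is an illustrative name, not an SDK function.

```python
def best_baud(freq_in: int, requested: int):
    """Return (divisor, hz) for the fastest rate <= requested, or None."""
    best = None
    for prescale in range(2, 255, 2):           # even values only
        for postdiv in range(1, 257):
            divisor = prescale * postdiv
            if freq_in <= requested * divisor:  # output would not exceed the request
                if best is None or divisor < best:
                    best = divisor
                break                           # larger postdiv only gets slower
    return None if best is None else (best, freq_in / best)

print(best_baud(180_000_000, 62_000_000))   # (4, 45000000.0)
print(best_baud(125_000_000, 62_500_000))   # (2, 62500000.0)
print(best_baud(250_000_000, 34_000_000))   # (8, 31250000.0)
```

Because the prescale is even, every total divisor is even, so under these assumptions 180 MHz divided by 3 is not on the menu and 45 MHz is the best the divider can do for a 62 MHz request.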

OneMadGypsy

Re: Very Fast ST7789 w/o CS Pure Python Driver

Fri May 14, 2021 1:22 am

This is my final answer. I have changed baudrate and clk_sys to all kinds of strange numbers and this formula spits out the right answer every time. Apparently, leaving 62500000 is fine. We just have to change how prescale is determined. In this version we let prescale count by ones, and only use n in the condition. If anyone is curious how I figured this out ... I didn't "figure out" jack. I said "what if", made "what if", tested "what if" ~ and it spit out all the numbers I wanted to see. It was all just a guess based on 30 years of guessing. I still haven't even bothered to run through this code printing real numbers to get a humanly readable idea of what exactly is happening. It's all been simple thoughts like "2 is the number you divide things by to make them half, so let's change the 2" :D In the case of this last revision the simple thought was "if prescale starts at 3 and n is 3 that's 6, in the condition, but prescale at 1 and 2 have been skipped. Let's not skip those." :D simple-dimple thoughts. This really does seem to work perfectly, no matter what you throw at it, though.

uint spi_set_baudrate(spi_inst_t *spi, uint baudrate) {
    uint freq_in = clock_get_hz(clk_sys); //used to be clk_peri
    uint prescale, postdiv;
    invalid_params_if(SPI, baudrate > freq_in);
    
    int n = ceil(freq_in/62500000);

    // Find smallest prescale value which puts output frequency in range of
    // post-divide. Prescale is an even number from 2 to 254 inclusive.
    for (prescale = 1; prescale <= 254; prescale += 1) {
        if (freq_in < (prescale + n) * 256 * (uint64_t) baudrate)
            break;
    }
    // ...
}

My Current Test Numbers And Results

clk_sys = 240M, baud= 68M
result baud = 60M

clk_sys = 192M, baud= 52M
result baud = 48M

clk_sys = 192M, baud= 17M
result baud = 16M

clk_sys = 180M, baud= 62M
result baud = 60M

clk_sys = 250M, baud= 33M
result baud = 31.25M

clk_sys = 166M, baud= 42M
result baud = 41.5M


OneMadGypsy

Re: Very Fast ST7789 w/o CS Pure Python Driver

Fri May 14, 2021 2:29 am

Nope, there are still fringe cases where the above won't work (pshhh). I bet HermannSW has the maths. Somebody tell HermannSW to give us the maths, and quit holding out. :D I'm all over the target. I'm in the right spot messing with the right part. I just need the omni-formula that can handle every imaginable number and combination without eliminating any of the range.

OneMadGypsy

Re: Very Fast ST7789 w/o CS Pure Python Driver

Fri May 14, 2021 4:07 am

I think I finally have it. I put the 2's back. It seems I was more on the right track with my very first "inelegant" approach. This was really starting to get on my nerves. I printed ceil(freq_in, MAXSPI) and no matter what it always returned zero. So, F that ceiling code. I wrote my own. I brought min back, too. We're gonna write a whole damn math library if that's what it takes to have functions that exist and work :D.

Anyway, the concept here is simple. First of all, return everything back to normal except the clk_sys change. Now that we have the proper clock and all of the original logic is back in place, let's find out what the absolute best baudrate we can get is. Once we have that number, just let the rest do its job. If our best baudrate is greater than the requested one, then the requested one will be processed, else the best one. This certainly seems to have fixed it. However, I've said that every single time and was still able to find cases that were wrong. Unlikely cases, but existing nonetheless.

I will say that I kept those cases around and used them to test this. They aren't wrong anymore. Even if this isn't the final solution, I'm done with it. It definitely works for the most obvious cases. I'll leave it to someone far better at C, with a knowledge of where things like properly working and existing math functions are, to take this any further. If someone else wants to take this on I can tell you what your litmus test is. Make a clk_sys of 180M divide by 3 to produce a 60M baudrate. That's where my solution fails. You can only get divisions by multiples of 2, and as you can see from all my above attempts, starting on an odd number or trying to iterate by 1 will break some other baud/clk_sys combo.

Code: Select all

#define MAXSPI 62500000

#define ceil_div(a,b) \
   ({ __typeof__ (a) _a = (a); \
       __typeof__ (b) _b = (b); \
     ((_a + _b - 1) / _b); })

#define min(a,b) \
   ({ __typeof__ (a) _a = (a); \
       __typeof__ (b) _b = (b); \
     _a < _b ? _a : _b; })

uint spi_set_baudrate(spi_inst_t *spi, uint baudrate) {
    uint freq_in = clock_get_hz(clk_sys);           //used to be clk_peri
    uint prescale, postdiv;
    
    int d    = ceil_div(freq_in, MAXSPI);           //get divisor
    baudrate = min(baudrate, ceil_div(freq_in, d)); //requested or best whichever is lower
    
    //invalid_params_if(SPI, baudrate > freq_in);   //this is impossible now

    // Find smallest prescale value which puts output frequency in range of
    // post-divide. Prescale is an even number from 2 to 254 inclusive.
    for (prescale = 2; prescale <= 254; prescale += 2) {
        if (freq_in < (prescale + 2) * 256 * (uint64_t) baudrate)
            break;
    }
    
    invalid_params_if(SPI, prescale > 254); // Frequency too low

    // Find largest post-divide which makes output <= baudrate. Post-divide is
    // an integer in the range 1 to 256 inclusive.
    for (postdiv = 256; postdiv > 1; --postdiv) {
        if (freq_in / (prescale * (postdiv - 1)) > baudrate)
            break;
    }
    
    spi_get_hw(spi)->cpsr = prescale;
    hw_write_masked(&spi_get_hw(spi)->cr0, (postdiv - 1) << SPI_SSPCR0_SCR_LSB, SPI_SSPCR0_SCR_BITS);

    // Return the frequency we were able to achieve
    return freq_in / (prescale * postdiv);
}
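For anyone who wants to poke at the numbers without reflashing, here is a plain-Python transcription of the function above. It is only a model of the arithmetic (it touches no registers), and the name `spi_set_baudrate_model` is made up for illustration.

```python
MAXSPI = 62_500_000

def ceil_div(a, b):
    # Integer ceiling division, same idea as the ceil_div macro.
    return (a + b - 1) // b

def spi_set_baudrate_model(freq_in, baudrate):
    """Return (prescale, postdiv, achieved_baud) for a given input clock."""
    # Clamp the request to the best rate that stays at or below MAXSPI.
    d = ceil_div(freq_in, MAXSPI)
    baudrate = min(baudrate, ceil_div(freq_in, d))

    # Smallest even prescale (2..254) that puts the output in post-divide range.
    for prescale in range(2, 255, 2):
        if freq_in < (prescale + 2) * 256 * baudrate:
            break
    else:
        raise ValueError("frequency too low")

    # Largest post-divide (1..256) that keeps the output <= baudrate.
    postdiv = 256
    while postdiv > 1:
        if freq_in // (prescale * (postdiv - 1)) > baudrate:
            break
        postdiv -= 1

    return prescale, postdiv, freq_in // (prescale * postdiv)

for clk, baud in ((125_000_000, 64_000_000), (250_000_000, 64_000_000),
                  (192_000_000, 48_000_000), (180_000_000, 60_000_000)):
    print(clk, baud, spi_set_baudrate_model(clk, baud))
```

With these inputs the model lands on 62.5M for both 125M and 250M clocks, 48M for 192M, and 45M for the 180M/60M litmus case, since the total divisor there comes out as 4 instead of 3.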

To get back on topic regarding my ST7789 display: I have found that running the pico at 192M with an spi baudrate of 48M and a Timer set at 24 tick_hz gives a really clean output and decent performance. I have made sure all my "Things" equal the area of the buffer. So, every frame represents an entire buffer worth of drawing, plus the time to calculate new positions, etc. No screen tearing and very fluid movement of "Things". Turning up any of the numbers I have listed does make the screen run faster but, it has a lot of tearing, ghosting and other anomalies. I think the display can handle 62.5M fundamentally, it just can't refresh fast enough ... or somethin' IDK.
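As a rough back-of-the-envelope check on those numbers (a sketch only, ignoring command overhead and any Python-side drawing time): the buffer in the driver is 115200 bytes, so the time to push one full frame over SPI follows directly from the baudrate. The function name below is made up for illustration.

```python
BUFFER_BYTES = 115200  # 240 * 240 pixels * 2 bytes (16 bits per pixel)

def frame_transfer_ms(baud):
    # Time to clock one full buffer out over SPI, in milliseconds.
    return BUFFER_BYTES * 8 / baud * 1000

timer_period_ms = 1000 / 24  # Timer firing 24 times per second

for baud in (48_000_000, 62_500_000):
    ms = frame_transfer_ms(baud)
    print(f"{baud} baud: {ms:.1f} ms per frame, timer period {timer_period_ms:.1f} ms")
```

At 48M a full buffer takes about 19.2 ms to send, which fits comfortably inside the roughly 41.7 ms between 24 Hz timer ticks. That leaves time for position updates before the next frame is due.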
"Focus is a matter of deciding what things you're not going to do." ~ John Carmack

horuable
Posts: 121
Joined: Sat Mar 06, 2021 12:35 am

Re: Very Fast ST7789 w/o CS Pure Python Driver

Fri May 14, 2021 10:10 am

I admit I'm a little confused now. Have you changed the clk_peri to clk_sys or are you just adjusting the divisor using clk_sys as the reference?

What's the output of this code on your overclocked Pico?

Code: Select all

from machine import mem32

print(hex((mem32[0x40008000 + 0x48] >> 5) & 7))
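For reading the printed value: the one-liner extracts bits 7:5 of the register at that address, which should be the AUXSRC field of CLK_PERI_CTRL. The lookup table below is my own transcription from the RP2040 datasheet, so treat it as an assumption and check it against the datasheet before relying on it. The helper name `peri_source` is hypothetical.

```python
# AUXSRC values for CLK_PERI_CTRL as I read them in the RP2040 datasheet
# (an assumption to verify, not something stated in this thread).
PERI_AUXSRC = {
    0: "clk_sys",
    1: "clksrc_pll_sys",
    2: "clksrc_pll_usb",
    3: "rosc_clksrc_ph",
    4: "xosc_clksrc",
    5: "clksrc_gpin0",
    6: "clksrc_gpin1",
}

def peri_source(ctrl_value):
    # Same bit extraction as the one-liner above: bits 7:5 of the register.
    return PERI_AUXSRC.get((ctrl_value >> 5) & 7, "reserved")

# On a Pico this would be: peri_source(mem32[0x40008000 + 0x48])
print(peri_source(0x840))  # example raw value with AUXSRC = 2
```

If the table is right, a result of 2 means the peripheral clock is still fed from the USB PLL regardless of what clk_sys is doing.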

OneMadGypsy

Re: Very Fast ST7789 w/o CS Pure Python Driver

Fri May 14, 2021 5:22 pm

Code: Select all

uint freq_in = clock_get_hz(clk_sys); //was clk_peri
I'm just specifically telling spi to use clk_sys. I haven't changed anything deeper. I'm on my phone so posting your request isn't possible, but it would be the usb values still. I don't want to make too many changes to pico-sdk, and especially not fundamental ones. Specifically changing spi baudrate logic shouldn't affect the rest of the SDK. If they have some error where clk_peri should always be clk_sys I'll leave it to the developers to cascade the change throughout the sdk. My only goal was to allow the greatest possible realistic baudrate with the least possible clk_sys divisions, regardless of clk_sys speed. I mostly accomplished that. The only exception is if the least possible divisions is an odd number. In which case the divisor will keep going up to the first even number that qualifies. Case in point: at clock 180M a 60M(180/3) baudrate is legit but it will be skipped and 45M(180/4) will be used.
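The odd-divisor limitation falls straight out of the arithmetic: the total divisor is prescale * postdiv, and prescale is always even. A quick sketch (the function name is made up for illustration) that lists the small total divisors this scheme can reach:

```python
def reachable_divisors(limit=16):
    # Total divisor = prescale * postdiv,
    # with prescale even in 2..254 and postdiv in 1..256.
    found = set()
    for prescale in range(2, 255, 2):
        for postdiv in range(1, 257):
            total = prescale * postdiv
            if total <= limit:
                found.add(total)
    return sorted(found)

print(reachable_divisors())  # [2, 4, 6, 8, 10, 12, 14, 16]
print(180_000_000 // 4)      # 45000000, the nearest reachable rate below 60M
```

No odd number ever shows up, so 180M / 3 is out of reach and the next candidate below 60M is 180M / 4.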
"Focus is a matter of deciding what things you're not going to do." ~ John Carmack

horuable

Re: Very Fast ST7789 w/o CS Pure Python Driver

Fri May 14, 2021 6:39 pm

If the clk_peri is set to clksrc_pll_usb then the clock that SPI is running from is still 48 MHz and you'll get only (even) divisions of it.
clock_get_hz(clk_sys) is just reporting the current clk_sys value that will obviously always be equal to the system clock, overclocked or not. But it doesn't change the actual clock source, so the returned frequency, and by proxy, the one shown by SPI object is just wrong, because the clock used for divider calculation is clk_sys, while the clock that really gets divided is clk_peri that in this case is equal to clksrc_pll_usb.

From what I've seen so far, the baudrate reported by the SPI object can be wrong once you start fiddling with clock speeds and/or sources, and the only reliable way of telling what's actually going on is to hook up an oscilloscope or logic analyzer and measure the clock line.

In your example when requesting 60 MHz baudrate from 180 MHz clock you'll get the divider of 4 for sure, but the clock speed will not be the expected 45 MHz, but 12 MHz (48/4) since that divider will be applied to clk_peri.
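The mismatch in that example can be written out as a two-line calculation. This is only a sketch of the arithmetic, and the function name is made up for illustration.

```python
def reported_and_actual(assumed_clk, real_clk_peri, total_divisor):
    # The driver reports assumed_clk / divisor,
    # but the hardware divides whatever really feeds clk_peri.
    return assumed_clk // total_divisor, real_clk_peri // total_divisor

reported, actual = reported_and_actual(180_000_000, 48_000_000, 4)
print(reported, actual)  # 45000000 12000000
```

So the SPI object would claim 45 MHz while the pin toggles at 12 MHz, as long as clk_peri is still sourced from the 48 MHz USB PLL.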

OneMadGypsy

Re: Very Fast ST7789 w/o CS Pure Python Driver

Fri May 14, 2021 7:57 pm

OK, I have a scope. I can do the tests.
"Focus is a matter of deciding what things you're not going to do." ~ John Carmack

OneMadGypsy

Re: Very Fast ST7789 w/o CS Pure Python Driver

Fri May 14, 2021 8:12 pm

Yep. I can confirm what you said. 11.97+ MHz. OK, well, back to the drawing board. Time to go find where this clock is being set. However, all is not lost. We still would have needed the divisor code to keep the SPI speed manageable regardless of user requests. That part is done.

Image
"Focus is a matter of deciding what things you're not going to do." ~ John Carmack

OneMadGypsy

Re: Very Fast ST7789 w/o CS Pure Python Driver

Fri May 14, 2021 9:21 pm

OK, I think I got it. I made the following change to pico-sdk/src/rp2_common/pico_stdlib/stdlib.c - set_sys_clock_pll - line 66

line 3 of this code excerpt

Code: Select all

    clock_configure(clk_peri,
        0, // Only AUX mux on ADC
        CLOCKS_CLK_SYS_CTRL_AUXSRC_VALUE_CLKSRC_PLL_SYS, //CLOCKS_CLK_PERI_CTRL_AUXSRC_VALUE_CLKSRC_PLL_USB,
        48 * MHZ,
        48 * MHZ);

Then from the python end I ran this code

Code: Select all

from trouble import clocks
from machine import SPI, Pin, freq

spi = SPI(1, sck=Pin(10, Pin.OUT), mosi=Pin(11, Pin.OUT), miso=Pin(8, Pin.OUT))
spi.init(baudrate=64000000)

print("before frequency change")
clocks()
print(spi)

freq(250000000)
print("\nafter frequency change")
clocks()
print(spi)

This was my result. Notice that peri is now reporting system clock numbers.

Code: Select all

before frequency change
pllsys: 125001
pllusb: 48000
rosc: 5062
sys: 125000
peri: 125001
usb: 48001
adc: 48001
rtc: 46
SPI(1, baudrate=62500000, polarity=0, phase=0, bits=8, sck=10, mosi=11, miso=8)

after frequency change
pllsys: 250000
pllusb: 48000
rosc: 5049
sys: 250000
peri: 250000
usb: 48000
adc: 48000
rtc: 47
SPI(1, baudrate=62500000, polarity=0, phase=0, bits=8, sck=10, mosi=11, miso=8)

I'm going to put it back on the scope in a minute for confirmation.
"Focus is a matter of deciding what things you're not going to do." ~ John Carmack

OneMadGypsy

Re: Very Fast ST7789 w/o CS Pure Python Driver

Fri May 14, 2021 10:28 pm

It doesn't seem like that's enough. I printed clk_peri in spi_set_baudrate and it's still 48M.
"Focus is a matter of deciding what things you're not going to do." ~ John Carmack

OneMadGypsy

Re: Very Fast ST7789 w/o CS Pure Python Driver

Fri May 14, 2021 11:19 pm

OK, this is a lot, but I think I finally tracked down the entire chain. Let's start with spi. spi_set_baudrate calls clock_get_hz(clk_peri), if we go look at clock_get_hz we see that the value actually comes from configured_freq[clk_index], where in this case clk_index equals clk_peri. Up to this point we have just established where these values are stored.

From there, if we go to set_sys_clock_pll we can see that it calls clock_configure. Now if we really look at that function, all the way at the end it says configured_freq[clk_index] = freq;, which is great. The only problem is, freq is the last argument of clock_configure. If we go back to set_sys_clock_pll we see that when it sets clk_peri, the last argument is 48 * MHZ. So to sum up where we are so far, based on the snippet below ~ yes, changing the 3rd argument switches clk_peri's source over to the system clock, but because the last argument is still 48 * MHZ, the frequency recorded in configured_freq for clk_peri stays at 48M.

Code: Select all

clock_configure(clk_peri,
     0, // Only AUX mux on ADC
     CLOCKS_CLK_SYS_CTRL_AUXSRC_VALUE_CLKSRC_PLL_SYS, //CLOCKS_CLK_PERI_CTRL_AUXSRC_VALUE_CLKSRC_PLL_USB,
     48 * MHZ,
     48 * MHZ);

Based on all of this information, I propose that the solution is actually

Code: Select all

clock_configure(clk_peri,
     0, // Only AUX mux on ADC
     CLOCKS_CLK_SYS_CTRL_AUXSRC_VALUE_CLKSRC_PLL_SYS, 
     freq, freq);
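To make the bookkeeping problem concrete, here is a toy Python model of the chain described above. It is deliberately simplified (the real clock_configure takes more arguments and writes hardware registers), and the dictionaries are stand-ins invented for illustration.

```python
configured_freq = {}   # stand-in for the SDK's configured_freq[] table
hardware_source = {}   # stand-in for the clock mux setting in hardware

def clock_configure(clk, auxsrc, src_freq, freq):
    hardware_source[clk] = auxsrc   # what the hardware actually runs from
    configured_freq[clk] = freq     # what gets remembered for later queries

def clock_get_hz(clk):
    # Only ever returns the remembered value, never measures anything.
    return configured_freq[clk]

# Source switched to the system PLL, recorded frequency left at 48 MHz.
clock_configure("clk_peri", "pll_sys", 48_000_000, 48_000_000)
stale = clock_get_hz("clk_peri")

# Proposed change: pass the new system frequency as well.
freq = 250_000_000
clock_configure("clk_peri", "pll_sys", freq, freq)
fixed = clock_get_hz("clk_peri")

print(stale, fixed)  # 48000000 250000000
```

In the first call the source and the recorded value disagree, which is why spi_set_baudrate kept seeing 48M even after the source was changed.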

My proposition is supported by clocks_init

Code: Select all


    // CLK PERI = clk_sys. Used as reference clock for Peripherals. No dividers so just select and enable
    // Normally choose clk_sys or clk_usb
    clock_configure(clk_peri,
                    0,
                    CLOCKS_CLK_PERI_CTRL_AUXSRC_VALUE_CLK_SYS,
                    125 * MHZ,
                    125 * MHZ);

Maybe my proposition could be slightly altered to use the constants that are used by clocks_init. In my example I am reusing the constants that were already available in set_sys_clock_pll ~ i.e. the function the change is being made in. Big thanks to horuable. If it wasn't for the information they provided, I would have just assumed this was already fixed due to what SPI was reporting.
Last edited by OneMadGypsy on Sat May 15, 2021 6:13 am, edited 1 time in total.
"Focus is a matter of deciding what things you're not going to do." ~ John Carmack
