carlk3
Posts: 57
Joined: Wed Feb 17, 2021 8:46 pm

Re: Context switching breaks SPI/DMA

Tue Mar 16, 2021 7:15 pm

kilograham wrote:
Tue Mar 16, 2021 6:59 pm
...
Yes something needs to be done for sure, and this would work.

What we do in general (when using the hw divider) is to always read the results in a defined order. I think this is mentioned in both the data sheet and the C/C++ book. IRQ save/restore of divider state takes advantage of this to only save/restore if the interrupted task is "mid division", i.e. it hasn't read the results of the last divide (this relies on the fact that the dirty flag on the hw divider is cleared when you read the quotient). You can look (though it isn't well commented) at pico_divider/divider.S
I wonder how Fuzix solves this? I did some quick poking around in the repository, but I'm not seeing an isr_pendsv handler. I'd like to borrow some code!

kilograham
Raspberry Pi Engineer & Forum Moderator
Posts: 608
Joined: Fri Apr 12, 2019 11:00 am
Location: austin tx

Re: Context switching breaks SPI/DMA

Tue Mar 16, 2021 10:25 pm

it's possible they haven't noticed, or maybe their build (which I think might be Make?) accidentally dropped the hardware division support.

are you planning to just fix in a local FreeRTOS clone for now?

carlk3
Posts: 57
Joined: Wed Feb 17, 2021 8:46 pm

Re: Context switching breaks SPI/DMA

Wed Mar 17, 2021 12:04 am

kilograham wrote:
Tue Mar 16, 2021 10:25 pm
it's possible they haven't noticed, or maybe their build (which I think might be Make?) accidentally dropped the hardware division support.
I really know nothing about Fuzix, I've just seen it here in these forums. Maybe it isn't preemptive?
kilograham wrote:
Tue Mar 16, 2021 10:25 pm
are you planning to just fix in a local FreeRTOS clone for now?
As far as I know, there is no official port, although there is an ARM_CM0 port. (This is the only Pico port that I know of: https://github.com/PicoCPP/RPI-pico-FreeRTOS.) So, yes.

Right now I am plodding through the PendSVHandler and divider.S. Slow going, since I don't really know ARM assembly.

incognitum
Posts: 752
Joined: Tue Oct 30, 2018 3:34 pm

Re: Context switching breaks SPI/DMA

Wed Mar 17, 2021 1:57 am

cleverca22 wrote:
Tue Mar 16, 2021 3:25 pm
i think the problem, is that your context switch algo isnt saving/restoring the state of the hardware divider
Or is the problem rather that the sdk allows context switches to happen while divider is being used?
Think other chips disable all interrupts during. E.g. https://github.com/avrxml/asf/blob/mast ... as/divas.c
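
A minimal host-side sketch of what "disable interrupts around the divide" could look like. Everything here is a stand-in: `mock_divider_t`, `irq_save_disable()` and `irq_restore()` are hypothetical names (on an RP2040 the pico-sdk's `save_and_disable_interrupts()` / `restore_interrupts()` would presumably play that role), and the mock produces its result immediately instead of after 8 cycles.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical mock of the divider's operand/result registers. */
typedef struct {
    uint32_t udividend, udivisor, quotient, remainder;
} mock_divider_t;

static mock_divider_t divider;      /* the one shared hardware unit */
static bool irqs_enabled = true;    /* stand-in for PRIMASK */

/* Stand-in for "remember PRIMASK, then cpsid i". */
static bool irq_save_disable(void) {
    bool was_enabled = irqs_enabled;
    irqs_enabled = false;
    return was_enabled;
}

/* Stand-in for putting PRIMASK back the way it was. */
static void irq_restore(bool was_enabled) {
    irqs_enabled = was_enabled;
}

/* Writing the operands starts the divide; the mock finishes it at once.
 * Divide-by-zero is not modelled faithfully, it only avoids UB here. */
static void mock_start_udiv(uint32_t n, uint32_t d) {
    divider.udividend = n;
    divider.udivisor = d;
    divider.quotient = d ? n / d : 0;
    divider.remainder = d ? n % d : n;
}

/* Interrupts stay masked from the first operand write to the last result
 * read, so nothing can touch the divider in between. */
uint32_t guarded_udiv(uint32_t n, uint32_t d, uint32_t *rem) {
    bool state = irq_save_disable();
    mock_start_udiv(n, d);
    *rem = divider.remainder;       /* remainder first, quotient last */
    uint32_t q = divider.quotient;
    irq_restore(state);
    return q;
}
```

The point of saving and restoring the previous state, rather than unconditionally re-enabling, is that the wrapper then also works when called with interrupts already masked.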

cleverca22
Posts: 3741
Joined: Sat Aug 18, 2012 2:33 pm

Re: Context switching breaks SPI/DMA

Wed Mar 17, 2021 2:35 am

incognitum wrote:
Wed Mar 17, 2021 1:57 am
cleverca22 wrote:
Tue Mar 16, 2021 3:25 pm
i think the problem, is that your context switch algo isnt saving/restoring the state of the hardware divider
Or is the problem rather that the sdk allows context switches to happen while divider is being used?
Think other chips disable all interrupts during. E.g. https://github.com/avrxml/asf/blob/mast ... as/divas.c
that would cause the irq entry to be delayed by up to 8 clocks, possibly less

which then makes your irq's unpredictable, screwing with timing of things

it's more predictable (but slower overall), to let division be interrupted, and save/restore, then it takes the exact same amount of time every time

ejolson
Posts: 7232
Joined: Tue Mar 18, 2014 11:47 am

Re: Context switching breaks SPI/DMA

Wed Mar 17, 2021 2:44 am

cleverca22 wrote:
Wed Mar 17, 2021 2:35 am
incognitum wrote:
Wed Mar 17, 2021 1:57 am
cleverca22 wrote:
Tue Mar 16, 2021 3:25 pm
i think the problem, is that your context switch algo isnt saving/restoring the state of the hardware divider
Or is the problem rather that the sdk allows context switches to happen while divider is being used?
Think other chips disable all interrupts during. E.g. https://github.com/avrxml/asf/blob/mast ... as/divas.c
that would cause the irq entry to be delayed by up to 8 clocks, possibly less

which then makes your irq's unpredictable, screwing with timing of things

it's more predictable (but slower overall), to let division be interrupted, and save/restore, then it takes the exact same amount of time every time
How much slower is software compared to using the hardware divider?

incognitum
Posts: 752
Joined: Tue Oct 30, 2018 3:34 pm

Re: Context switching breaks SPI/DMA

Wed Mar 17, 2021 2:58 am

cleverca22 wrote:
Wed Mar 17, 2021 2:35 am
it's more predictable (but slower overall), to let division be interrupted, and save/restore, then it takes the exact same amount of time every time
I read in this thread that the sdk's own irq handling is only doing save/restore when divider is marked busy though.
While clever, a different code path depending on whether the divider is marked busy does not sound like exact same amount of time every time to me either. Or am I misunderstanding how that works?

cleverca22
Posts: 3741
Joined: Sat Aug 18, 2012 2:33 pm

Re: Context switching breaks SPI/DMA

Wed Mar 17, 2021 3:16 am

ejolson wrote:
Wed Mar 17, 2021 2:44 am
How much slower is software compared to using the hardware divider?
when i said slower, i mean the cpu cycles spent saving the divider state to ram upon entry to the irq handler
incognitum wrote:
Wed Mar 17, 2021 2:58 am
I read in this thread that the sdk's own irq handling is only doing save/restore when divider is marked busy though.
While clever, a different code path depending on whether the divider is marked busy does not sound like exact same amount of time every time to me either. Or am I misunderstanding how that works?
if it is doing that, then you get some jitter back, but it's only going to be fast or slow, rather than spread over a whole ~8 cycle range of speeds

you can probably also tune that, and do many other things

linux and many other kernels just ban floating point in irq handlers, so the kernel doesnt have to bother with the whole save/restore for floats during an irq
but doing similar with division is harder, and any function you call could have hidden division...

pica200
Posts: 274
Joined: Tue Aug 06, 2019 10:27 am

Re: Context switching breaks SPI/DMA

Wed Mar 17, 2021 9:02 am

cleverca22 wrote:
Wed Mar 17, 2021 3:16 am
linux and many other kernels just ban floating point in irq handlers, so the kernel doesnt have to bother with the whole save/restore for floats during an irq
but doing similar with division is harder, and any function you call could have hidden division...
They may ban it within the kernel but they still have to deal with it on context switches. Linux and co. use a little trick to save as much time as possible. Hardware fp math gets disabled causing exceptions when a process uses fp instructions for the first time. The kernel will then enable hardware fp for that process. Otherwise it skips save/restore of fp regs.
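
A rough sketch of that lazy scheme, modelled on the host in plain C. It is not how Linux actually implements it: `unit_state_t`, `switch_to()`, `unit_reg()` and the `full_switches` counter are hypothetical names, and the "trap" is simulated by an explicit check, where a real CPU would raise an exception.

```c
#include <stdint.h>
#include <stdbool.h>

#define NUM_TASKS 3

/* Hypothetical coprocessor state (FP regs, divider regs, ...). */
typedef struct { uint32_t regs[4]; } unit_state_t;

static unit_state_t hw_unit;              /* the single physical unit */
static unit_state_t saved[NUM_TASKS];     /* per-task save areas */
static int current_task = 0;
static int unit_owner = 0;                /* task whose state is live in hw_unit */
static bool unit_enabled = true;
static unsigned full_switches = 0;        /* real save/restores performed */

/* Context switch: only flip the enable flag, leave the unit's state alone. */
void switch_to(int task) {
    current_task = task;
    unit_enabled = (task == unit_owner);
}

/* What the "unit disabled" trap handler would do on first use. */
static void on_first_use_trap(void) {
    saved[unit_owner] = hw_unit;          /* park the previous owner's state */
    hw_unit = saved[current_task];        /* bring in the current task's state */
    unit_owner = current_task;
    unit_enabled = true;
    full_switches++;
}

/* Every access passes through this check; hardware does it on a real CPU. */
uint32_t *unit_reg(int i) {
    if (!unit_enabled) on_first_use_trap();
    return &hw_unit.regs[i];
}
```

Tasks that never touch the unit never pay for a save/restore, which is the whole saving; the price is a trap on the first access after each switch away from the owner.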

carlk3
Posts: 57
Joined: Wed Feb 17, 2021 8:46 pm

Re: Context switching breaks division

Sat Mar 20, 2021 12:25 am

kilograham wrote:
Tue Mar 16, 2021 6:59 pm
carlk3 wrote:
Tue Mar 16, 2021 6:42 pm
cleverca22 wrote:
Tue Mar 16, 2021 3:25 pm
i think the problem, is that your context switch algo isnt saving/restoring the state of the hardware divider

kilograham's bit of code switches you to an entirely software driven divider, which is slower, but uses regular arm regs
Makes sense. So, I need to stick hw_divider_save_state() and hw_divider_restore_state() somewhere into xPortPendSVHandler()? (FreeRTOSConfig.h does `#define xPortPendSVHandler isr_pendsv`.)
Yes something needs to be done for sure, and this would work.

What we do in general (when using the hw divider) is to always read the results in a defined order. I think this is mentioned in both the data sheet and the C/C++ book. IRQ save/restore of divider state takes advantage of this to only save/restore if the interrupted task is "mid division", i.e. it hasn't read the results of the last divide (this relies on the fact that the dirty flag on the hw divider is cleared when you read the quotient). You can look (though it isn't well commented) at pico_divider/divider.S
Here is what I came up with:

void xPortPendSVHandler( void )
{

    /* This is a naked function. */

    // Save area:
    //psp->               |    0    
    //                    |    -4    r11
    //                    |    -8    r10
    //                    |    -12    r9
    //pxTopOfStack + 32   |    -16    r8
    //                    |    -20    r7
    //                    |    -24    r6
    //                    |    -28    r5
    //psp - 32            |    -32    r4
    //                    |    -36    SIO_DIV_QUOTIENT
    //                    |    -40    SIO_DIV_REMAINDER
    //                    |    -44    SIO_DIV_UDIVISOR
    //pxTopOfStack->      |    -48    SIO_DIV_UDIVIDEND

    __asm volatile
    (
        "    .syntax unified                        \n"
        "    mrs r0, psp                            \n"
        "                                           \n"
        "    ldr    r3, pxCurrentTCBConst           \n"/* Get the location of the current TCB. */
        "    ldr    r2, [r3]                        \n"
        "                                        \n"
        "    subs r0, r0, #32                    \n"/* Make space for the remaining low registers. */
        "    stm r0!, {r4-r7}                    \n"/* Store the low registers that are not saved automatically. */
        "     mov r4, r8                            \n"/* Store the high registers. */
        "     mov r5, r9                            \n"
        "     mov r6, r10                            \n"
        "     mov r7, r11                            \n"
        "     stm r0!, {r4-r7}                    \n"
        "                                        \n"
        "    subs r0, r0, #48                    \n"/* Make space for divider state. */
        "    str r0, [r2]                        \n"/* Save the new top of stack. */
        "                                        \n"
        /* hw_divider_save_state */
        "    ldr r2, =#0xD0000000                \n"/* SIO_BASE */
        /* wait for results as we can't save signed-ness of operation */
        "MY1:                                    \n"
        "    ldr r1, [r2, #0x00000078]            \n"/* SIO_DIV_CSR_OFFSET (sio.h); re-read on every pass so the loop can see READY change */
        "    lsrs r1, 1                            \n"/* #SIO_DIV_CSR_READY_SHIFT_FOR_CARRY */
        "    bcc MY1                                \n"
        "    ldr r4, [r2, #0x00000060]            \n"/* SIO_DIV_UDIVIDEND_OFFSET */
        "    ldr r5, [r2, #0x00000064]            \n"/* SIO_DIV_UDIVISOR_OFFSET */
        "    ldr r6, [r2, #0x00000074]            \n"/* SIO_DIV_REMAINDER_OFFSET */
        "    ldr r7, [r2, #0x00000070]            \n"/* SIO_DIV_QUOTIENT_OFFSET */
        "    stm r0!, {r4-r7}                    \n"/* Save HW divider state */
        "                                        \n"
        "    push {r3, r14}                        \n"
        "    cpsid i                                \n"
        "    bl vTaskSwitchContext                \n"
        "    cpsie i                                \n"
        "    pop {r2, r3}                        \n"/* lr goes in r3. r2 now holds tcb pointer. */
        "                                        \n"
        "    ldr r1, [r2]                        \n"
        "    ldr r0, [r1]                        \n"/* The first item in pxCurrentTCB is the task top of stack. */
        "                                        \n"
        /* hw_divider_restore_state */
        "    ldr r2, =#0xD0000000                \n"/* SIO_BASE */
        "    ldm r0!, {r4-r7}                    \n"
        "    str r4, [r2, #0x00000060]            \n"/* SIO_DIV_UDIVIDEND_OFFSET */
        "    str r5, [r2, #0x00000064]            \n"/* SIO_DIV_UDIVISOR_OFFSET */
        "    str r6, [r2, #0x00000074]            \n"/* SIO_DIV_REMAINDER_OFFSET */
        "    str r7, [r2, #0x00000070]            \n"/* SIO_DIV_QUOTIENT_OFFSET */
        "                                        \n"
        "    adds r0, r0, #16                    \n"/* Move to the high registers. */
        "    ldm r0!, {r4-r7}                    \n"/* Pop the high registers. */
        "     mov r8, r4                            \n"
        "     mov r9, r5                            \n"
        "     mov r10, r6                            \n"
        "     mov r11, r7                            \n"
        "                                        \n"
        "    msr psp, r0                            \n"/* Remember the new top of stack for the task. */
        "                                        \n"
        "    subs r0, r0, #32                    \n"/* Go back for the low registers that are not automatically restored. */
        "     ldm r0!, {r4-r7}                    \n"/* Pop low registers.  */
        "                                        \n"
        "    bx r3                                \n"
        "                                        \n"
        "    .align 4                            \n"
        "pxCurrentTCBConst: .word pxCurrentTCB    \n"
    );
}
/*-----------------------------------------------------------*/
My starting point was FreeRTOS-Kernel/port.c at main · FreeRTOS/FreeRTOS-Kernel · GitHub. I’ve attached my modified version.

What do you think? [I am a beginner at ARM assembly, so suggestions are welcome!]
Attachments
port.zip
(6.94 KiB) Downloaded 17 times
Last edited by carlk3 on Fri Apr 09, 2021 9:19 pm, edited 1 time in total.

carlk3
Posts: 57
Joined: Wed Feb 17, 2021 8:46 pm

Re: Context switching breaks division

Tue Mar 23, 2021 1:10 am

Does it need to save the signed input registers too?
Last edited by carlk3 on Fri Apr 09, 2021 9:19 pm, edited 1 time in total.

kilograham
Raspberry Pi Engineer & Forum Moderator
Posts: 608
Joined: Fri Apr 12, 2019 11:00 am
Location: austin tx

Re: Context switching breaks SPI/DMA

Tue Mar 23, 2021 1:53 am

Sadly the signed-ness of the divide makes it a little harder than what you are doing; i am/will be working on a PR myself... so will think thru all the edge cases then. You might want to just use the compiler divider until then (but don't let me stop you!)... Might be awesome to make a test case that checks the issue, and see if you can break your fix!

There was a question above - i haven't timed it recently but I think gcc is up to about 45 cycles for a divide on cortex m0+... the hardware divider is 8 + a few.

kilograham
Raspberry Pi Engineer & Forum Moderator
Posts: 608
Joined: Fri Apr 12, 2019 11:00 am
Location: austin tx

Re: Context switching breaks SPI/DMA

Tue Mar 23, 2021 1:58 am

ah, I checked (you had me worried for a sec) and hw_divider_save_state does save/restore the result correctly; however I expect we can do something a bit more efficient.

carlk3
Posts: 57
Joined: Wed Feb 17, 2021 8:46 pm

Re: Context switching breaks division

Wed Mar 24, 2021 3:41 pm

The datasheet says:
To support save and restore on interrupt handler entry/exit (or on e.g. an RTOS context switch), the result registers are
also writable. Writing to a result register will cancel any operation in progress at the time. The DIV_CSR.DIRTY flag can
help make save/restore more efficient: this flag is set when any divider register (operand or result) is written to, and
cleared when the quotient is read.
That seems like a different approach to that used by hw_divider_save_state()/hw_divider_restore_state(). There is no write to a result register in those, as far as I can see. Is it a better approach for PendSV?
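
To make the question concrete, here is a host-side sketch of a DIRTY-driven save/restore. It models only what the quoted paragraph states (any register write sets DIRTY, a quotient read clears it, result registers are writable). `mock_divider_t`, `divider_ctx_t`, `ctx_save()` and `ctx_restore()` are hypothetical names, the divide completes instantly, and signed operation is not modelled at all.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical software model of the divider registers plus DIRTY. */
typedef struct {
    uint32_t udividend, udivisor, quotient, remainder;
    bool dirty;
} mock_divider_t;

/* Per-task save area. */
typedef struct {
    uint32_t udividend, udivisor, quotient, remainder;
    bool valid;                     /* true if the task was mid-division */
} divider_ctx_t;

static mock_divider_t hw;

/* Operand write: starts a divide (finished at once here) and sets DIRTY. */
void div_start(uint32_t n, uint32_t d) {
    hw.udividend = n;
    hw.udivisor = d;
    hw.quotient = d ? n / d : 0;    /* divide-by-zero is not modelled */
    hw.remainder = d ? n % d : n;
    hw.dirty = true;
}

uint32_t div_read_remainder(void) { return hw.remainder; }

/* Reading the quotient clears DIRTY, so it has to be read last. */
uint32_t div_read_quotient(void) { hw.dirty = false; return hw.quotient; }

/* Save only if the outgoing task has results it has not collected yet. */
void ctx_save(divider_ctx_t *c) {
    c->valid = hw.dirty;
    if (!c->valid) return;
    c->udividend = hw.udividend;
    c->udivisor = hw.udivisor;
    c->remainder = div_read_remainder();
    c->quotient = div_read_quotient();   /* also clears DIRTY for the next task */
}

/* Restore by writing operands, then the result registers, quotient last. */
void ctx_restore(const divider_ctx_t *c) {
    if (!c->valid) return;
    hw.udividend = c->udividend;
    hw.udivisor = c->udivisor;
    hw.remainder = c->remainder;
    hw.quotient = c->quotient;
    hw.dirty = true;                /* any register write sets DIRTY */
}
```

A task that has already read its quotient costs one flag test on each side of the switch; only a task caught mid-division pays for the four loads and four stores.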
Last edited by carlk3 on Fri Apr 09, 2021 9:17 pm, edited 1 time in total.

kilograham
Raspberry Pi Engineer & Forum Moderator
Posts: 608
Joined: Fri Apr 12, 2019 11:00 am
Location: austin tx

Re: Context switching breaks SPI/DMA

Thu Mar 25, 2021 12:18 am

yes, hw_divider_save/restore_state are functional but not optimal. they are not actually used by the SDK, and won't be used by my soon to be FreeRTOS PR (what i was saying above is that you could copy them - or call them - for now to get you up and going in the interim)

carlk3
Posts: 57
Joined: Wed Feb 17, 2021 8:46 pm

Re: Context switching breaks division

Thu Mar 25, 2021 3:24 am

kilograham wrote:
Tue Mar 23, 2021 1:53 am
...
Might be awesome to make a test case that checks the issue, and see if you can break your fix!
...
My original test case that failed intermittently (but often) runs fine (for many hours) with the PendSV code above. Obviously not definitive, but it's not a real easy problem to write test cases for.
kilograham wrote:
Thu Mar 25, 2021 12:18 am
yes, hw_divider_save/restore_state are functional but not optimal. they are not actually used by the SDK, and won't be used by my soon to be FreeRTOS PR (what i was saying above is that you could copy them - or call them - for now to get you up and going in the interim)
OK. My PendSV version just embeds the hw_divider_save/restore_state code from pico_divider/divider.S.
Last edited by carlk3 on Fri Apr 09, 2021 9:18 pm, edited 1 time in total.

carlk3
Posts: 57
Joined: Wed Feb 17, 2021 8:46 pm

Re: Context switching breaks SPI/DMA

Sat Mar 27, 2021 9:56 pm

carlk3 wrote:
Thu Mar 25, 2021 3:24 am
kilograham wrote:
Tue Mar 23, 2021 1:53 am
...
Might be awesome to make a test case that checks the issue, and see if you can break your fix!
...
My original test case that failed intermittently (but often) runs fine (for many hours) with the PendSV code above. Obviously not definitive, but it's not a real easy problem to write test cases for.
Specifically, git@github.com:carlk3/divider_context_switch.git (https://github.com/carlk3/divider_context_switch.git). (Was spi_dma. I renamed it and spruced it up a bit.)
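
For anyone who wants the failure mode in a nutshell, the following is not the code in that repository, just a host-side sketch under simplifying assumptions: the `hw` struct is a hypothetical stand-in for the shared divider, the "context switch" is an inline block, and the check is the usual division identity.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared hardware divider's result registers. */
static struct { uint32_t quotient, remainder; } hw;

/* Start a divide; d must be non-zero in this sketch. */
static void div_start(uint32_t n, uint32_t d) {
    hw.quotient = n / d;
    hw.remainder = n % d;
}

/* A "task" that gets preempted between starting a divide and reading it. */
bool divide_survives_preemption(uint32_t n, uint32_t d, bool save_restore) {
    div_start(n, d);

    /* --- simulated context switch to another task --- */
    uint32_t saved_q = hw.quotient, saved_r = hw.remainder;
    div_start(12345u, 67u);             /* the other task divides too */
    if (save_restore) {                 /* what the context switch must add */
        hw.quotient = saved_q;
        hw.remainder = saved_r;
    }
    /* --- back in the original task --- */

    uint32_t r = hw.remainder;
    uint32_t q = hw.quotient;
    return q * d + r == n && r < d;     /* division identity */
}
```

On real hardware the preemption point is a matter of timing, which is why the actual test has to run for a long time before it means anything.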

carlk3
Posts: 57
Joined: Wed Feb 17, 2021 8:46 pm

Re: Context switching breaks division

Thu Apr 01, 2021 6:41 pm

David Crocker asks a good question on the FreeRTOS kernel forum: Extending the xPortPendSVHandler save area for ARM_CM0 Kernel: "The divider takes only 8 cycles to perform a division, so why not just disable interrupts while it is in use? Is interrupt latency so critical in your application that you can’t afford another ~10 cycles of latency?"
Last edited by carlk3 on Fri Apr 09, 2021 9:15 pm, edited 1 time in total.

kilograham
Raspberry Pi Engineer & Forum Moderator
Posts: 608
Joined: Fri Apr 12, 2019 11:00 am
Location: austin tx

Re: Context switching breaks SPI/DMA

Thu Apr 01, 2021 6:55 pm

well then every divide takes twice as long which rather destroys the point... but yes you can protect raw use of the divider by disabling IRQs.

Note as I mentioned above, we are currently working on an RP2040 FreeRTOS "port" ourselves, so we'd rather you don't upstream yours

carlk3
Posts: 57
Joined: Wed Feb 17, 2021 8:46 pm

Re: Context switching breaks division

Fri Apr 02, 2021 2:35 am

kilograham wrote:
Thu Apr 01, 2021 6:55 pm
well then every divide takes twice as long which rather destroys the point...
I don't understand why.
kilograham wrote:
Thu Apr 01, 2021 6:55 pm
Note as I mentioned above, we are currently working on an RP2040 FreeRTOS "port" ourselves, so we'd rather you don't upstream yours
OK.
Last edited by carlk3 on Fri Apr 09, 2021 9:15 pm, edited 1 time in total.

ejolson
Posts: 7232
Joined: Tue Mar 18, 2014 11:47 am

Re: Context switching breaks SPI/DMA

Fri Apr 02, 2021 3:06 am

carlk3 wrote:
Fri Apr 02, 2021 2:35 am
kilograham wrote:
Thu Apr 01, 2021 6:55 pm
well then every divide takes twice as long which rather destroys the point...
I don't understand why.
My understanding is that there is no point in having a hardware divider if it can't be used because saving the state during a context switch takes too long.

This somehow reminds me of a project I was involved with

https://archive.org/details/bitsavers_s ... 1/mode/2up

involving a communications controller.

During development it was discovered the C compiler (Whitesmiths for the 68000) was inserting a multiplication into the interrupt handler and unacceptably slowing things down. I can't remember the details, but I think the solution was to rewrite the routine in assembler.

In the present case, it makes even less sense to have any division operations in a latency sensitive interrupt handler. How difficult would it be to change the C runtime so that code compiled to use the hardware divider could be linked with code compiled for software division without things getting mixed up?
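
As an illustration of the software side of that idea, here is a plain shift-subtract unsigned divide that touches nothing but CPU registers. `soft_udivmod()` and `udivmod_t` are names made up for this sketch, and it is not the routine gcc or the SDK ships; the divide-by-zero result is an arbitrary convention chosen here.

```c
#include <stdint.h>

typedef struct { uint32_t quot, rem; } udivmod_t;

/* Restoring shift-subtract division: 32 iterations, no shared hardware. */
udivmod_t soft_udivmod(uint32_t n, uint32_t d) {
    udivmod_t r = { 0, 0 };
    if (d == 0) {                       /* arbitrary convention for this sketch */
        r.quot = UINT32_MAX;
        r.rem = n;
        return r;
    }
    for (int i = 31; i >= 0; i--) {
        /* Bring down the next dividend bit, most significant first. */
        r.rem = (r.rem << 1) | ((n >> i) & 1u);
        if (r.rem >= d) {               /* divisor fits: subtract, set quotient bit */
            r.rem -= d;
            r.quot |= 1u << i;
        }
    }
    return r;
}
```

An interrupt handler calling something like this explicitly, instead of using `/`, would leave the divider state of the interrupted task alone, at the cost of a much longer and data-independent 32-iteration loop.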

cleverca22
Posts: 3741
Joined: Sat Aug 18, 2012 2:33 pm

Re: Context switching breaks SPI/DMA

Sat Apr 03, 2021 8:23 am

ejolson wrote:
Fri Apr 02, 2021 3:06 am
My understanding is that there is no point in having a hardware divider if it can't be used because saving the state during a context switch takes too long.
on x86, the FPU is in a similar situation
there are 2 tricks linux uses to make things faster

1: just ban all FPU use in the kernel, so you never have to save/restore during an irq, you're returning to the same proc, and the state is left unchanged
2: defer the context-switch of the FPU as well!, when you change to a new thread, disable the FPU (a cpu flag), and remember what proc's state is in the FPU
upon an FPU security violation, do the FPU context switch, and re-enable it

1 is a bit tricky on the rp2040, because you lack a clear user/kernel split, and it's hard to know if a function could be called from an irq context
2 relies on having an allow/disallow flag in the cpu, or wrapping every divider access with a check against a global variable to see if you have permission

lurk101
Posts: 575
Joined: Mon Jan 27, 2020 2:35 pm
Location: Cumming, GA (US)

Re: Context switching breaks SPI/DMA

Fri May 07, 2021 11:26 pm

kilograham wrote:
Thu Apr 01, 2021 6:55 pm
Note as I mentioned above, we are currently working on an RP2040 FreeRTOS "port" ourselves, so we'd rather you don't upstream yours
How is that project coming along?
The old semiconductor paradigms are rapidly becoming a thing of the past.
Today, it's about the best transistors, architectures, and accelerators for the job.

kilograham
Raspberry Pi Engineer & Forum Moderator
Posts: 608
Joined: Fri Apr 12, 2019 11:00 am
Location: austin tx

Re: Context switching breaks SPI/DMA

Fri May 07, 2021 11:53 pm

good; tidying a few things up, but then it will be available as PR pretty soon; i think you will like!

fivdi
Posts: 435
Joined: Sun Sep 23, 2012 8:09 pm
Contact: Website

Re: Context switching breaks SPI/DMA

Sun May 09, 2021 5:48 pm

kilograham wrote:
Fri May 07, 2021 11:53 pm
good; tidying a few things up, but then it will be available as PR pretty soon; i think you will like!
Nice, I look forward to it.
