samsvl
Posts: 4
Joined: Mon Nov 03, 2014 2:30 pm

Streaming 6Mbyte/sec to memory

Mon Nov 03, 2014 2:37 pm

My GNSS front end outputs data at 6 MByte/sec: 8 bits in parallel plus a 6 MHz clock.
Is the RPi capable of streaming this data to memory (either RAM
or SD/MMC) for about 20 sec without loss of data?
Does anybody know of a similar project?

Thanks a lot,

Sam

samsvl
Posts: 4
Joined: Mon Nov 03, 2014 2:30 pm

Re: Streaming 6Mbyte/sec to memory

Wed Apr 01, 2015 7:59 pm

Will the rpi2 do this job?

rst
Posts: 482
Joined: Sat Apr 20, 2013 6:42 pm
Location: Germany

Re: Streaming 6Mbyte/sec to memory

Thu Apr 02, 2015 6:46 pm

samsvl wrote:Will the rpi2 do this job?
With a sample loop reading the GPLEV0 register and storing it to memory I reached about 11 MHz sample rate max. at 900 MHz CPU clock on the Pi 2.
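
For reference, a minimal C sketch of the kind of polling loop described here. The function name and the idea of passing the register in as a pointer are my own assumptions for illustration (the original loop was not posted); on real hardware the pointer would be the mapped GPLEV0 register, and the achievable rate depends on compiler, caches and clock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a GPIO level register back-to-back and store
 * every 32-bit reading. The register is passed in as a pointer so the
 * routine is not tied to one address map. */
static void capture_polled(volatile const uint32_t *gplev0,
                           uint32_t *buf, size_t n)
{
    assert(gplev0 != NULL && buf != NULL);
    for (size_t i = 0; i < n; i++)
        buf[i] = *gplev0;   /* volatile: one real register read per sample */
}
```

Nothing in this loop is synchronised to the source clock, so the spacing between samples is whatever the CPU and bus happen to deliver.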

jdb
Raspberry Pi Engineer & Forum Moderator
Posts: 2460
Joined: Thu Jul 11, 2013 2:37 pm

Re: Streaming 6Mbyte/sec to memory

Fri Apr 03, 2015 6:06 pm

You can, but your main problem will be sampling jitter. You could hammer the GPLEV0 register from the ARM in a very tight loop but you need sub-microsecond accuracy. You'd have to sample all data bits including clock and just "hope" you don't miss an edge.
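
To make the "sample everything including the clock" idea concrete, here is a rough C sketch of the post-processing step: walk the raw snapshots and keep one byte per rising clock edge. The function name and the pin assignment (clk_bit, data_shift) are hypothetical, and whether the data lines are already valid in the first sample that shows the clock high depends on the front end's timing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: scan raw GPIO level snapshots and emit one data
 * byte for each 0 -> 1 transition of the clock pin.
 *   clk_bit    - GPIO number carrying the source clock (assumed)
 *   data_shift - GPIO number of data bit 0; 8 consecutive pins (assumed)
 * Returns the number of bytes written to out. */
static size_t extract_bytes(const uint32_t *samples, size_t n,
                            unsigned clk_bit, unsigned data_shift,
                            uint8_t *out)
{
    assert(samples != NULL && out != NULL);
    if (n == 0)
        return 0;

    size_t count = 0;
    uint32_t prev_clk = (samples[0] >> clk_bit) & 1u;
    for (size_t i = 1; i < n; i++) {
        uint32_t clk = (samples[i] >> clk_bit) & 1u;
        if (clk && !prev_clk)                       /* rising edge seen */
            out[count++] = (uint8_t)(samples[i] >> data_shift);
        prev_clk = clk;
    }
    return count;
}
```

An edge is only detected if both the low and the high phase of the clock get sampled at least once, so the sample rate has to stay comfortably above twice the 6 MHz clock. If the loop stalls for longer than half a clock period, a byte is silently lost.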

If you want to sample 8 bits with a source clock then consider a FT245R-based device - 8-bit plain FIFO with clocked input capability. Readout is via high-speed USB.
Rockets are loud.
https://astro-pi.org

colinh
Posts: 95
Joined: Tue Dec 03, 2013 11:59 pm
Location: Munich

Re: Streaming 6Mbyte/sec to memory

Sun Apr 05, 2015 2:15 am

I think I was getting 37.5 Mwords/sec using DMA with BL7 (from GPLEV0 to memory).

joan
Posts: 15084
Joined: Thu Jul 05, 2012 5:09 pm
Location: UK

Re: Streaming 6Mbyte/sec to memory

Sun Apr 05, 2015 9:06 am

colinh wrote:I think I was getting 37.5 Mwords/sec using DMA with BL7 (from GPLEV0 to memory).
BL7?

colinh
Posts: 95
Joined: Tue Dec 03, 2013 11:59 pm
Location: Munich

Re: Streaming 6Mbyte/sec to memory

Sun Apr 05, 2015 6:01 pm

BL7: DMA_BURST_LEN = 7
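
As a small illustration of what that setting means at register level, here is a C sketch that builds the transfer-information word for a "fixed source, incrementing destination" copy with a given burst length. The macro and function names are made up for this example; the bit positions (DEST_INC at bit 4, BURST_LENGTH in bits 15:12) match the listing in this post.

```c
#include <assert.h>
#include <stdint.h>

/* DMA transfer-information (TI) bit positions used below. */
#define DMA_TI_DEST_INC     (1u << 4)   /* destination address increments */
#define DMA_TI_BURST_SHIFT  12u         /* BURST_LENGTH field, bits 15:12  */

/* Hypothetical helper: TI word for reading one fixed 32-bit register
 * repeatedly and writing the values to consecutive memory words. */
static uint32_t dma_ti_capture(uint32_t burst_len)
{
    assert(burst_len <= 15u);           /* the field is only 4 bits wide */
    return DMA_TI_DEST_INC | (burst_len << DMA_TI_BURST_SHIFT);
}
```

With this helper, dma_ti_capture(7) is the same value as the `1<<DMA_DEST_INC | 7<<DMA_BURST_LENGTH` expression in the control blocks.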

This is something I was trying out last summer, when my JTAG interface broke. I haven't done any RPi programming since then (until now, where I'm trying out qemu).

Sooo, I don't know for certain that the following code snippets work. I might have been in the middle of changing something. My hobby OS might have other things set up. Like L1 and L2 cache.

CAVEAT EMPTOR

Code:

// DMA
.set dma_base,          0x20007000
.set DMA0,              0x000           // DMA channels. Offset from dma_base
.set DMA1,              0x100
.set DMA2,              0x200
.set DMA3,              0x300
.set DMA4,              0x400
.set DMA5,              0x500
.set DMA6,              0x600
.set DMA7,              0x700
.set DMA8,              0x800
.set DMA9,              0x900
.set DMA10,             0xa00
.set DMA11,             0xb00
.set DMA12,             0xc00
.set DMA13,             0xd00
.set DMA14,             0xe00

.set DMA_INT_STATUS,    0xfe0           // bits 0-15 for DMA0-15
.set DMA_ENABLE,        0xff0           // bits 0-14 for DMA0-14

.set dma15_base,        0x20e05000
.set DMA15,             0x000           // DMA15 offset from dma15_base

.set DMA_CS,            0x0             // registers. Offset from DMAn
.set DMA_CONBLK_AD,     0x4
.set DMA_TI,            0x8
.set DMA_SOURCE_AD,     0xC
.set DMA_DEST_AD,       0x10
.set DMA_TXFR_LEN,      0x14
.set DMA_STRIDE,        0x18
.set DMA_NEXTCONBK,     0x1C
.set DMA_DEBUG,         0x20

.set DMA_RESET,                         31      // DMA_CS bits
.set DMA_ABORT,                         30
.set DMA_DISDEBUG,                      29
.set DMA_WAIT_FOR_OUTSTANDING_WRITES,   28
.set DMA_PANIC_PRIORITY,                20      // 23:20
.set DMA_PRIORITY,                      16      // 19:16
.set DMA_ERROR,                          8
.set DMA_WAITING_FOR_OUTSTANDING_WRITES, 6
.set DMA_DREQ_STOPS_DMA,                 5
.set DMA_PAUSED,                         4
.set DMA_DREQ,                           3
.set DMA_INT,                            2
.set DMA_END,                            1
.set DMA_ACTIVE,                         0

.set DMA_NO_WIDE_BURSTS,                26      // DMA_TI (DMA0-6) bits
.set DMA_WAITS,                         21      // 25:21
.set DMA_PERMAP,                        16      // 20:16
.set DMA_BURST_LENGTH,                  12      // 15:12
.set DMA_SRC_IGNORE,                    11
.set DMA_SRC_DREQ,                      10
.set DMA_SRC_WIDTH,                      9
.set DMA_SRC_INC,                        8
.set DMA_DEST_IGNORE,                    7
.set DMA_DEST_DREQ,                      6
.set DMA_DEST_WIDTH,                     5
.set DMA_DEST_INC,                       4
.set DMA_WAIT_RESP,                      3
.set DMA_TDMODE,                         1
.set DMA_INTEN,                          0
                                                // DMA_TXFR_LEN bits
.set DMA_YLENGTH,                       16      // 29:16
.set DMA_XLENGTH,                        0      // 15:0

                                                // DMA_STRIDE bits
.set DMA_D_STRIDE,                      16      // 31:16
.set DMA_S_STRIDE,                       0      // 15:0

.set DMA_LITE,                          28      // DMA_DEBUG bits
.set DMA_VERSION,                       25      // 27:25
.set DMA_STATE,                         16      // 24:16
.set DMA_ID,                             8      // 15:8
.set DMA_OUTSTANDING_WRITES,             4      // 7:4
.set DMA_READ_ERROR,                     2
.set DMA_FIFO_ERROR,                     1
.set DMA_READ_LAST_NOT_SET_ERROR,        0


// System Timer Registers p.172
.set ST_base,               0x20003000
.set ST_CS,                 0x00000000
.set ST_CLO,                0x00000004
.set ST_CHI,                0x00000008
.set ST_C0,                 0x0000000C
.set ST_C1,                 0x00000010
.set ST_C2,                 0x00000014
.set ST_C3,                 0x00000018



    ldr     r12, =ST_base
    ldr     r11, [r12, #ST_CLO]         // 1st time stamp
    push    {r11}

    ldr     r0, =dma_base
    add     r0, #DMA0

    mov     r1, #1024
    ldr     r2, =dma_cbk2
    mov     r3, #1<<DMA_ACTIVE
1:
    str     r2, [r0, #DMA_CONBLK_AD]
    str     r3, [r0, #DMA_CS]
2:
    ldr     r4, [r0, #DMA_CS]
    tst     r4, #1<<DMA_ACTIVE
    bne     2b

    subs    r1, #1                      // loop 1024 of 4 Mbyte copies = 4 GByte in deltaT ~= n secs
    bne     1b                          // BL1: 0x2a2e976 ~= 45 secs = 90 MBytes/s = 22.5 Mwords/s
                                        // BL2: 0x297a33f ~= 43 secs
                                        // BL6: 0x1e049e5 ~= 30 secs
                                        // BL7: 0x18306FB ~= 27 secs = 150 MByte/s = 37.5 Mwords/s ,  0x1830e77 ~= ...
                                        // BL8: 0x403c67a ~= 63 secs
                                        // BL9: 0x403c6a8 ~= 66 secs, 0x403c6eb ~= 65 secs
    ldr     r12, =ST_base
    ldr     r11, [r12, #ST_CLO]         // 2nd time stamp
    pop     {r10}                       // the pushed r11
    sub     r11, r10                    // delta T


.align 5
dma_cbk4:
    .word   1<<DMA_DEST_INC | 7<<DMA_BURST_LENGTH       // TI = SRC_WIDTH not set ie 32 bit transfer
    .word   0x7e003004                                  // SRC = VC bus addr of system_timer.CLO
    .word   0x00100000                                  // DEST= save to phys addr of 1Mbyte
    .word   0x00400000                                  // LEN = transfer 4 Mbytes ie. 1 Mwords
    .word   0                                           // STRIDE
    .word   0                                           // NEXT-CB
    .word   0                                           // padding
    .word   0                                           // padding

dma_cbk2:
    .word   1<<DMA_DEST_INC | 9<<DMA_BURST_LENGTH           // SRC_WIDTH not set ie 32 bit transfer
    .word   0x7e200034                  // VC bus addr of GPLEV0 = values of GPIO pins 0-31
    .word   0x00100000
    .word   0x00400000
    .word   0
    .word   dma_cbk2                    // NEXT-CB = this block again (endless capture); use 0 for the one-shot timing loop above
    .word   0
    .word   0

dma_cbk3:
    .word   1<<DMA_DEST_INC | 1<<DMA_DEST_WIDTH | 1<<DMA_SRC_WIDTH | 1<<DMA_SRC_INC | 9<<DMA_BURST_LENGTH
    .word   0x00100000
    .word   0x00200000
    .word   0x00100000
    .word   0
    .word   0
    .word   0
    .word   0

dma_cbk:
    .word   1<<DMA_TDMODE | 1<<DMA_DEST_INC | 1<<DMA_DEST_WIDTH | 1<<DMA_SRC_WIDTH | 1<<DMA_SRC_INC | 9<<DMA_BURST_LENGTH
    .word   0x48006000
    .word   0x48006000 + 800*2
    .word   1024<<DMA_YLENGTH | (512 * 2)<<DMA_XLENGTH
    .word   ((1920-512)*2)<<DMA_D_STRIDE | ((1920-512)*2)<<DMA_S_STRIDE
    .word   0
    .word   0
    .word   0


samsvl
Posts: 4
Joined: Mon Nov 03, 2014 2:30 pm

Re: Streaming 6Mbyte/sec to memory

Mon Apr 13, 2015 5:36 pm

jdb wrote: You can, but your main problem will be sampling jitter. You could hammer the GPLEV0 register from the ARM in a very tight loop but you need sub-microsecond accuracy. You'd have to sample all data bits including clock and just "hope" you don't miss an edge.

If you want to sample 8 bits with a source clock then consider a FT245R-based device - 8-bit plain FIFO with clocked input capability. Readout is via high-speed USB.

This is what I tried earlier (FT2232H). Unfortunately there was some loss of data, probably because the operating system (I tried both WinXP and Linux) spends too much time on other jobs.

samsvl
Posts: 4
Joined: Mon Nov 03, 2014 2:30 pm

Re: Streaming 6Mbyte/sec to memory

Mon Apr 13, 2015 5:38 pm

colinh wrote:BL7: DMA_BURST_LEN = 7

This is something I was trying out last summer, when my JTAG interface broke. I haven't done any RPi programming since then (until now, where I'm trying out qemu).

Sooo, I don't know for certain that the following code snippets work. I might have been in the middle of changing something. My hobby OS might have other things set up. Like L1 and L2 cache.

CAVEAT EMPTOR

Thanks for your suggestion, I will try this.

