The Secondary Memory Interface (SMI) is available on every Pi version, and offers a very fast parallel interface - over 80 megabytes per second on a ZeroW.
It is very rarely used, due to the lack of publicly available documentation; I've tried to rectify that by posting detailed information and C code to drive an R-2R DAC and an AD9226 ADC; see https://iosoft.blog/raspberry-pi-smi/
Re: Secondary Memory Interface
Great project. Thanks for sharing it. It's a shame there is not a lot of information about the SMI.
Tony
Re: Secondary Memory Interface
Great work. Interesting that there appears to be bus contention on the faster Pi boards, which reduces performance.
- mikronauts
- Posts: 2821
- Joined: Sat Jan 05, 2013 7:28 pm
- Contact: Website
Re: Secondary Memory Interface
Excellent work!
Now I can start playing with the SMI interface.
http://Mikronauts.com - home of EZasPi, RoboPi, Pi Rtc Dio and Pi Jumper @Mikronauts on Twitter
Advanced Robotics, I/O expansion and prototyping boards for the Raspberry Pi
-
- Posts: 2451
- Joined: Sat Aug 18, 2012 2:33 pm
Re: Secondary Memory Interface
if you just want output only, and can deal with the state going low at regular intervals, you can also abuse the DPI interface to do something similar:
viewtopic.php?p=1693217#p1693217
DPI will basically just take a chunk of 24-bit samples in RAM, shove them out 24 GPIO pins, and strobe the hsync/vsync pins after a certain number of samples and rows. it's rated for sample rates as high as 75 MHz, but i have seen one user pushing it to 130 MHz without even realizing the limit was there
the main strength of DPI is that once you configure a given waveform, it will keep repeating it with zero CPU overhead and no DMA involvement
i think SMI is more of an external memory bus, where you need to initiate a read or write action (possibly with the aid of DMA), but SMI can also do input, which DPI can't
i have also been wanting to investigate how SMI works, so i'll definitely be reading this more and doing some of my own experimentation
edit1: having read half the blog post, i can see how this would be of major use for talking to an FPGA, or driving certain chips that use an addr+data based bus, like the wiznet i've used in the past
i can definitely see how you might be able to get full 100 Mbit ethernet out of a Pi Zero by driving a wiznet via SMI, and it would leave the USB interface entirely free to do other things
edit2:
"The first test of a Rpi v4 at 1 MS/s actually produced 1.5 MS/s, so the base SMI clock for RPi v4 must be 1.5 GHz. This means a new set of speed definitions:"
if you poke around in /sys/kernel, you can find the current rate for most of the clocks in a pi
https://gist.github.com/cleverca22/205d ... 4635e1486c this gist shows the contents of /sys/kernel/debug/clk/clk_summary on an rpi4
https://rawgit.com/msperl/rpi-registers ... #CM_SMICTL
https://elinux.org/The_Undocumented_Pi
using the CM_SMICTL and CM_SMIDIV registers, you can freely change what the input clock is (any PLL, or the main crystal), and what the divisor is
it works very similarly to the GPCLK stuff in the official docs
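As a rough illustration of the GPCLK-style layout mentioned above, the sketch below builds the words that would be written to CM_SMICTL and CM_SMIDIV. It assumes the SMI clock registers use the same fields as the documented GPCLK registers (password 0x5a in the top byte, SRC in bits 3:0, ENAB in bit 4, a 12.12 fixed-point divisor). The helper names are made up for this example, and nothing here touches hardware.

```c
#include <assert.h>
#include <stdint.h>

#define CM_PASSWD   0x5a000000u  /* required in bits 31:24 of every write (per the GPCLK docs) */
#define CM_ENAB     (1u << 4)    /* clock enable bit */
#define CM_SRC_MASK 0x0fu        /* clock source select, bits 3:0 */

/* Control word: password, source select, optional enable. */
uint32_t cm_ctl_word(uint32_t src, int enable)
{
    return CM_PASSWD | (src & CM_SRC_MASK) | (enable ? CM_ENAB : 0u);
}

/* Divisor word: 12-bit integer part in bits 23:12, 12-bit fraction in bits 11:0. */
uint32_t cm_div_word(uint32_t divi, uint32_t divf)
{
    assert(divi >= 1 && divi <= 0xfffu);  /* an integer part of 0 would mean divide by zero */
    assert(divf <= 0xfffu);
    return CM_PASSWD | (divi << 12) | divf;
}

/* Resulting clock for a given source frequency and divisor (MASH filtering ignored). */
double cm_out_hz(double src_hz, uint32_t divi, uint32_t divf)
{
    return src_hz / ((double)divi + (double)divf / 4096.0);
}
```

For example, cm_ctl_word(6, 1) selects source 6 (PLLD in the GPCLK numbering, assumed to apply here too) with the clock enabled, and cm_div_word(4, 0) divides it by 4. On real hardware the GPCLK docs also call for stopping the clock and waiting for the BUSY flag to clear before changing the source or divisor.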
Re: Secondary Memory Interface
Thanks very much for the comments; having spent a month staring at SMI registers, it is good to think the code might be of some use.
I'm just a bit concerned at the catastrophic failure of SMI on the Pi 3 and 4 when the DMA doesn't keep up. The ZeroW fails gracefully under these circumstances, just dropping occasional samples, but the 3 and 4 go completely crazy with bus contentions, as if they've flipped from read to write cycles. I'd really like to understand why, before doing any high-speed project on these platforms.
Thanks for the pointers on clock frequency, and the DPI interface, which looks like the ideal way to create a simple waveform synthesiser.
Re: Secondary Memory Interface
there are some performance metrics you may want to look into
first, i found the dram usage metrics on vc4: https://github.com/librerpi/rpi-open-fi ... cc#L72-L78
Code:
#include <stdint.h>
#include <stdio.h>

// SD_IDL and SD_CYC are the SDRAM controller counter registers from the firmware headers linked above
void report_sdram_usage(void) {
    uint32_t idle = SD_IDL;    // cycles the dram spent idle
    uint32_t total = SD_CYC;   // total cycles counted
    SD_IDL = 0;                // writing 0 to IDL clears both counters
    // idle time as a percentage of all cycles (guard against a zero total)
    float idle_percent = total ? 100.0f * (float)idle / (float)total : 0.0f;
    printf("sdram usage: %lu %lu, %f\t", (unsigned long)idle, (unsigned long)total, idle_percent);
}
IDL increases by 1 for every clock cycle the dram spends idle, CYC increases on every clock cycle, and writing 0 to IDL clears both
they are both 28-bit counters, and they don't wrap around on hitting 2^28, so you can easily detect when an overflow happens
if the dram is running at 400 MHz, that means an overflow in just 0.67 seconds, so you need to poll it at least twice a second to get reliable data
the AXI bus also has its own performance counters. i don't know exactly how they work, but linux has working drivers, and this thread has some info on them:
viewtopic.php?f=29&t=274223
i believe the DMA engine is a normal bus master on the AXI bus, and can write to any AXI slave on the SoC, though the VC6 has a more complicated bus layout, which i've not yet mapped
i also don't know what the clock rate is for the DMA engine and AXI bus, so that may also play a factor in how fast data can move around
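The 0.67 second figure in this post is just the counter range divided by the clock rate. The sketch below works that through, and also shows one way the idle/total readings could be turned into a busy fraction. The function names are invented for this example, and treating 2^28 - 1 as the value a full counter sticks at is an assumption (the post only says the counters do not wrap).

```c
#include <assert.h>
#include <stdint.h>

#define SD_COUNTER_BITS 28
#define SD_COUNTER_MAX  ((1u << SD_COUNTER_BITS) - 1u)  /* assumed saturation value */

/* Seconds until a 28-bit cycle counter fills at the given clock rate. */
double sd_counter_fill_seconds(double clock_hz)
{
    assert(clock_hz > 0.0);
    return (double)(1u << SD_COUNTER_BITS) / clock_hz;
}

/* Nonzero if a reading has probably saturated, so the sample should be discarded. */
int sd_counter_saturated(uint32_t value)
{
    return value >= SD_COUNTER_MAX;
}

/* Fraction of cycles the DRAM was busy, from the idle and total cycle counts. */
double sd_busy_fraction(uint32_t idle, uint32_t total)
{
    if (total == 0 || idle > total)
        return 0.0;  /* nothing measured yet, or inconsistent readings */
    return 1.0 - (double)idle / (double)total;
}
```

With a 400 MHz clock, sd_counter_fill_seconds(400e6) gives 2^28 / 400e6, or about 0.671 s, which is why polling every half second leaves some margin.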
Re: Secondary Memory Interface
HY27UF(08_16)2G2B (Rev0.2).pdf
i believe this NAND chip was used as the boot media on an older bcm2835 product (the roku2)
it held both the bootcode.bin and the whole rootfs
and i think it was driven by the SMI controller
according to that pdf, you can send a 29-bit address over 8 lines in 5 clock cycles (page 9). do you know how that works with SMI? or is it using the data lines instead of addr lines?
Re: Secondary Memory Interface
cleverca22 wrote: ↑Sun Jul 19, 2020 3:36 pm
"i believe this NAND chip was used as the boot media on an older bcm2835 product (the roku2)
it held both the bootcode.bin and the whole rootfs
and i think it was driven by the SMI controller"
Yes. The old Roku boxes with a BCM2835 had a NAND that was driven by the SMI peripheral. It loads VC firmware from NAND which then loads u-boot which then loads the kernel etc...
Re: Secondary Memory Interface
trejan wrote: ↑Sun Jul 19, 2020 4:15 pm
"Yes. The old Roku boxes with a BCM2835 had a NAND that was driven by the SMI peripheral. It loads VC firmware from NAND which then loads u-boot which then loads the kernel etc..."
the main question for this thread is how SMI drives a 29-bit address bus with only 6 addr pins. what config registers are needed to send the addr over multiple clocks?
and a secondary question that is more off-topic for this thread: where does the 2835 expect to find bootcode.bin in the nand flash? is it just msdos part tables + fat32 as usual, or something more specialized?
Re: Secondary Memory Interface
cleverca22 wrote: ↑Sun Jul 19, 2020 4:20 pm
"the main question for this thread, is how does SMI drive a 29 bit address bus, with only a 6 addr pins?, what config registers are needed to send the addr over multiple clocks?"
NAND flash doesn't have a dedicated external address bus. Everything is multiplexed onto the data lines. The SMI address lines are used to set the various NAND control lines to tell it to latch the value as a command, address or data etc... Most of the work to talk to the NAND flash is done by the Linux MTD layer.
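A minimal sketch of that multiplexing, not taken from the kernel driver: the 29-bit NAND address is split into five bytes that go out over the 8 data lines, while the SMI address value selects whether the chip latches each byte as a command or as an address. The 12 + 17 bit column/row split follows the usual layout for a 2 Gbit large-page x8 part, and the CLE/ALE bit assignments are assumptions for illustration; both should be checked against the datasheet and the SMI NAND driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mapping of SMI address lines to NAND control pins. */
#define SMI_ADDR_CLE  0x01u  /* command latch enable */
#define SMI_ADDR_ALE  0x02u  /* address latch enable */

#define NAND_COL_BITS 12     /* byte offset within a page (assumed) */
#define NAND_ROW_BITS 17     /* page number (assumed): 12 + 17 = 29 bits */

/* One write on the bus: the SMI address value plus the byte on the data lines. */
struct smi_write {
    uint8_t smi_addr;
    uint8_t data;
};

/* Split a column/row address into the five address cycles, low byte first. */
void nand_addr_cycles(uint32_t column, uint32_t row, struct smi_write out[5])
{
    assert(column < (1u << NAND_COL_BITS));
    assert(row < (1u << NAND_ROW_BITS));
    const uint8_t bytes[5] = {
        (uint8_t)(column & 0xff),        /* column bits 7:0  */
        (uint8_t)((column >> 8) & 0x0f), /* column bits 11:8 */
        (uint8_t)(row & 0xff),           /* row bits 7:0     */
        (uint8_t)((row >> 8) & 0xff),    /* row bits 15:8    */
        (uint8_t)((row >> 16) & 0x01),   /* row bit 16       */
    };
    for (int i = 0; i < 5; i++) {
        out[i].smi_addr = SMI_ADDR_ALE;  /* ALE set: chip latches an address byte */
        out[i].data = bytes[i];
    }
}

/* A command byte is the same kind of write, with CLE set instead of ALE. */
struct smi_write nand_command(uint8_t cmd)
{
    struct smi_write w = { SMI_ADDR_CLE, cmd };
    return w;
}
```

Under these assumptions, a page read would be one command write, the five address writes, a second command write, and then a run of plain data reads, with no dedicated address bus involved at any point.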
cleverca22 wrote: ↑Sun Jul 19, 2020 4:20 pm
"and a secondary question that is more off-topic for this thread, where does the 2835 expect to find bootcode.bin in the nand flash, is it just msdos part tables + fat32 as usual or something more specialized?"
Firmware started at offset 0.
Re: Secondary Memory Interface
trejan wrote: ↑Sun Jul 19, 2020 5:08 pm
"NAND flash doesn't have an dedicated external address bus. Everything is multiplexed onto the data lines. The SMI address lines are used to set the various control lines to tell the NAND flash to latch the value as a command, address or data etc...
Firmware started at offset 0."
ahh, so they are abusing the address lines as control signals, and then just blasting addr+cmd+data out the data lines as multiple writes. i can see how that would work
with that info, it might be possible to wire up some nand flash on some VC4 models and boot from it. i know that the VC6 removed nand boot support, so that one won't be an option
it also gives ideas on how you could use SMI for your own interfaces to things like an FPGA or another pi
edit:
and given how the bootcode.bin has a 512 byte hole at the start, that's the perfect size to shove an MBR table at offset 0, followed by the rest of the bootcode.bin! though it likely lacks wear leveling, so i would want to be careful and use a leveling fs for the partitions. i've not played with MTD much, so i don't know how it works with raw nand and partitions
Re: Secondary Memory Interface
cleverca22 wrote: ↑Sun Jul 19, 2020 5:14 pm
"ahh, so they are abusing the address lines as control signals, and then just blasting addr+cmd+data out the data lines, as multiple writes, i can see how that would work"
I don't think it is abusing it, as this is what the SMI peripheral was designed for.
https://github.com/raspberrypi/linux/bl ... smi_nand.c
cleverca22 wrote: ↑Sun Jul 19, 2020 5:14 pm
"with that info, it might be possible to wire up some nand flash on some VC4 models and boot from it, i know that the VC6 removed nand boot support, so that one wont be an option"
I wouldn't be surprised if the firmware has suffered bitrot for NAND booting, as Raspberry Pi doesn't use it. There is also the question of why you would want to use NAND flash when eMMC already works and doesn't take up as many GPIOs.
cleverca22 wrote: ↑Sun Jul 19, 2020 5:14 pm
"it also gives ideas on how you could use SMI for your own interfaces to things like an fpga or another pi"
SMI isn't ideal for a shared bi-directional bus, as it was designed for the SoC to act as the controller and initiate all transfers. If you want to do Pi-to-Pi communication using SMI, then your FPGA would need to implement a mailbox for each Pi to poll and a chunk of shared memory to pass messages. The overhead of all of this will greatly reduce throughput, and you might as well use USB by that point.
SMI is much more useful with the SoC acting as a controller for a high-speed parallel interface peripheral, such as the ADC in jayben's writeup.
cleverca22 wrote: ↑Sun Jul 19, 2020 5:14 pm
"and given how the bootcode.bin has a 512 byte hole at the start, thats the perfect size to shove an MBR table at offset 0, followed by the rest of the bootcode.bin!, though it likely lacks wear leveling, so i would want to be careful and use a leveling fs for the partitions, but ive not played with MTD much, so i dont know how it works with raw nand and partitions"
There is no partition table for MTD; the kernel for your device will have a hardcoded table to know where each segment starts/ends. An MBR is not needed.
MTD and NAND flash are off-topic for this thread anyway.