PeterO
Posts: 5137
Joined: Sun Jul 22, 2012 4:14 pm

Re: Raspberry Pi 3 B+ lockups

Fri Apr 20, 2018 7:51 am

From https://github.com/raspberrypi/firmware ... 4376902cbd
firmware: platform: Pi3 B+ reduce sdram freq to 450 while investigations are ongoing
See: viewtopic.php?f=28&t=208821
A bit of a "hit" for those of us with machines that are stable at 500 MHz :(

PeterO
Discoverer of the PI2 XENON DEATH FLASH!
Interests: C,Python,PIC,Electronics,Ham Radio (G0DZB),1960s British Computers.
"The primary requirement (as we've always seen in your examples) is that the code is readable. " Dougie Lawson

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 24139
Joined: Sat Jul 30, 2011 7:41 pm

Re: Raspberry Pi 3 B+ lockups

Fri Apr 20, 2018 8:13 am

PeterO wrote:
Fri Apr 20, 2018 7:51 am
From https://github.com/raspberrypi/firmware ... 4376902cbd
firmware: platform: Pi3 B+ reduce sdram freq to 450 while investigations are ongoing
See: viewtopic.php?f=28&t=208821
A bit of a "hit" for those of us with machines that are stable at 500 MHz :(

PeterO
It would be interesting to see performance figures comparing the settings; I suspect there is very little difference.
Principal Software Engineer at Raspberry Pi (Trading) Ltd.
Contrary to popular belief, humorous signatures are allowed. Here's an example...
“I think it’s wrong that only one company makes the game Monopoly.” – Steven Wright

jahboater
Posts: 4835
Joined: Wed Feb 04, 2015 6:38 pm

Re: Raspberry Pi 3 B+ lockups

Fri Apr 20, 2018 8:38 am

PeterO wrote:
Fri Apr 20, 2018 7:51 am
A bit of a "hit" for those of us with machines that are stable at 500 MHz :(
You can always add this to /boot/config.txt:

sdram_freq=500

The schmoo setting is then automatically added.

Mine seems to be completely stable at 500 MHz after various stress tests.
But I do have a 15mm cube heatsink on the memory chip.
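
To double-check which clock overrides a given config.txt actually requests, something like the following minimal Python sketch could be used. It assumes plain `key=value` lines, ignores `[section]` filters, and assumes that a later line overrides an earlier one; the helper name `read_overrides` is made up for illustration. It only reads the text of the file and does not ask the firmware what it really applied.

```python
def read_overrides(config_text, keys=("arm_freq", "sdram_freq")):
    """Return the last value requested for each key in config.txt-style text."""
    found = {}
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if "=" not in line:
            continue                           # skip blank lines and [filters]
        key, value = (part.strip() for part in line.split("=", 1))
        if key in keys:
            found[key] = int(value)            # assumption: later lines win
    return found


# Sample text only; on a real system you would read /boot/config.txt instead.
sample = """
# stability experiments
arm_freq=1200
sdram_freq=500
"""
print(read_overrides(sample))
```

Keeping the parser separate from the file read makes it easy to try against pasted snippets from other people's posts.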

e-raser
Posts: 71
Joined: Sat Apr 04, 2015 1:30 pm

Re: Raspberry Pi 3 B+ lockups

Fri Apr 20, 2018 10:48 am

Mine is dead again, all services down. It is probably in emergency mode after an automatic reboot (no console access currently). And I’m running the latest updates with SDRAM = 450 MHz. Not solved, not even a workaround. When I swap the SD card into my good old, gorgeously stable Pi 2, the system runs like forever; the Pi 3 B+ only lasted ~12 hours after the (manual) reboot I did yesterday following the apt updates.

This. Really. Sucks.
1x Nextcloud & Pi-hole & ... on Raspbian @ Pi (4 4 GB)
1x Kodi media center on LibreELEC @ Pi 3 B+

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 24139
Joined: Sat Jul 30, 2011 7:41 pm

Re: Raspberry Pi 3 B+ lockups

Fri Apr 20, 2018 12:39 pm

e-raser wrote:
Fri Apr 20, 2018 10:48 am
Mine is dead again, all services down. It is probably in emergency mode after an automatic reboot (no console access currently). And I’m running the latest updates with SDRAM = 450 MHz. Not solved, not even a workaround. When I swap the SD card into my good old, gorgeously stable Pi 2, the system runs like forever; the Pi 3 B+ only lasted ~12 hours after the (manual) reboot I did yesterday following the apt updates.

This. Really. Sucks.
We know it's not solved for everyone, and we are still working on it. Dropping SDRAM to 400, does that help?

e-raser
Posts: 71
Joined: Sat Apr 04, 2015 1:30 pm

Re: Raspberry Pi 3 B+ lockups

Fri Apr 20, 2018 2:06 pm

jamesh wrote:
Fri Apr 20, 2018 12:39 pm
e-raser wrote:
Fri Apr 20, 2018 10:48 am
Mine is dead again, all services down. It is probably in emergency mode after an automatic reboot (no console access currently). And I’m running the latest updates with SDRAM = 450 MHz. Not solved, not even a workaround. When I swap the SD card into my good old, gorgeously stable Pi 2, the system runs like forever; the Pi 3 B+ only lasted ~12 hours after the (manual) reboot I did yesterday following the apt updates.

This. Really. Sucks.
We know it's not solved for everyone, and we are still working on it. Dropping SDRAM to 400, does that help?
I have a 2nd Pi 3 B+ here now and will try this one first. Unfortunately, I think we owners can’t check whether two Pis are from the same batch, right? Well, at least I have different resellers and different order times, so fingers crossed. But the 1st one will very likely be sent back to the seller (if you don’t need it for analysis).

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 24139
Joined: Sat Jul 30, 2011 7:41 pm

Re: Raspberry Pi 3 B+ lockups

Fri Apr 20, 2018 2:42 pm

e-raser wrote:
Fri Apr 20, 2018 2:06 pm
jamesh wrote:
Fri Apr 20, 2018 12:39 pm
e-raser wrote:
Fri Apr 20, 2018 10:48 am
Mine is dead again, all services down. It is probably in emergency mode after an automatic reboot (no console access currently). And I’m running the latest updates with SDRAM = 450 MHz. Not solved, not even a workaround. When I swap the SD card into my good old, gorgeously stable Pi 2, the system runs like forever; the Pi 3 B+ only lasted ~12 hours after the (manual) reboot I did yesterday following the apt updates.

This. Really. Sucks.
We know it's not solved for everyone, and we are still working on it. Dropping SDRAM to 400, does that help?
I have a 2nd Pi 3 B+ here now and will try this one first. Unfortunately, I think we owners can’t check whether two Pis are from the same batch, right? Well, at least I have different resellers and different order times, so fingers crossed. But the 1st one will very likely be sent back to the seller (if you don’t need it for analysis).
I'll check to see if we need any more, but we have a few in house now, so unlikely. Return to supplier will probably be the best bet. Sorry about that - hope the new one behaves better!

I do wonder (guessing here) if it's down to the particular wafer that the SoC came from.

bensimmo
Posts: 4187
Joined: Sun Dec 28, 2014 3:02 pm
Location: East Yorkshire

Re: Raspberry Pi 3 B+ lockups

Fri Apr 20, 2018 2:56 pm

Definitely the SoC and not the RAM?

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 24139
Joined: Sat Jul 30, 2011 7:41 pm

Re: Raspberry Pi 3 B+ lockups

Fri Apr 20, 2018 6:27 pm

bensimmo wrote:
Fri Apr 20, 2018 2:56 pm
Definitely the SoC and not the RAM?
I'm no very low level HW expert, so I'll pass on answering that one!

jdb
Raspberry Pi Engineer & Forum Moderator
Posts: 2152
Joined: Thu Jul 11, 2013 2:37 pm

Re: Raspberry Pi 3 B+ lockups

Sat Apr 21, 2018 11:42 am

bensimmo wrote:
Fri Apr 20, 2018 2:56 pm
Definitely the SoC and not the RAM?
When we do qualification testing of a new piece of silicon, we make boards with split lots on. These are special SoCs that have been "skewed" in the semiconductor manufacturing process to emulate the maximum differences in performance/speed/power that you would get with production parts.

The split lots are skewed fast/slow and you can also buy qualification samples of LPDDR2 that are also skewed fast/slow. By testing the matrix of possibilities (FF/SF/FS/SS for RAM/SoC) you map out the performance at each "corner" of the semiconductor process. The issue we're seeing here didn't pop up in pre-production testing, which makes it look a lot more like a batch failure or some other cluster-type issue.
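
As a rough illustration of that test matrix, the four corners can be enumerated in a few lines of Python. This is only a sketch: it assumes the two letters are in RAM/SoC order as written above, and the variable names are invented for the example.

```python
from itertools import product

SKEWS = ("F", "S")  # fast / slow process skew

# Every pairing of RAM skew with SoC skew: FF, FS, SF, SS.
corners = ["".join(pair) for pair in product(SKEWS, repeat=2)]

for corner in corners:
    ram, soc = corner  # assumption: first letter is RAM, second is SoC
    print(f"{corner}: RAM={'fast' if ram == 'F' else 'slow'}, "
          f"SoC={'fast' if soc == 'F' else 'slow'}")
```

Each of the four combinations would then get its own round of stability testing.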

The fact that people are getting two Pis in a row from the same supplier and both exhibit the issue would be extremely unlikely if the failure was randomly distributed.
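
The arithmetic behind that can be sketched in Python. The failure rates below are purely hypothetical (no real figure is given in this thread); the point is only that for independent, randomly distributed faults the chance of two bad boards in a row is the square of the single-board rate.

```python
def both_fail_probability(failure_rate, boards=2):
    """Chance that all `boards` independently drawn units are faulty."""
    return failure_rate ** boards


# Hypothetical failure rates, for illustration only.
for rate in (0.001, 0.01, 0.05):
    chance = both_fail_probability(rate)
    print(f"single-board rate {rate:.1%} -> two in a row {chance:.4%}")
```

If repeat failures from one supplier show up far more often than these numbers suggest, the faults are probably clustered rather than independent.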
Rockets are loud.
https://astro-pi.org

macmpi
Posts: 38
Joined: Tue Dec 15, 2015 9:39 pm

Re: Raspberry Pi 3 B+ lockups

Sat Apr 21, 2018 4:19 pm

Hmm... all this seems to rule out board design/manufacturing rather quickly, no?

Anyhow, let's praise a SW fix to the rescue: thanks for all the hard work!

e-raser
Posts: 71
Joined: Sat Apr 04, 2015 1:30 pm

Re: Raspberry Pi 3 B+ lockups

Sun Apr 22, 2018 7:57 pm

Hi,

just a short "summary" (even if it's maybe a bit too early after only 2 days):

1) Switched to new (2nd) Pi 3 B+, manually set sdram freq back to 500 MHz

2) 20:50 20.04.2018: "sudo MEMTESTER_TEST_MASK=0x1000 memtester 128M"
--> ran for 15 hours with 954 loops - completely stable, no reboots

3) 12:33 21.04.2018: "sudo memtester 512M"
--> after 27 minutes RAM issues (SSH screen session frozen @ "Loop 3, Random Value", system still up - but veeeeeeery slow, maybe due to swapping, according to free -h no RAM left)
--> after 50 minutes @ Loop 4/Walking Ones I aborted the test, system was up but extremely laggy/slow and some services went down (nginx, mysql, ...) - after aborting memtester everything was fine

4) So the new Pi seems to be way better.
The only thing I still see - but that might be memory/swapping (1.5 GB file) related:

/var/log/kern.log (also shown on console screen)

Apr 22 16:03:12 raspberry kernel: [61359.155507] swapper/0: page allocation failure: order:0, mode:0x1080020(GFP_ATOMIC), nodemask=(null)
Apr 22 16:03:15 raspberry kernel: [61359.251047] swapper/0 cpuset=/ mems_allowed=0
Apr 22 16:03:15 raspberry kernel: [61359.323400] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G         C      4.14.34-v7+ #1110
Apr 22 16:03:15 raspberry kernel: [61359.384753] Hardware name: BCM2835
Apr 22 16:03:15 raspberry kernel: [61359.445324] [<8010ffd8>] (unwind_backtrace) from [<8010c240>] (show_stack+0x20/0x24)
Apr 22 16:03:15 raspberry kernel: [61359.506234] [<8010c240>] (show_stack) from [<807840a4>] (dump_stack+0xd4/0x118)
Apr 22 16:03:15 raspberry kernel: [61359.567230] [<807840a4>] (dump_stack) from [<80228268>] (warn_alloc+0xcc/0x17c)
Apr 22 16:03:15 raspberry kernel: [61359.627754] [<80228268>] (warn_alloc) from [<80229428>] (__alloc_pages_nodemask+0x105c/0x11e0)
Apr 22 16:03:15 raspberry kernel: [61359.688349] [<80229428>] (__alloc_pages_nodemask) from [<8027491c>] (new_slab+0x454/0x558)
Apr 22 16:03:15 raspberry kernel: [61359.749393] [<8027491c>] (new_slab) from [<80276760>] (___slab_alloc.constprop.11+0x228/0x2c0)
Apr 22 16:03:15 raspberry kernel: [61359.810705] [<80276760>] (___slab_alloc.constprop.11) from [<8027683c>] (__slab_alloc.constprop.10+0x44/0x90)
Apr 22 16:03:15 raspberry kernel: [61359.872349] [<8027683c>] (__slab_alloc.constprop.10) from [<80276fd4>] (kmem_cache_alloc+0x1f4/0x230)
Apr 22 16:03:15 raspberry kernel: [61359.934059] [<80276fd4>] (kmem_cache_alloc) from [<806735e8>] (__alloc_skb+0x4c/0x144)
Apr 22 16:03:16 raspberry kernel: [61359.995596] [<806735e8>] (__alloc_skb) from [<806771e8>] (__netdev_alloc_skb+0x50/0x158)
Apr 22 16:03:16 raspberry kernel: [61360.057025] [<806771e8>] (__netdev_alloc_skb) from [<8059ec2c>] (rx_submit.constprop.8+0x34/0x1e4)
Apr 22 16:03:16 raspberry kernel: [61360.118575] [<8059ec2c>] (rx_submit.constprop.8) from [<8059ef80>] (rx_complete+0x1a4/0x1a8)
Apr 22 16:03:16 raspberry kernel: [61360.180008] [<8059ef80>] (rx_complete) from [<805aefe0>] (__usb_hcd_giveback_urb+0x80/0x160)
Apr 22 16:03:16 raspberry kernel: [61360.241375] [<805aefe0>] (__usb_hcd_giveback_urb) from [<805af10c>] (usb_hcd_giveback_urb+0x4c/0xfc)
Apr 22 16:03:16 raspberry kernel: [61360.302644] [<805af10c>] (usb_hcd_giveback_urb) from [<805d9264>] (completion_tasklet_func+0x6c/0x98)
Apr 22 16:03:16 raspberry kernel: [61360.363989] [<805d9264>] (completion_tasklet_func) from [<805e83b4>] (tasklet_callback+0x20/0x24)
Apr 22 16:03:16 raspberry kernel: [61360.425505] [<805e83b4>] (tasklet_callback) from [<80123bbc>] (tasklet_hi_action+0x74/0x10c)
Apr 22 16:03:16 raspberry kernel: [61360.487122] [<80123bbc>] (tasklet_hi_action) from [<80101694>] (__do_softirq+0x18c/0x3d8)
Apr 22 16:03:16 raspberry kernel: [61360.548731] [<80101694>] (__do_softirq) from [<80123794>] (irq_exit+0xe0/0x144)
Apr 22 16:03:16 raspberry kernel: [61360.610510] [<80123794>] (irq_exit) from [<80175534>] (__handle_domain_irq+0x70/0xc4)
Apr 22 16:03:16 raspberry kernel: [61360.672536] [<80175534>] (__handle_domain_irq) from [<80101504>] (bcm2836_arm_irqchip_handle_irq+0xa8/0xac)
Apr 22 16:03:16 raspberry kernel: [61360.734882] [<80101504>] (bcm2836_arm_irqchip_handle_irq) from [<8079fcbc>] (__irq_svc+0x5c/0x7c)
Apr 22 16:03:16 raspberry kernel: [61360.797532] Exception stack(0x80c01ef0 to 0x80c01f38)
Apr 22 16:03:16 raspberry kernel: [61360.860571] 1ee0:                                     00000000 ee03b258 3a3a9000 00000000
Apr 22 16:03:16 raspberry kernel: [61360.923947] 1f00: 80c00000 80c03dcc 80c03d68 80c88172 00000001 80b60a30 bb7ffa40 80c01f4c
Apr 22 16:03:16 raspberry kernel: [61360.987636] 1f20: 80c04174 80c01f40 80108a4c 80108a50 60000013 ffffffff
Apr 22 16:03:16 raspberry kernel: [61361.051417] [<8079fcbc>] (__irq_svc) from [<80108a50>] (arch_cpu_idle+0x34/0x4c)
Apr 22 16:03:16 raspberry kernel: [61361.115545] [<80108a50>] (arch_cpu_idle) from [<8079f434>] (default_idle_call+0x34/0x48)
Apr 22 16:03:16 raspberry kernel: [61361.180149] [<8079f434>] (default_idle_call) from [<801611cc>] (do_idle+0xd8/0x150)
Apr 22 16:03:16 raspberry kernel: [61361.243845] [<801611cc>] (do_idle) from [<801614e0>] (cpu_startup_entry+0x28/0x2c)
Apr 22 16:03:16 raspberry kernel: [61361.306412] [<801614e0>] (cpu_startup_entry) from [<80799184>] (rest_init+0xbc/0xc0)
Apr 22 16:03:16 raspberry kernel: [61361.367752] [<80799184>] (rest_init) from [<80b00df8>] (start_kernel+0x3d4/0x3e0)
Apr 22 16:03:16 raspberry kernel: [61361.428015] Mem-Info:
Apr 22 16:03:16 raspberry kernel: [61361.487314] active_anon:111703 inactive_anon:111675 isolated_anon:183
Apr 22 16:03:16 raspberry kernel: [61361.487314]  active_file:528 inactive_file:624 isolated_file:0
Apr 22 16:03:16 raspberry kernel: [61361.487314]  unevictable:440 dirty:0 writeback:8465 unstable:0
Apr 22 16:03:16 raspberry kernel: [61361.487314]  slab_reclaimable:4247 slab_unreclaimable:4666
Apr 22 16:03:16 raspberry kernel: [61361.487314]  mapped:4424 shmem:4135 pagetables:2055 bounce:0
Apr 22 16:03:16 raspberry kernel: [61361.487314]  free:944 free_pcp:328 free_cma:0
Apr 22 16:03:16 raspberry kernel: [61361.835271] Node 0 active_anon:446812kB inactive_anon:446700kB active_file:2112kB inactive_file:2496kB unevictable:1760kB isolated(anon):732kB isolated(file):0kB mapped:17696kB dirty:0kB writeback:33860kB shmem:16540kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
Apr 22 16:03:16 raspberry kernel: [61361.955015] Normal free:3776kB min:3900kB low:4872kB high:5844kB active_anon:446812kB inactive_anon:446700kB active_file:2076kB inactive_file:2472kB unevictable:1760kB writepending:33980kB present:983040kB managed:961632kB mlocked:1760kB kernel_stack:2696kB pagetables:8220kB bounce:0kB free_pcp:1312kB local_pcp:352kB free_cma:0kB
Apr 22 16:03:16 raspberry kernel: [61362.146908] lowmem_reserve[]: 0 0
Apr 22 16:03:16 raspberry kernel: [61362.211899] Normal: 10*4kB (H) 10*8kB (H) 10*16kB (H) 4*32kB (H) 4*64kB (H) 1*128kB (H) 1*256kB (H) 1*512kB (H) 0*1024kB 1*2048kB (H) 0*4096kB = 3608kB
Apr 22 16:03:16 raspberry kernel: [61362.277609] 24605 total pagecache pages
Apr 22 16:03:16 raspberry kernel: [61362.343908] 18973 pages in swap cache
Apr 22 16:03:16 raspberry kernel: [61362.409428] Swap cache stats: add 909699, delete 890700, find 9709847/10167255
Apr 22 16:03:16 raspberry kernel: [61362.474711] Free swap  = 1013500kB
Apr 22 16:03:16 raspberry kernel: [61362.539562] Total swap = 1572860kB
Apr 22 16:03:16 raspberry kernel: [61362.603691] 245760 pages RAM
Apr 22 16:03:16 raspberry kernel: [61362.668193] 0 pages HighMem/MovableOnly
Apr 22 16:03:16 raspberry kernel: [61362.732620] 5352 pages reserved
Apr 22 16:03:16 raspberry kernel: [61362.796170] 2048 pages cma reserved
Apr 22 16:03:16 raspberry kernel: [61362.858719] SLUB: Unable to allocate memory on node -1, gfp=0x1080020(GFP_ATOMIC)
Apr 22 16:03:16 raspberry kernel: [61362.922256]   cache: kmalloc-192, object size: 192, buffer size: 192, default order: 0, min order: 0
Apr 22 16:03:16 raspberry kernel: [61362.986585]   node 0: slabs: 616, objs: 12936, free: 0
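
One way to sanity-check the memory-pressure theory is to compare the zone's free memory against its watermarks in the log above. The following Python sketch does that on a shortened copy of the "Normal" line; the helper name `watermarks` is made up, and the reading that this points to memory pressure rather than faulty RAM is an assumption, not a diagnosis.

```python
import re

# Shortened copy of the "Normal" zone line from the log above.
zone_line = "Normal free:3776kB min:3900kB low:4872kB high:5844kB"


def watermarks(line):
    """Extract the free/min/low/high figures (in kB) from a zone line."""
    pairs = re.findall(r"\b(free|min|low|high):(\d+)kB", line)
    return {name: int(kb) for name, kb in pairs}


marks = watermarks(zone_line)
print(marks)
print("free below min watermark:", marks["free"] < marks["min"])

# Cross-check against the Mem-Info page count: 944 free pages of 4 kB each.
print(944 * 4, "kB free according to the page count")
```

Free memory sitting below the min watermark while 1.5 GB of swap is in use would fit an out-of-memory situation during an atomic network buffer allocation.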

bensimmo
Posts: 4187
Joined: Sun Dec 28, 2014 3:02 pm
Location: East Yorkshire

Re: Raspberry Pi 3 B+ lockups

Sun Apr 22, 2018 9:16 pm

jdb wrote:
Sat Apr 21, 2018 11:42 am
bensimmo wrote:
Fri Apr 20, 2018 2:56 pm
Definitely the SoC and not the RAM?
When we do qualification testing of a new piece of silicon, we make boards with split lots on. These are special SoCs that have been "skewed" in the semiconductor manufacturing process to emulate the maximum differences in performance/speed/power that you would get with production parts.

The split lots are skewed fast/slow and you can also buy qualification samples of LPDDR2 that are also skewed fast/slow. By testing the matrix of possibilities (FF/SF/FS/SS for RAM/SoC) you map out the performance at each "corner" of the semiconductor process. The issue we're seeing here didn't pop up in pre-production testing, which makes it look a lot more like a batch failure or some other cluster-type issue.

The fact that people are getting two Pis in a row from the same supplier and both exhibit the issue would be extremely unlikely if the failure was randomly distributed.
Thanks, (is that the shmoo plot that you plot out ?)

My RAM question was with reference to Jamesh's "I do wonder (guessing here) if it's down to the particular wafer that the SoC came from.".
I asked as most seem to be asking for RAM 'slowing down' and just wondered.

edit: typo on shmoo.
Last edited by bensimmo on Tue Apr 24, 2018 3:38 pm, edited 1 time in total.

YorkshireTyke
Posts: 11
Joined: Wed May 24, 2017 12:28 pm
Location: Cambs. U.K.

Re: Raspberry Pi 3 B+ lockups

Tue Apr 24, 2018 7:01 am

In reply....
jamesh wrote:
Have you tried the 450 SDRAM freq setting and run memtester to exercise it?
Over the last couple of days I have been doing some more testing.
(1) Flashed NOOBS onto an SD card, ran sudo apt-get update/upgrade, then ran memtester. It crashed within 5 minutes!
(2) As above but setting arm_freq=1200 & sdram_freq=450. Ran memtester for about 7 hours.
(3) As above but attached my IQaudio PiDAC+ hat. Using it normally as I would as my media player for streaming BBC radio programmes, podcasts & BBC iPlayer catch-up watching a film or two. So far no problems.

So what is the underlying problem that setting arm_freq & sdram_freq seems to fix and is this problem only limited to a few RPi 3B+ boards?

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 24139
Joined: Sat Jul 30, 2011 7:41 pm

Re: Raspberry Pi 3 B+ lockups

Tue Apr 24, 2018 8:57 am

YorkshireTyke wrote:
Tue Apr 24, 2018 7:01 am
In reply....
jamesh wrote:
Have you tried the 450 SDRAM freq setting and run memtester to exercise it?
Over the last couple of days I have been doing some more testing.
(1) Flashed NOOBS onto an SD card, ran sudo apt-get update/upgrade, then ran memtester. It crashed within 5 minutes!
(2) As above but setting arm_freq=1200 & sdram_freq=450. Ran memtester for about 7 hours.
(3) As above but attached my IQaudio PiDAC+ hat. Using it normally as I would as my media player for streaming BBC radio programmes, podcasts & BBC iPlayer catch-up watching a film or two. So far no problems.

So what is the underlying problem that setting arm_freq & sdram_freq seems to fix and is this problem only limited to a few RPi 3B+ boards?
Not sure of the exact reason - I'm not a silicon level HW engineer! Probably a timing/voltage issue between the memory controller and the SDRAM.

It is limited to a few boards - perhaps with chips made from a single wafer that was slightly out of spec? I dunno.

Anyway, we are still tweaking the numbers to try and get as many working correctly as possible. It is possible that some will be too far out of spec to get working reliably; I suspect all those will be replaced.

jdb
Raspberry Pi Engineer & Forum Moderator
Posts: 2152
Joined: Thu Jul 11, 2013 2:37 pm

Re: Raspberry Pi 3 B+ lockups

Tue Apr 24, 2018 9:55 am

bensimmo wrote:
Sun Apr 22, 2018 9:16 pm


Thanks, (is that the scmoo plot that you plot out ?)

My RAM was with reference to Jamesh's "I do wonder (guessing here) if it's down to the particular wafer that the SoC came from.".
I asked as most seem to be asking for RAM 'slowing down' and just wondered.
The Shmoo plot maps out the "stable area" of a set of N adjustable hardware parameters - the config.txt sdram_schmoo=0xN setting twiddles various low-level bits inside the SDRAM PHY to adjust timings/thresholds/drive strengths. Iterating over these settings (which are usually not entirely independent of each other) lets you build a plot of stability across the various knobs, chip silicon speed and voltage.

The fact that some boards are stable only with reduced arm_freq *and* sdram_freq is likely an example of interdependence (or a fault that causes interdependence) - the ARM cores are nowhere near the SDRAM controller on the die and derive clocks from different sources.
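
For anyone who hasn't seen one, a Shmoo plot over two knobs can be sketched in Python like this. Everything in it is made up for illustration: the pass/fail rule, the voltage and frequency ranges, and the function names are assumptions, and a real plot would come from running a stress test at every grid point rather than from a formula.

```python
def stable(sdram_mhz, millivolts):
    """Made-up pass/fail model: higher SDRAM clocks need more voltage."""
    return millivolts >= 1000 + (sdram_mhz - 400) * 2


freqs = list(range(400, 551, 50))        # MHz, hypothetical sweep
volts = [1300, 1250, 1200, 1150, 1100]   # mV, hypothetical sweep

rows = []
for mv in volts:
    # One character per frequency: P = passed, . = failed.
    cells = "".join("P" if stable(f, mv) else "." for f in freqs)
    rows.append(f"{mv} mV  {cells}")

print("\n".join(rows))
print("columns (MHz):", " ".join(str(f) for f in freqs))
```

The boundary between the P and . regions is the edge of the stable area; adding a third knob turns the grid into a stack of such plots.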

bensimmo
Posts: 4187
Joined: Sun Dec 28, 2014 3:02 pm
Location: East Yorkshire

Re: Raspberry Pi 3 B+ lockups

Tue Apr 24, 2018 3:42 pm

It may be a royal pain in the head for you and time consuming (especially after launch) but it's quite interesting from my point of view, I'm sure others are reading and following too. :-)

baallrog
Posts: 9
Joined: Mon Jul 08, 2013 7:44 pm

Re: Raspberry Pi 3 B+ lockups

Tue Apr 24, 2018 4:35 pm

Hi,

I've got the same problem here.
I've got LibreELEC on the RPi 3B+ and it freezes after a few minutes. This is not random at all.
Adding arm_freq=1200 seems to fix these freezes for me.

If you want some details about the board like serial number or something else, I can provide it.

Hope this can help.
And thank you guys for all the work you are doing.

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 24139
Joined: Sat Jul 30, 2011 7:41 pm

Re: Raspberry Pi 3 B+ lockups

Tue Apr 24, 2018 4:47 pm

baallrog wrote:
Tue Apr 24, 2018 4:35 pm
Hi,

I've got the same problem here.
I've got LibreELEC on the RPi 3B+ and it freezes after a few minutes. This is not random at all.
Adding arm_freq=1200 seems to fix these freezes for me.

If you want some details about the board like serial number or something else, I can provide it.

Hope this can help.
And thank you guys for all the work you are doing.
Does dropping the SDRAM to 450 and putting the ARM frequency back up to 1400 still work?

baallrog
Posts: 9
Joined: Mon Jul 08, 2013 7:44 pm

Re: Raspberry Pi 3 B+ lockups

Tue Apr 24, 2018 5:42 pm

jamesh wrote:
Tue Apr 24, 2018 4:47 pm
baallrog wrote:
Tue Apr 24, 2018 4:35 pm
Hi,

I've got the same problem here.
I've got LibreELEC on the RPi 3B+ and it freezes after a few minutes. This is not random at all.
Adding arm_freq=1200 seems to fix these freezes for me.

If you want some details about the board like serial number or something else, I can provide it.

Hope this can help.
And thank you guys for all the work you are doing.
Does dropping the SDRAM to 450 and putting the ARM frequency back up to 1400 still work?
Adding sdram_freq=450 doesn't help for me.
It keeps freezing.

Back to arm_freq=1200.

mushu999
Posts: 29
Joined: Sun Aug 20, 2017 11:24 pm

Re: Raspberry Pi 3 B+ lockups

Tue Apr 24, 2018 7:08 pm

The update/upgrade dropped my RAM freq to 450 but kept my SoC speed at 1400 and I've been stable for several days now, where before it always locked up within a few hours.

blachanc
Posts: 458
Joined: Sat Jan 26, 2013 5:03 am
Location: Quebec,canada(french)

Re: Raspberry Pi 3 B+ lockups

Tue Apr 24, 2018 8:13 pm

jdb wrote:
Sat Apr 21, 2018 11:42 am
bensimmo wrote:
Fri Apr 20, 2018 2:56 pm
Definitely the SoC and not the RAM?
When we do qualification testing of a new piece of silicon, we make boards with split lots on. These are special SoCs that have been "skewed" in the semiconductor manufacturing process to emulate the maximum differences in performance/speed/power that you would get with production parts.

The split lots are skewed fast/slow and you can also buy qualification samples of LPDDR2 that are also skewed fast/slow. By testing the matrix of possibilities (FF/SF/FS/SS for RAM/SoC) you map out the performance at each "corner" of the semiconductor process. The issue we're seeing here didn't pop up in pre-production testing, which makes it look a lot more like a batch failure or some other cluster-type issue.

The fact that people are getting two Pis in a row from the same supplier and both exhibit the issue would be extremely unlikely if the failure was randomly distributed.
Agreed, but as you know, corner lots are sometimes not controlled as tightly as we would wish.
I guess you cannot (secret sauce) answer this question:
Is the info about the specific wafer ID / X and Y coordinates burned into a per-die OTP fuse at wafer probe?
I am asking because that is very valuable info to have in conjunction with the wafer process parameters.
It makes things easier when trying to correlate with corner lot results.

Anyway, I feel your pain. Good luck ;)
Autism/Asperger syndrome: what is your score on this quiz?
http://www.raspberrypi.org/forums/viewtopic.php?f=62&t=70191

dom
Raspberry Pi Engineer & Forum Moderator
Posts: 5370
Joined: Wed Aug 17, 2011 7:41 pm
Location: Cambridge

Re: Raspberry Pi 3 B+ lockups

Wed Apr 25, 2018 7:09 pm

blachanc wrote:
Tue Apr 24, 2018 8:13 pm
I guess you cannot (secret sauce) answer this question:
Is the info about the specific wafer ID / X and Y coordinates burned into a per-die OTP fuse at wafer probe?
I am asking because that is very valuable info to have in conjunction with the wafer process parameters.
It makes things easier when trying to correlate with corner lot results.
No, but I agree it would be useful information.
The Pi boards do have a QR sticker on them, which allows tracing of the place/time of production.
That may help to narrow things down, but probably not with the precision we'd like.

mushu999
Posts: 29
Joined: Sun Aug 20, 2017 11:24 pm

Re: Raspberry Pi 3 B+ lockups

Thu Apr 26, 2018 1:03 am

So then the logical next question is, what do we do if we want full usability on our defective Pi? Return it to the vendor as defective, paying shipping costs both ways, and hope the new one doesn't have the same issues? Mail it to Pi HQ and have them mail back a known good one? Suffer in silence and be glad that it was only USD$35+tax+shipping?

/sad

vicw926a4
Posts: 1
Joined: Wed Apr 25, 2018 5:23 pm

Re: Raspberry Pi 3 B+ lockups

Thu Apr 26, 2018 2:17 am

I'm new to the RPi world, having just received my 3B+ last week. My usage will likely be just running Vera Concierge to support Google Home on a 24/7 schedule. On my first few days of use, the RPi hung or crashed after 15 to 24 hours of use. Since I updated to the 4/8/18 Raspbian release, it is now into its 4th day of continuous use with no problems.

The only change between the versions that I'm aware of is the drop in the SDRAM frequency from 500 to 450, and it looks like it may be giving me the stability I need. I think I can probably live with that, as long as the stability continues. I could return it for exchange, hoping to get one with no issues, but right now it's possible that the replacement could be more vulnerable. Also, we're working with a $35 device, so eventually buying a replacement after inventories of the marginal ones have been exhausted wouldn't be such a huge deal.

The good news, as I see it, is that Raspberry Pi has been totally engaged in finding a viable solution, and sharing their efforts via this thread. With most new products, this kind of situation usually first involves a string of denials, and if finally addressed and resolved, it is done in a cloud of non-admissions. While it's disappointing that the 3B+ was released with this problem, I'm really impressed with their transparency in dealing with it.
