hamishtmb
Posts: 9
Joined: Tue Nov 10, 2020 4:59 pm

dtoverlay=dwc2 causing errors and kernel OOPS on Raspberry Pi 1 Model B (256 MB RAM)

Tue Nov 10, 2020 5:26 pm

Hi,

I've been using the DWC2 USB driver on a Pi 3 B+ for a while, and a charity I volunteer at is also using this option for several Pi Zeros.

I recently deployed an old Pi 1 Model B rev 1 (256 MB RAM) and have found that using DWC2 is causing lockups and errors in dmesg (see the screenshots below). Both Pis are automatically updated, and the SD card for the Pi 1 is a clone of the card from the Pi 3, so I don't think it's a configuration issue.

This seems to happen especially during high USB I/O (I was running zerofree on a connected external HDD). I am using a USB y-cable with a 1A power supply for the HDD. Note that the dwc2 error occurred later when I was no longer running zerofree and the system was idle.

My kernel version is 5.4.72+ on the Pi 1 B, and 5.4.72-v8+ on the Pi 3 B+ (I'm using the 64-bit kernel for BOINC computing).

Has anyone else experienced this, and can anyone potentially help me find the cause? I can mess around with this pi without an issue, but the ones the charity uses need to work in a water sustainability project so if they start to experience issues it will be a problem.

Kernel OOPS messages are below. Please note I've now been running for a few weeks without DWC2 and have had no issues - I'm pretty sure this is a driver issue, not a power issue.
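
Since the Pi 1 card is a clone of the Pi 3 card, one quick sanity check is to list which overlays each card's config.txt actually enables. The following is only a minimal sketch: `active_overlays` is a hypothetical helper, the sample text is made up for illustration, and it deliberately ignores config.txt conditional filters such as [pi3], which could make the same file behave differently on the two boards.

```python
from typing import List


def active_overlays(config_text: str) -> List[str]:
    """Return overlay names from uncommented dtoverlay= lines.

    Assumption: conditional filters like [pi3] or [all] are ignored,
    so every uncommented dtoverlay= line is reported.
    """
    overlays = []
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line.startswith("dtoverlay="):
            continue
        value = line[len("dtoverlay="):]
        name = value.split(",", 1)[0].strip()  # strip overlay parameters
        if name:  # a bare "dtoverlay=" line names no overlay
            overlays.append(name)
    return overlays


# Made-up sample, not taken from the cards discussed in this thread.
sample = """\
# dtoverlay=dwc2
[all]
dtoverlay=dwc2,dr_mode=host
dtparam=audio=on
"""
print(active_overlays(sample))
```

Running this against the text of /boot/config.txt from both cards would show whether dwc2 is enabled the same way on each.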

Thanks,
Hamish
Last edited by hamishtmb on Mon Nov 23, 2020 8:58 pm, edited 2 times in total.

hamishtmb

Re: dtoverlay=dwc2 causing errors and kernel panic on Raspberry Pi 1 Model B (256 MB RAM)

Mon Nov 23, 2020 8:39 pm

I've decided to write the content of the screenshots up, as it makes it much easier to see what's going on. I have replaced the uptime with ** in each line.

Number 1:

Code: Select all


[  **] 8<--- cut here ---
[  **] Unable to handle kernel paging request at virtual address 31a60a02
[  **] [31a60a02] *pgd=00000000
[  **] Internal error: Oops: 5 [#1] ARM

Entering kdb (current=0xc49cd50, pid 1061) Oops: (null)
due to oops @ 0xc064e898
CPU: 0 PID: 1061 Comm: zerofree Tainted: G		C		5.4.72+ #1356
Hardware name: BCM2835
PC is at skb_release_data+0x84/0x164
LR is at skb_release_all+0x30/0x34
pc : [<c064e898>]	lr : [<c064dd94>]		psr: 20000113
sp : c87c5b48	ip : c87c5b68	fp : c87c5b64
r10 : 00000000	r9: 00000000	r8: ca21ebda
r7 : ca21f3e0	r6 : c7d1bf00	r5 : ca21f3e0 	r4 : 00000000
r3 : 00000054	r2: 00000001 	r1: 00000000 	r0 : 31a609fe
Flags: nzCv	IRQs on	FIQs on	Mode SVC_32	ISA ARM	Segment user
Control: 00c5387d	Table: 0a1b8008	DAC:	00000055
CPU: 0 PID: 1061 Comm: zerofree Tainted: G		C		5.4.72+ #1356
Hardware name: BCM2835
Backtrace:
[<c0015850>] (dump_backtrace) from [<c0015bb8>] (show_stack+0x20/0x24) r7:0000000f r6: c0b84848 r5:c0ab2660 r4:c0b84844
[<c0015b98>] (show_stack) from [<c0795540>] (dump_stack+0x20/0x28)
[<c0795520>] (dump_stack) from [<c00118c4>] (show_regs+0x1c/0x20)
[<c00118a8>] (show_regs) from [<c00c5878>] (kdb_main_loop+0x38c/0x89c)
[<c00c54ec>] (kdb_main_loop) from [<c00c86f0>] (kdb_stub+0x1ec/0x3f8)
more>


hamishtmb

Re: dtoverlay=dwc2 causing errors and kernel panic on Raspberry Pi 1 Model B (256 MB RAM)

Mon Nov 23, 2020 8:54 pm

#2:

Code: Select all


[  **] 8<--- cut here ---
[  **] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[  **] pgd = 1f1d19ee
[  **] [00000000] *pgd=00000000
[  **] Internal error: Oops: 17 [#1] ARM

Entering kdb (current=0xc392c740, pid 30935) Oops: (null) due to oops @ 0xc0198b70
CPU: 0 PID: 30935 Comm: x11vnc Tainted: G			C		5.4.72+ #1356
Hardware name: BCM2835
PC is at unlink_anon_vmas+0x74/0x208
LR is at free_pgtables+0x3c/0xc8
pc : [<c0198b70>]	lr : [<c018aaa0>]		psr: 80000013
sp : c0a5de68	ip : c0a5dea0	fp : c0a5de9c
r10 : 00000122	r9: c3969f3c		r8: 00000000
r7 : 00000000 	r6 : fffffff8 	r5 : 00000000 	r4 : c87000c0
r3 : ffffffff 	r2 : 0000007d 	r1: 00201000 	r0: c396f00
Flags: Nzcv	IRQs on	FIQs on	Mode SVC_32	ISA ARM	Segment user
Control: 00c5387d	Table: 045e0008	DAC: 00000055
CPU: 0 PID: 30935 Comm: x11vnc Tainted: G			C		5.4.72+ #1356
Hardware name: BCM2835
Backtrace:
[<c0015850>] (dump_backtrace) from [<c0015bb8>] (show_stack+0x20/0x24) r7:0000000f r6:c0b84848 r5:c0ab2660 r4:c0b84844
[<c0015b98>] (show_stack) from [<c0795540>] (dump_stack+0x20/0x28)
[<c0795520>] (dump_stack) from [<c00118c4>] (show_regs+0x1c/0x20)
[<c00118a8>] (show_regs) from [<c00c5878>] (kdb_main_loop+0x38c/0x89c)
[<c00c54ec>] (kdb_main_loop) from [<c00c86f0>] (kdb_stub+0x1ec/0x3f8)
more>


jdb
Raspberry Pi Engineer & Forum Moderator
Posts: 2551
Joined: Thu Jul 11, 2013 2:37 pm

Re: dtoverlay=dwc2 causing errors and kernel OOPS on Raspberry Pi 1 Model B (256 MB RAM)

Tue Nov 24, 2020 10:57 am

Your stack traces are incomplete. I would recommend removing any "kdb" or "kgdboc" parameters from cmdline.txt as the kernel is waiting for console input to print the rest of the trace (which is going to be hard because USB has gone away).

That said, it does look a lot like memory corruption (use-after-free) is causing the crash because in both of the partial traces, the PC is in two unrelated kernel functions.

Another thing to try is adding "slub_debug=FPUZ" to cmdline.txt. It should complain very loudly if kernel memory is being trampled.
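
As a rough illustration of the cmdline.txt change described above, here is a minimal sketch. `update_cmdline` is a hypothetical helper and the sample command line is invented; the sketch assumes that every parameter whose name starts with "kgdb" or "kdb" should be dropped, and that cmdline.txt is a single line of space-separated parameters.

```python
def update_cmdline(cmdline: str) -> str:
    """Drop kgdb/kdb parameters and append slub_debug=FPUZ if it is missing."""
    tokens = cmdline.split()
    # Compare on the parameter name only (the text before any "=").
    kept = [
        t for t in tokens
        if not t.split("=", 1)[0].startswith(("kgdb", "kdb"))
    ]
    # An existing slub_debug setting is left untouched.
    if not any(t.split("=", 1)[0] == "slub_debug" for t in kept):
        kept.append("slub_debug=FPUZ")
    # cmdline.txt must stay on one line, so join with single spaces.
    return " ".join(kept)


# Invented example line, not the poster's actual cmdline.txt.
before = "console=tty1 kgdboc=ttyAMA0,115200 root=/dev/mmcblk0p2 rootwait"
print(update_cmdline(before))
```

The function only transforms a string; writing the result back to /boot/cmdline.txt (after keeping a backup copy) is left out on purpose.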
Rockets are loud.
https://astro-pi.org

hamishtmb

Re: dtoverlay=dwc2 causing errors and kernel OOPS on Raspberry Pi 1 Model B (256 MB RAM)

Tue Nov 24, 2020 3:17 pm

Thanks for the help.

I have removed the kgdboc option and added the slub_debug option as you suggested.

I wonder if this might be an issue only with the original RPI - it's working fine on my RPI 3 B+ and several Pi Zeros. Regardless, worth trying to look into it.

Is there any way for me to dump the OOPS information to a file, so I don't have to screenshot it and type it up manually? It's a bit time consuming, but I'm also worried I'll mess up some of the hexadecimal code and accidentally obscure something or lead to false positives.
I'll get it running zerofree and reply again when it crashes.

Hamish

jdb
Raspberry Pi Engineer & Forum Moderator

Re: dtoverlay=dwc2 causing errors and kernel OOPS on Raspberry Pi 1 Model B (256 MB RAM)

Wed Nov 25, 2020 12:04 pm

I would suggest taking a photo of the screen and uploading to an image hosting site.

thagrol
Posts: 4237
Joined: Fri Jan 13, 2012 4:41 pm
Location: Darkest Somerset, UK

Re: dtoverlay=dwc2 causing errors and kernel OOPS on Raspberry Pi 1 Model B (256 MB RAM)

Wed Nov 25, 2020 1:01 pm

Far as I know there is no advantage to using the dwc2 driver over the default dwc one when any model Pi is operating as a USB host.

In fact there was a time when it was not recommended: performance as a host was worse with dwc2. IIRC it was an issue related to interrupts.

When not using the Pi as a USB device/gadget (which you can't on any B model Pi*) you should be using the default dwc driver.

*: 4B excepted. It has two USB controllers.
Arguing with strangers on the internet since 1993.

All advice given is based on my experience. it worked for me, it may not work for you.
All GPIO pin numbers are BCM numbers.

jdb
Raspberry Pi Engineer & Forum Moderator

Re: dtoverlay=dwc2 causing errors and kernel OOPS on Raspberry Pi 1 Model B (256 MB RAM)

Wed Nov 25, 2020 1:09 pm

thagrol wrote:
Wed Nov 25, 2020 1:01 pm
Far as I know there is no advantage to using the dwc2 driver over the default dwc one when any model Pi is operating as a USB host.

In fact there was a time when it was not recommended: performance as a host was worse with dwc2. IIRC it was an issue related to interrupts.

When not using the Pi as a USB device/gadget (which you can't on any B model Pi*) you should be using the default dwc driver.

*: 4B excepted. It has two USB controllers.

If there's a bug in dwc2, then it's worth finding. There is no FIQ support on aarch64 which means dwc_otg is necessarily deprecated on that architecture.

hamishtmb

Re: dtoverlay=dwc2 causing errors and kernel OOPS on Raspberry Pi 1 Model B (256 MB RAM)

Wed Nov 25, 2020 1:25 pm

Okay, it crashed again, and I've done as you said and posted at: https://postimg.cc/3kzZymCF

Unfortunately I think this time the top of the trace might be missing because the terminal scrolled. Hopefully nothing important is missing.

It seems to be slightly different each time, so if you need more crashes I'm happy to provide them.

thagrol

Re: dtoverlay=dwc2 causing errors and kernel OOPS on Raspberry Pi 1 Model B (256 MB RAM)

Wed Nov 25, 2020 1:26 pm

jdb wrote:
Wed Nov 25, 2020 1:09 pm
thagrol wrote:
Wed Nov 25, 2020 1:01 pm
Far as I know there is no advantage to using the dwc2 driver over the default dwc one when any model Pi is operating as a USB host.

In fact there was a time when it was not recommended: performance as a host was worse with dwc2. IIRC it was an issue related to interrupts.

When not using the Pi as a USB device/gadget (which you can't on any B model Pi*) you should be using the default dwc driver.

*: 4B excepted. It has two USB controllers.

If there's a bug in dwc2, then it's worth finding. There is no FIQ support on aarch64 which means dwc_otg is necessarily deprecated on that architecture.

Fair point and one I wasn't aware of. But AIUI (please correct me if I'm wrong) the 1B (and Zero) do not support aarch64.

hamishtmb

Re: dtoverlay=dwc2 causing errors and kernel OOPS on Raspberry Pi 1 Model B (256 MB RAM)

Wed Nov 25, 2020 1:29 pm

Yeah, I'm just using it cos I had to use the aarch64 kernel for a particular use case on a Pi 3B+, and have since heard that the dwc2 driver is generally more reliable, if a bit slower. This seems to be reflected in a network of Pi Zeros I help administer and maintain, which had weird intermittent issues a few years back on the default driver (we never found the cause for sure, but I suspect it was the driver).

In the use cases I have, I prefer it being generally more reliable over speed.

Either way, I just want to help find bugs (and potentially help debug too, but I've never debugged kernel issues before so I may need handholding).

jdb
Raspberry Pi Engineer & Forum Moderator

Re: dtoverlay=dwc2 causing errors and kernel OOPS on Raspberry Pi 1 Model B (256 MB RAM)

Wed Nov 25, 2020 2:17 pm

There are 2 crashes in skb_release_data which is a network-related function.

Does the crash still occur (or change behaviour) if you run the offending command with the network cable disconnected?
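
To make the "which function did it crash in" comparison repeatable across many traces, the PC lines can be tallied automatically. This is just a sketch under stated assumptions: `crash_sites` is a hypothetical helper, and the regular expression assumes the "PC is at symbol+0xoffset/0xsize" layout seen in the traces posted earlier in this thread.

```python
import re
from collections import Counter
from typing import Iterable

# Matches e.g. "PC is at skb_release_data+0x84/0x164"
PC_LINE = re.compile(r"PC is at ([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def crash_sites(log_lines: Iterable[str]) -> Counter:
    """Count how often each kernel function appears as the faulting PC."""
    sites = Counter()
    for line in log_lines:
        match = PC_LINE.search(line)
        if match:
            sites[match.group(1)] += 1
    return sites


# Lines copied from the two transcribed traces above.
lines = [
    "PC is at skb_release_data+0x84/0x164",
    "LR is at skb_release_all+0x30/0x34",
    "PC is at unlink_anon_vmas+0x74/0x208",
]
print(crash_sites(lines).most_common())
```

A function that dominates the tally points at one subsystem, while a spread over unrelated functions is more consistent with general memory corruption.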

hamishtmb

Re: dtoverlay=dwc2 causing errors and kernel OOPS on Raspberry Pi 1 Model B (256 MB RAM)

Fri Nov 27, 2020 1:24 pm

Okay, so the behaviour is different without the ethernet plugged in, and it didn't crash outright. Instead I got multiple kernel OOPSes.

Unfortunately, as soon as I plugged a USB stick in to save the output from dmesg it locked up. Should have just saved to SD card, don't know what I was thinking.
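
One way to avoid needing a USB stick at all is to stream the kernel log to a file on the SD card while the test runs, syncing after every line. The sketch below is an assumption-laden illustration, not a guaranteed capture method: `persist_lines` and `follow_kernel_log` are hypothetical helpers, it relies on `dmesg --follow` being available (and may need root), and a hard lockup can still prevent the last lines from reaching the card.

```python
import os
import subprocess
from typing import IO, Iterable


def persist_lines(lines: Iterable[str], out: IO[str]) -> int:
    """Write each line to `out` and force it to storage before the next one."""
    count = 0
    for line in lines:
        out.write(line if line.endswith("\n") else line + "\n")
        out.flush()             # push Python's buffer to the OS
        os.fsync(out.fileno())  # ask the OS to write it to the device
        count += 1
    return count


def follow_kernel_log(path: str) -> None:
    """Append the live kernel log to `path` until interrupted (Ctrl-C)."""
    with open(path, "a") as out, subprocess.Popen(
        ["dmesg", "--follow"], stdout=subprocess.PIPE, text=True
    ) as proc:
        persist_lines(proc.stdout, out)
```

For example, calling `follow_kernel_log("/home/pi/kernel.log")` (a hypothetical path on the SD card) in a separate terminal before starting zerofree would leave whatever was logged before the lockup in that file.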

I do have a picture that shows part of the data, which I'll upload in a minute, but I'm going to try and make it happen again as well (with and without ethernet) and collect some better data.


hamishtmb

Re: dtoverlay=dwc2 causing errors and kernel OOPS on Raspberry Pi 1 Model B (256 MB RAM)

Wed Dec 09, 2020 7:27 pm

This system is more in use at the moment and over Christmas, so I'll probably not be able to mess around with this for a bit.

What is worth noting is that this is the very first hardware version, as reported by "cat /proc/cpuinfo":

Code: Select all

processor	: 0
model name	: ARMv6-compatible processor rev 7 (v6l)
BogoMIPS	: 697.95
Features	: half thumb fastmult vfp edsp java tls 
CPU implementer	: 0x41
CPU architecture: 7
CPU variant	: 0x0
CPU part	: 0xb76
CPU revision	: 7

Hardware	: BCM2835
Revision	: 0002
Serial		: 00000000d73b44ad
Model		: Raspberry Pi Model B Rev 1

So perhaps this could be a hardware bug?
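
When comparing several boards, pulling the revision fields out of /proc/cpuinfo programmatically is less error-prone than reading them by eye. The following is a small sketch: `parse_cpuinfo` is a hypothetical helper, the sample reuses three lines from the output above, and on multi-core boards repeated keys would simply be overwritten by the last occurrence.

```python
from typing import Dict


def parse_cpuinfo(text: str) -> Dict[str, str]:
    """Turn "key : value" lines from /proc/cpuinfo into a dictionary."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip blank lines and anything without a colon
            fields[key.strip()] = value.strip()
    return fields


# Three lines taken from the cpuinfo output quoted in this post.
sample = (
    "Hardware\t: BCM2835\n"
    "Revision\t: 0002\n"
    "Model\t\t: Raspberry Pi Model B Rev 1\n"
)
info = parse_cpuinfo(sample)
print(info["Revision"], "/", info["Model"])
```

On a real board the input would come from reading /proc/cpuinfo instead of the sample string.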

Unfortunately I've no guarantee that it will crash again in a way that makes it possible for me to save the output - last time it was when shutting down, and I don't know how/if I can make it dump debugging information to the SD card. If I can't, it seems likely that the top of the output is unfortunately always going to be cut off. Even if I can't do it now, I'm quite happy to learn and prep everything for when I can do more diagnostics.

Any ideas?

Hamish
