alanbork
Posts: 195
Joined: Thu Apr 23, 2020 11:18 pm

reboot still running: why won't my pi0 reboot? I've waited days!

Sat May 01, 2021 3:20 pm

tldr; if reboot -f doesn't work, what's the next safest option?

I have a headless pi0 set up as a remote NAS in another state, and it stopped responding to SSH a week ago. It's done this before, so I set up a cron script to curl and execute a URL on a server I own once a day (mother of all security holes? ;-)

so I told it to reboot -f

but it doesn't. I "cleverly" found I could use curl and base64 to encode the output of commands as nonsense URLs that I could grep from my weblog so I could probe what's going on (urls are limited to 2048 characters, which is a lot, and you can gzip them to get more!).
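A minimal sketch of the encoding side of that trick, assuming base64, gzip and tr are available on the Pi. The function names and the example.com URL are hypothetical placeholders. The output is mapped to the URL-safe base64 alphabet so nothing needs percent-escaping:

```shell
# Encode stdin as URL-safe base64: line wraps and '=' padding removed,
# '/' and '+' mapped to '_' and '-'.
encode_for_url() {
    base64 | tr -d '\n=' | tr '/+' '_-'
}

# Gzip first to squeeze more command output into the 2048-character limit.
encode_gz_for_url() {
    gzip -c | encode_for_url
}

# Hypothetical use from the nightly script (placeholder URL):
#   curl -s "https://example.com/log/$(dmesg | tail -n 20 | encode_gz_for_url)" >/dev/null
```

On the server side the padding has to be restored and the two characters mapped back before decoding.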

here's what the "last" command says:

reboot system boot 4.14.34+ Wed Dec 31 16:00 still running

if I look at dmesg I don't see anything strange

[51793.523115] sd 0:0:0:0: [sda] No Caching mode page found
[51793.523143] sd 0:0:0:0: [sda] Assuming drive cache: write through
[51793.529555] scsi 0:0:0:1: Enclosure WD SES Device 2019 PQ: 0 ANSI: 4
[51807.589279] scsi 0:0:0:1: Attached scsi generic sg1 type 13
[51807.600170] sd 0:0:0:0: [sda] Attached SCSI disk
[51808.664162] EXT4-fs (sda): mounted filesystem with ordered data mode. Opts: (null)
[52196.647782] Indeed it is in host mode hprt0 = 00001101
[52196.857608] usb 1-1: reset high-speed USB device number 3 using dwc_otg
[52196.857809] Indeed it is in host mode hprt0 = 00001101
[55984.843529] usb 1-1: USB disconnect, device number 3

and uptime:

01:01:02 up 9 days, 2:49, 0 users, load average: 0.04, 0.03, 0.00

I've had the reboot -f command in my once-nightly script for the last week, so that last part is definite proof that it won't reboot that way.

my best guess: before rebooting it wants to sync the USB-connected hard drive, which has been disconnected due to a hardware failure and can't be brought back online? but I'd think that would show up in dmesg. Is there any way I could check whether there are pending writes on the drives it knows about?
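One possible check is the Dirty and Writeback counters in /proc/meminfo. This is only a sketch, assuming a Linux /proc; the function name is made up for illustration, and the counters are system-wide rather than per drive:

```shell
# Print the kernel's counters for dirty (not yet written) and in-flight
# page-cache data. $1 is the meminfo file; defaults to the live /proc/meminfo.
pending_writes() {
    awk '$1 == "Dirty:" || $1 == "Writeback:" { print $1, $2, $3 }' "${1:-/proc/meminfo}"
}

# Usage on the Pi:
#   pending_writes
```

If Dirty stays well above zero for a long time, something is probably holding writes back.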
retired neuroscientist. raspberry pi hacking and monitor input lag methods: https://alantechreview.blogspot.com/

drtechno
Posts: 243
Joined: Fri Apr 09, 2021 6:33 pm

Re: reboot still running: why won't my pi0 reboot? I've waited days!

Sat May 01, 2021 4:11 pm

I use the shutdown command in a crontab entry:

Code: Select all

0 0 * * * /sbin/shutdown -r now 
It would be a similar affair with the reboot command. Try giving the reboot command a time, like now (0), so your command would look like:

Code: Select all

reboot -f 0
When you pass the argument "now" or zero minutes, the reboot sub-routine (via shutdown or reboot command) will execute the command regardless of what is going on.

alanbork
Posts: 195
Joined: Thu Apr 23, 2020 11:18 pm

Re: reboot still running: why won't my pi0 reboot? I've waited days!

Mon May 03, 2021 1:36 pm

drtechno wrote: Code: Select all

reboot -f 0
When you pass the argument "now" or zero minutes, the reboot sub-routine (via shutdown or reboot command) will execute the command regardless of what is going on.
Tried this but the box is still un-rebooted. going to try /sbin/shutdown -r now next just in case but I'm pretty sure they are functional synonyms.

I'm surprised there's nothing in dmesg about the failure to reboot. you'd think being stuck trying to reboot for 7 days would give some kind of error message.

alanbork
Posts: 195
Joined: Thu Apr 23, 2020 11:18 pm

Re: reboot still running: why won't my pi0 reboot? I've waited days!

Tue May 04, 2021 4:13 pm

trying "/sbin/shutdown -r now", no improvement, but actually I think passing the 0 to reboot -f may have caused the machine to fully crash. I'm sure it's a valid command but whatever state the machine was in was too fragile, I guess.

drtechno
Posts: 243
Joined: Fri Apr 09, 2021 6:33 pm

Re: reboot still running: why won't my pi0 reboot? I've waited days!

Tue May 04, 2021 8:13 pm

alanbork wrote:
Tue May 04, 2021 4:13 pm
trying "/sbin/shutdown -r now", no improvement, but actually I think passing the 0 to reboot -f may have caused the machine to fully crash. I'm sure it's a valid command but whatever state the machine was in was too fragile, I guess.
Well, unless your drives are set up so it does not lazy write, it will over time crash the installation. It probably has some form of lazy write enabled.

Usually the command reboot by itself is commonly used in those cases, but if certain programs are running, it will not reboot.

In those cases, with the condition of lazy write, the command

Code: Select all

init 6
will write out to the storage and reboot.

I would recommend checking your storage on a different Linux machine by running fsck to make sure that the file system is healthy.

0 or a number is how many minutes to wait before the shutdown or reboot command executes, so 0 is the same as the "now" option.

If the system doesn't respond to any commands, the kernel system timer has crashed, and you have to physically turn off that machine.

Also, at the end of any ssh script, or of any session you started manually, you should ALWAYS execute

Code: Select all

exit 
to close the session properly. Otherwise, port 22 will hang open and the system will be exposed due to ssh still being active.

Another thing you should always do, because the OS is distributed as an SD card image, is to immediately regenerate the system keys, since all distributed images would have the same key and be able to ssh into each other. I'm sure they pointed this out somewhere, but this is one of the downfalls of distributing a drive image vs setting up a system from a boot disk. To regenerate your ssh keys and restart the service:

Code: Select all

sudo /bin/rm -v /etc/ssh/ssh_host_*
sudo dpkg-reconfigure openssh-server
sudo /etc/init.d/ssh restart
As far as overriding everything for a system reboot, I don't know if the "Magic SysRq key" functions pass through an ssh connection.

drtechno
Posts: 243
Joined: Fri Apr 09, 2021 6:33 pm

Re: reboot still running: why won't my pi0 reboot? I've waited days!

Tue May 04, 2021 10:21 pm

alanbork wrote:
Mon May 03, 2021 1:36 pm


Tried this but the box is still un-rebooted. going to try /sbin/shutdown -r now next just in case but I'm pretty sure they are functional synonyms.

I'm surprised there's nothing in dmesg about the failure to reboot. you'd think being stuck trying to reboot for 7 days would give some kind of error message.
It will ignore the command if it assumes the ssh user is an ordinary user instead of root or one of the sudoers. Try sudo in front of it, especially if you didn't log in as root.

Also, the file you should be looking at to debug this is /var/log/kern.log, which you can retrieve with the same method you've been using, or with other GET-request/ftp calls.

alanbork
Posts: 195
Joined: Thu Apr 23, 2020 11:18 pm

Re: reboot still running: why won't my pi0 reboot? I've waited days!

Tue May 04, 2021 11:04 pm

there's no ssh here - these are all commands executed as root via a cron job.

thagrol
Posts: 4980
Joined: Fri Jan 13, 2012 4:41 pm
Location: Darkest Somerset, UK

Re: reboot still running: why won't my pi0 reboot? I've waited days!

Tue May 04, 2021 11:09 pm

drtechno wrote:
Tue May 04, 2021 8:13 pm
alanbork wrote:
Tue May 04, 2021 4:13 pm
trying "/sbin/shutdown -r now", no improvement, but actually I think passing the 0 to reboot -f may have caused the machine to fully crash. I'm sure it's a valid command but whatever state the machine was in was too fragile, I guess.
Well, unless your drives are set up so it does not lazy write, it will over time crash the installation. It probably has some form of lazy write enabled.
The default in RPiOS is to use the async mount option. That isn't a problem, as reboot/shutdown/poweroff all perform a clean shutdown, including syncing any write caches to disc.
drtechno wrote: Usually the command reboot by itself is commonly used in those cases, but if certain programs are running, it will not reboot.
Any suggestions as to what those programs might be?
drtechno wrote: In those cases, with the condition of lazy write, the command

Code: Select all

init 6
will write out to the storage and reboot.
Will that work on systemd based OS? And does it really do anything that reboot/shutdown/poweroff doesn't?
drtechno wrote: I would recommend checking your storage on a different Linux machine by running fsck to make sure that the file system is healthy.

0 or a number is how many minutes to execute shutdown or reboot command, so 0 is the same as the "now" option.

If the system doesn't respond to any commands, the kernel system timer has crashed, and you have to physically turn off that machine.

Also, you should have at the end of any ssh script, or if you manually started a session, you should ALWAYS execute

Code: Select all

exit 
to close the session properly. Otherwise, port 22 will hang open and the system will be exposed due to ssh still being active.
Untrue.

sshd listens on port 22 whether or not you type exit; that is how it accepts new connections at all. A session that isn't closed cleanly only leaves one established connection behind until it times out. It doesn't expose the system any further.
drtechno wrote: Another thing you should always do, because the OS is distributed as an SD card image, is to immediately regenerate the system keys, since all distributed images would have the same key and be able to ssh into each other. I'm sure they pointed this out somewhere, but this is one of the downfalls of distributing a drive image vs setting up a system from a boot disk. To regenerate your ssh keys and restart the service:

Code: Select all

sudo /bin/rm -v /etc/ssh/ssh_host_*
sudo dpkg-reconfigure openssh-server
sudo /etc/init.d/ssh restart
Again you're assuming and providing instructions for sysV init. RPiOS and most modern Linux use systemd. Under systemd the ssh server is restarted by

Code: Select all

sudo systemctl restart sshd
There is also no need for "/bin/rm". "rm" will work just as well.

While regenerating ssh keys is a good idea, it's not for the reason you think. The /etc/ssh/ssh_host_* files are host keys: they identify the server to clients and don't let one machine log in to another. Public/private keys for ssh login are per user, not per server. If a user has no ssh keys or doesn't have the appropriate client public key, passwordless login is not possible.
drtechno wrote: As far as overriding everything for a system reboot, I don't know if the "Magic SysRq key" functions pass through an ssh connection.
Given those are handled in the kernel's keyboard driver I very much doubt they'll be functional over ssh.
I'm a volunteer. Take me for granted or abuse my support and I will walk away

All advice given is based on my experience. it worked for me, it may not work for you.
Need help? https://github.com/thagrol/Guides

thagrol
Posts: 4980
Joined: Fri Jan 13, 2012 4:41 pm
Location: Darkest Somerset, UK

Re: reboot still running: why won't my pi0 reboot? I've waited days!

Tue May 04, 2021 11:34 pm

@alanbork

I suspect something in your crontab entry or once nightly script is failing. Unfortunately without ssh access it's going to be difficult to work out what and why. You'll probably have to bite the bullet and perform a hard power off (i.e. pull the plug) if you can't go to the zero's location for console access.

Those sorts of failure won't show up in dmesg.

Capture both stdout and stderr from your cron job to somewhere that will survive a reboot (i.e. not to RAM and not to /tmp), let it run over night then check the log file.

A trivial example:

Code: Select all

0 0 * * * date > /home/pi/date.log 2>&1
In the longer term you may want to switch the OS partition to read only by enabling the overlayfs in raspi-config. You'll lose any changes across reboots but have almost zero chance of OS corruption.

While there will be a performance hit, you might also want to use the sync option on your external drives. That'll stop write buffering so will minimise data loss if a hard reboot/power off is needed.

One last question: when you set up your reboot cron job, did you do it in a normal user's crontab (e.g. the pi user) or root's? If you used a normal user, you must prefix the command with sudo and enable passwordless sudo for the shutdown/poweroff/reboot command (whichever you're using) for that user.

Only root can run shutdown/poweroff/reboot.

alanbork
Posts: 195
Joined: Thu Apr 23, 2020 11:18 pm

Re: reboot still running: why won't my pi0 reboot? I've waited days!

Wed May 05, 2021 1:19 am

thagrol wrote: > As far as overriding everything for a system reboot, I don't know if the "Magic SysRq key" functions pass through an ssh connection.
Given those are handled in the kernel's keyboard driver I very much doubt they'll be functional over ssh.
There's a file system equivalent you can use, I think it's under proc. I haven't tried that route yet, so I can't confirm the details.
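For reference, that interface is /proc/sysrq-trigger: root can write a SysRq key letter to it, where s syncs, u remounts filesystems read-only and b reboots immediately. A minimal sketch, assuming the kernel was built with Magic SysRq support; the function name and the SYSRQ_DELAY variable are hypothetical:

```shell
# Sync, remount read-only, then reboot through the kernel's SysRq interface.
# $1 is the trigger file; defaults to the real /proc/sysrq-trigger (root only).
sysrq_reboot() {
    trigger="${1:-/proc/sysrq-trigger}"
    for key in s u b; do
        echo "$key" > "$trigger" || return 1
        # Give sync/remount a moment before the next key (hypothetical knob).
        sleep "${SYSRQ_DELAY:-2}"
    done
}

# Usage as root on the Pi (this reboots the machine without a clean shutdown):
#   sysrq_reboot
```

Because b skips the normal shutdown sequence entirely, the s and u steps are what limit filesystem damage.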
thagrol wrote: While there will be a performance hit, you might also want to use the sync option on your external drives. That'll stop write buffering so will minimise data loss if a hard reboot/power off is needed.
that's an excellent idea. The network pipe is going to be much slower than the hard drive so that should be a mostly invisible change. Is that an option controlled via fstab?
thagrol wrote: One last question: when you set up your reboot cron job, did you do it in a normal user's crontab (e.g. the pi user) or root's? If you used a normal user, you must prefix the command with sudo and enable passwordless sudo for the shutdown/poweroff/reboot command (whichever you're using) for that user.
it's root's crontab. Since I'm no crontab expert I even double checked it, putting whoami in the nightly script wget'ed from my server, and it indeed is running everything as root. Whatever's wrong it's not permissions. Recall that the output of the last command:

Code: Select all

reboot system boot 4.14.34+ Wed Dec 31 16:00 still running
so the system started the reboot process but couldn't finish it.

I figured the shutdown procedure must have killed SSHD, but in fact it's still running according to systemctl:
ssh.service - OpenBSD Secure Shell server
Loaded: loaded (/lib/systemd/system/ssh.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2021-04-20 21:41:14 PDT; 1 weeks 4 days ago
Process: 420 ExecStartPre=/usr/sbin/sshd -t (code=exited, status=0/SUCCESS)
Main PID: 477 (sshd)
CGroup: /system.slice/ssh.service
└─477 /usr/sbin/sshd -D

Apr 20 22:17:28 raspberrypi sshd[924]: Accepted password for alan from 70.95.60.201 port 57208 ssh2
Apr 20 22:17:28 raspberrypi sshd[924]: pam_unix(sshd:session): session opened for user alan by (uid=0)
Apr 21 09:12:29 raspberrypi sshd[4027]: error: Received disconnect from 70.95.60.201 port 58187:13: Unable to authenticate [preauth]
Apr 21 09:12:29 raspberrypi sshd[4027]: Disconnected from 70.95.60.201 port 58187 [preauth]
Apr 21 12:33:20 raspberrypi sshd[4688]: Accepted password for alan from 70.95.60.201 port 60947 ssh2
Apr 21 12:33:20 raspberrypi sshd[4688]: pam_unix(sshd:session): session opened for user alan by (uid=0)
Apr 21 12:34:18 raspberrypi sshd[4728]: Accepted password for alan from 70.95.60.201 port 60952 ssh2
Apr 21 12:34:18 raspberrypi sshd[4728]: pam_unix(sshd:session): session opened for user alan by (uid=0)
Apr 21 12:49:35 raspberrypi sshd[4872]: Accepted password for alan from 70.95.60.201 port 61058 ssh2
Apr 21 12:49:35 raspberrypi sshd[4872]: pam_unix(sshd:session): session opened for user alan by (uid=0)
Those logins are from before it stopped responding to external ssh connections.

for good measure I also tried sending it "systemctl restart ssh" but ssh still doesn't respond.

if sshd is running what else could keep it from responding to ssh logins?

rpdom
Posts: 18714
Joined: Sun May 06, 2012 5:17 am
Location: Chelmsford, Essex, UK

Re: reboot still running: why won't my pi0 reboot? I've waited days!

Wed May 05, 2021 5:13 am

alanbork wrote:
Wed May 05, 2021 1:19 am
if sshd is running what else could keep it from responding to ssh logins?
There's a few things. Firewall for one. Are there any iptables (or nftables) rules set on either system?

Have you tried connecting with ssh -vvv alan@raspberrypi ? That should give loads of output which may help diagnose the connection problem.
Unreadable squiggle

alanbork
Posts: 195
Joined: Thu Apr 23, 2020 11:18 pm

Re: reboot still running: why won't my pi0 reboot? I've waited days!

Wed May 05, 2021 5:36 am

rpdom wrote: There's a few things. Firewall for one. Are there any iptables (or nftables) rules set on either system?
no. But the machine is behind an ISP-provided router (ahem, firewall) with a port-rewrite rule that redirects 44123 to 22 on my pi, and maybe that's relevant, see below.
rpdom wrote: Have you tried connecting with ssh -vvv alan@raspberrypi ? That should give loads of output which may help diagnose the connection problem.
here's what you get if you try to connect to the port that was being routed to the pi a week ago:
OpenSSH_7.9p1 Raspbian-10+deb10u2, OpenSSL 1.1.1d 10 Sep 2019
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 19: Applying options for *
debug2: resolving x port 44123
debug2: ssh_connect_direct
debug1: Connecting to x port 44123.
debug1: connect to address x port 44123: No route to host
ssh: connect to host x port 44123: No route to host
and here's what happens when you use another port that's not routed anywhere and thus eaten by the router's firewall.
OpenSSH_7.9p1 Raspbian-10+deb10u2, OpenSSL 1.1.1d 10 Sep 2019
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 19: Applying options for *
debug2: resolving "x" port 22
debug2: ssh_connect_direct
debug1: Connecting to x port 22.
debug1: connect to address x port 22: Connection timed out
ssh: connect to host x port 22: Connection timed out
could it be that the ISP is auto-blocking port 44123, and if so would that give the no route to host response? if they started blocking it, it was while I was using it; I was connected and backing up some data over sftp and was disconnected mid-file, and have never been able to connect since, although the machine ran for at least another week after that.

Ernst
Posts: 1371
Joined: Sat Feb 04, 2017 9:39 am
Location: Germany

Re: reboot still running: why won't my pi0 reboot? I've waited days!

Wed May 05, 2021 5:50 am

alanbork wrote:
Wed May 05, 2021 5:36 am
could it be that the ISP is auto-blocking port 44123, and if so would that give the no route to host response?
Are you sure that the remote IP-address has not changed ?
The road to insanity is paved with static ip addresses

thagrol
Posts: 4980
Joined: Fri Jan 13, 2012 4:41 pm
Location: Darkest Somerset, UK

Re: reboot still running: why won't my pi0 reboot? I've waited days!

Wed May 05, 2021 10:09 am

Ernst wrote:
Wed May 05, 2021 5:50 am
alanbork wrote:
Wed May 05, 2021 5:36 am
could it be that the ISP is auto-blocking port 44123, and if so would that give the no route to host response?
Are you sure that the remote IP-address has not changed ?
Worth checking, but if that were the case I'd expect both logs to show the same error. As it is, it looks to me like the remote router can be reached but something has changed beyond it that is preventing port forwarding from working.

@alanbork
I'm assuming you've configured a port forwarding rule on the remote router.
  1. Does that rule use the IP address or MAC address of the target machine?
  2. Has the remote router been rebooted?
  3. Has the IP address of the zero changed?
  4. Has the zero been swapped out?
  5. If using a USB network adapter with the zero, has that been swapped out?
If that does turn out to be the problem, I suggest you set a static IP address on the zero (outside the DHCP range of the router but in the same subnet) and adjust your port forwarding rule to use that IP address.

That way, if you do later swap out the hardware things should still work.

alanbork
Posts: 195
Joined: Thu Apr 23, 2020 11:18 pm

Re: reboot still running: why won't my pi0 reboot? I've waited days!

Wed May 05, 2021 5:03 pm

Ernst wrote:
Wed May 05, 2021 5:50 am
alanbork wrote:
Wed May 05, 2021 5:36 am
could it be that the ISP is auto-blocking port 44123, and if so would that give the no route to host response?
Are you sure that the remote IP-address has not changed ?
yes.

alanbork
Posts: 195
Joined: Thu Apr 23, 2020 11:18 pm

Re: reboot still running: why won't my pi0 reboot? I've waited days!

Wed May 05, 2021 5:20 pm

thagrol wrote:
Wed May 05, 2021 10:09 am
I'm assuming you've configured a port forwarding rule on the remote router.
  1. Does that rule use the IP address or MAC address of the target machine?
  2. Has the remote router been rebooted?
  3. Has the IP address of the zero changed?
  4. Has the zero been swapped out?
  5. If using a USB network adapter with the zero, has that been swapped out?
If that does turn out to be the problem, I suggest you set a static IP address on the zero (outside the DHCP range of the router but in the same subnet) and adjust your port forwarding rule to use that IP address.

That way, if you do later swap out the hardware things should still work.
yes the port forwarding is on the router. it's an ATT uverse unit so it's kind of feature limited, but it's been working ok for at least months at a time. The pi0w gets its ip address from dhcp based on its MAC, and the port forwarding is ip based. I don't think there's a static option. no hardware changes have happened to pi or router, and pi is using built in wifi. when this has happened in the past, a hard power cycle to the pi fixes it. I'll have to wait a day or two before the house owner can provide that service to me, but it's no longer wgetting the nightly cmd script so I think it's hard crashed now.

thagrol
Posts: 4980
Joined: Fri Jan 13, 2012 4:41 pm
Location: Darkest Somerset, UK

Re: reboot still running: why won't my pi0 reboot? I've waited days!

Wed May 05, 2021 7:10 pm

alanbork wrote:
Wed May 05, 2021 5:20 pm
yes the port forwarding is on the router. it's an ATT uverse unit so it's kind of feature limited
I'm not familiar with that one. Try looking for the DHCP server settings, then something like address reservation.

Or set it locally on the zero in /etc/dhcpcd.conf. There should be a worked example in there. Or see https://www.raspberrypi.org/documentati ... /README.md

alanbork
Posts: 195
Joined: Thu Apr 23, 2020 11:18 pm

Re: reboot still running: why won't my pi0 reboot? I've waited days!

Thu May 06, 2021 2:14 pm

thagrol wrote:
Wed May 05, 2021 7:10 pm

I'm not familiar with that one. Try looking for the DHCP server settings, then something like address reservation.
that's not really the issue here - the pi and router have worked fine for many months and across many reboots, each reboot getting a different IP inside the firewall. that aspect of the ATT uverse router seems to work right, at least. stability hasn't been great lately, though, with the pi dying randomly after only 1 to 10 days of being powered on.

I suspect that the os image on the SD card has gotten corrupted, as I had it hard power cycled last night but it hasn't come back up and pinged out. I'll have to go collect it and see what's up. which log files are most likely to reveal what happened in its last moments of life?

thagrol
Posts: 4980
Joined: Fri Jan 13, 2012 4:41 pm
Location: Darkest Somerset, UK

Re: reboot still running: why won't my pi0 reboot? I've waited days!

Thu May 06, 2021 4:07 pm

alanbork wrote:
Thu May 06, 2021 2:14 pm
I suspect that the os image on the SD card has gotten corrupted, as I had it hard power cycled last night but it hasn't come back up and pinged out. I'll have to go collect it and see what's up. which log files are most likely to reveal what happened in its last moments of life?
OS corruption is certainly a possibility, as is SD card failure. Another is that the SD card is full; that'll cause all sorts of problems, as it won't be possible for various temporary files to be written.

It's easy to accidentally fill up the SD card, especially if the HDD mount fails and you continue to write to its mount point.
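One way to guard against that is to refuse to write unless the drive is really mounted. A minimal sketch, assuming a Linux /proc/mounts; the function name and the /mnt/nas path are hypothetical placeholders, and mount points containing spaces are not handled:

```shell
# Succeed only if $1 is listed as a mount point in the mounts table $2
# (defaults to the live /proc/mounts).
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Hypothetical use at the top of a backup script (placeholder path):
#   is_mounted /mnt/nas || { echo "HDD not mounted, refusing to write" >&2; exit 1; }
```

The mountpoint -q command from util-linux does a similar job if it's installed.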

As for which log file(s), that's hard to say for certain. Start with /var/log/syslog

alanbork
Posts: 195
Joined: Thu Apr 23, 2020 11:18 pm

Re: reboot still running: why won't my pi0 reboot? I've waited days!

Sun May 09, 2021 3:40 am

so the reason why it disappeared from the network most recently was that the router it was connected to had its passwords changed. And nobody told me, of course.

that doesn't explain why it stopped accepting ssh requests last month or refused to reboot more recently, though. but what it does mean is I have access to the machine again if you have any ideas of what I can look for in the logs that would reveal why ssh wouldn't accept incoming connections. I looked at /var/log/syslog, nothing relevant there other than the recent inability to bring wlan0 up because of the bad password.

I've switched the external disk drive to be mounted with sync, at your suggestion.
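A sketch of how one might confirm the option took effect, assuming a Linux /proc/mounts. The function name and the /mnt/nas mount point are hypothetical placeholders, and the fstab line in the comment is illustrative only, not copied from the real file:

```shell
# Illustrative fstab entry (placeholder mount point):
#   /dev/sda  /mnt/nas  ext4  defaults,sync,nofail  0  2

# Succeed only if mount point $1 carries option $2 in the mounts table $3
# (defaults to the live /proc/mounts).
has_mount_option() {
    awk -v mp="$1" -v opt="$2" '
        $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == opt) found = 1
        }
        END { exit !found }
    ' "${3:-/proc/mounts}"
}

# Usage on the Pi (placeholder path):
#   has_mount_option /mnt/nas sync && echo "sync is active"
```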

epoch1970
Posts: 6505
Joined: Thu May 05, 2016 9:33 am
Location: Paris, France

Re: reboot still running: why won't my pi0 reboot? I've waited days!

Sun May 09, 2021 8:37 am

alanbork wrote:
Sun May 09, 2021 3:40 am
that doesn't explain why it stopped accepting ssh requests last month or refused to reboot more recently, though.
Unlikely causes:
- kernel crash (reboot failing)
- SD bit rot or corruption affecting sshd_config (ssh not working)
- Successful attack launched from the wireless network. Maybe there was a reason for the password change.

Likely cause: somehow the AP stopped providing satisfactory service on site. Your Pi, among other machines, was stranded off connection; and while still operating ok it appeared dead from the exterior. So the AP was changed, and the password with it. Situation was restored, except for your Pi.
"S'il n'y a pas de solution, c'est qu'il n'y a pas de problème." Les Shadoks, J. Rouxel

alanbork
Posts: 195
Joined: Thu Apr 23, 2020 11:18 pm

Re: reboot still running: why won't my pi0 reboot? I've waited days!

Sun May 09, 2021 5:49 pm

bit rot it is:
[21567.005204] EXT4-fs error (device mmcblk0p2): ext4_ext_check_inode:510: inode #37764: comm dpkg: pblk 0 bad header/extent: invalid eh_entries - magic f30a, entries 2049, max 4(4), depth 0(0)
[21567.473575] EXT4-fs error (device mmcblk0p2): ext4_ext_check_inode:510: inode #37764: comm dpkg: pblk 0 bad header/extent: invalid eh_entries - magic f30a, entries 2049, max 4(4), depth 0(0)
[22058.862609] EXT4-fs error (device mmcblk0p2): ext4_ext_check_inode:510: inode #37764: comm dpkg: pblk 0 bad header/extent: invalid eh_entries - magic f30a, entries 2049, max 4(4), depth 0(0)
[22059.274447] EXT4-fs error (device mmcblk0p2): ext4_ext_check_inode:510: inode #37764: comm dpkg: pblk 0 bad header/extent: invalid eh_entries - magic f30a, entries 2049, max 4(4), depth 0(0)
or I suppose just bad data written to the sd card, no way to check without physical access to the pi, right?

since it's going to have to "recover" as best it can unattended, I've made these changes to /boot/cmdline.txt:

dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 root=PARTUUID=e57bd46b-02 rootfstype=ext4 elevator=deadline fsck.repair=yes rootwait fsck.mode=force fsck.repair=yes

look "good"? it seems like eventually I'm going to have to mail out a new SD card, which sounds like a royal pain to configure properly without the actual hardware in front of me.

alanbork
Posts: 195
Joined: Thu Apr 23, 2020 11:18 pm

Re: reboot still running: why won't my pi0 reboot? I've waited days!

Thu May 27, 2021 6:18 pm

alanbork wrote: since it's going to have to "recover" as best it can unattended, I've made these changes to /boot/cmdline.txt:

dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 root=PARTUUID=e57bd46b-02 rootfstype=ext4 elevator=deadline fsck.repair=yes rootwait fsck.mode=force fsck.repair=yes
Sadly, I just had a hard-power cycle and the file system is still damaged even with these command line options intended to force as much repair as could be generated automatically.

Not sure how many files are damaged since it does boot, but for instance,
# wc
bash: /usr/bin/wc: cannot execute binary file: Exec format error
logs don't really tell much about what if anything fsck did:
May 26 23:25:40 raspberrypi systemd[1]: Found device /dev/disk/by-partuuid/e57bd46b-01.
May 26 23:25:40 raspberrypi systemd[1]: Starting File System Check on /dev/disk/by-partuuid/e57bd46b-01...
May 26 23:25:40 raspberrypi mtp-probe: checking bus 1, device 2: "/sys/devices/platform/soc/20980000.usb/usb1/1-1"
May 26 23:25:40 raspberrypi mtp-probe: bus: 1, device: 2 was not an MTP device
May 26 23:25:40 raspberrypi systemd-fsck[169]: fsck.fat 4.1 (2017-01-24)
May 26 23:25:40 raspberrypi systemd-fsck[169]: 0x41: Dirty bit is set. Fs was not properly unmounted and some data may be corrupt.
May 26 23:25:40 raspberrypi systemd-fsck[169]: Automatically removing dirty bit.
May 26 23:25:40 raspberrypi systemd-fsck[169]: Performing changes.
May 26 23:25:40 raspberrypi systemd-fsck[169]: /dev/mmcblk0p1: 164 files, 44276/86872 clusters
May 26 23:25:40 raspberrypi systemd[1]: Started File System Check on /dev/disk/by-partuuid/e57bd46b-01.
May 26 23:25:40 raspberrypi systemd[1]: Mounting /boot...
May 26 23:25:40 raspberrypi systemd[1]: Mounted /boot.
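Note that the excerpt above only covers e57bd46b-01, the FAT /boot partition (fsck.fat on /dev/mmcblk0p1), not the ext4 root. One way to see whether the root filesystem was checked, sketched on the assumption that tune2fs from e2fsprogs is installed; the function name is made up for illustration:

```shell
# Reduce `tune2fs -l` output (read from stdin) to the fields that show
# whether and when the ext4 filesystem was last checked.
fsck_summary() {
    grep -E '^(Filesystem state|Mount count|Last checked):'
}

# Usage on the Pi (root partition name taken from the EXT4-fs errors in dmesg):
#   sudo tune2fs -l /dev/mmcblk0p2 | fsck_summary
#   journalctl -b -u systemd-fsck-root.service
```

If "Last checked" doesn't move after a reboot, the forced fsck most likely never ran on the root partition.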
