ziesemer
Posts: 12
Joined: Wed May 13, 2015 2:20 am
Location: Appleton, WI

Raspbian system-wide hang after using “reset” and logging-of

Sat May 13, 2017 4:10 pm

I'm looking for an explanation of, and/or a solution to, an issue that occurs when connecting to the Pi over its serial console: after using the "reset" command at any point during a session, a subsequent log-off causes the entire system (or at least systemd) to hang.

I can reproduce this on multiple units, including a Pi 2 and a Pi 3. I'm using a clean install of "2016-05-27-raspbian-jessie-lite.img" (just re-tested with "2017-04-10-raspbian-jessie-lite.img"), and can reproduce it without any updates as well as with all updates applied.

I suspected, and still suspect, a systemd-related issue here. I've tried repeating the same steps with both Debian Jessie and CentOS 7 within VirtualBox and have not been able to reproduce the hang, so this appears to be Raspbian-specific.

Can anyone else at least reproduce the following?
  1. Use of a Raspberry Pi 2 or 3 Model B, using the May 2016 or later version of Raspbian Jessie Lite.
  2. The serial console enabled and connected, as per http://elinux.org/RPi_Serial_Connection. Note that the Pi 3 requires adding enable_uart=1 at the end of /boot/config.txt.
  3. Using a separate SSH or tty0 (keyboard/monitor) console, validate that any or all of the following complete successfully and immediately:

    Code: Select all

    systemctl status --no-pager
    systemctl --no-pager
    systemctl status serial-getty@ttyS0.service

    Also observe, using ps -Afl | grep agetty, that the following is running on ttyS0:

    Code: Select all

    /sbin/agetty --keep-baud 115200 38400 9600 ttyS0 vt102
  4. Using the serial console (ttyS0):
    1. Log-in and out ("exit") repeatedly, observing no changes in the above.
    2. Log-in, then issue the "reset" command. Observe still no changes in system status / operation.
    3. "exit". Observe that the command hangs indefinitely and never returns.
  5. Using a separate SSH or tty0 console (still open from above), attempt to repeat any of the above systemctl status commands. Observe that all queries to systemctl now fail after timing out:

    Code: Select all

    pi@raspberrypi:~ $ systemctl status --no-pager
    Failed to read server status: Connection timed out
    pi@raspberrypi:~ $ systemctl --no-pager
    Failed to list units: Connection timed out
    pi@raspberrypi:~ $ systemctl status serial-getty@ttyS0.service
    Failed to get properties: Connection timed out

    Also observe that any new login attempts time out as well.
Further details at https://raspberrypi.stackexchange.com/q ... -serial-co . Even after applying all the latest updates (including the 4.9.24-v7+ #993 kernel), the problem still remains.
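
For anyone trying to reproduce this, the check in step 5 can be wrapped so that the checking shell does not sit waiting on a hung systemd. This is only a rough sketch: probe is a made-up helper name, and it assumes the timeout command from GNU coreutils is installed. It runs a command with a time limit and prints whether it finished, failed, or was cut off.

```shell
# Hypothetical helper (not part of Raspbian or systemd): run a command with a
# time limit and print "ok", "timeout", or "failed (<exit code>)".
# Assumes `timeout` from GNU coreutils, which exits with 124 on expiry.
probe() {
    limit="$1"
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    rc=$?
    case "$rc" in
        0)   echo "ok" ;;
        124) echo "timeout" ;;
        *)   echo "failed ($rc)" ;;
    esac
}

# Example use from the separate SSH or tty0 session:
#   probe 10 systemctl status --no-pager
#   probe 10 systemctl --no-pager
```

On a healthy system both calls should print "ok" almost immediately. After the hang, I would expect either "timeout" or "failed (...)", depending on whether the limit given here is shorter than systemctl's own timeout.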

Why is this happening, and how can this be fixed? Sure, I could just refrain from running reset from a serial console - but I don't see any reason why this should be causing a problem, and it is rather disastrous to the system when executed (as demonstrated above). I'm not exactly sure what the next debugging steps would be for this, but will certainly provide any additional outputs or test results that may be requested.

ziesemer

Re: Raspbian system-wide hang after using “reset” and logging-of

Sun Aug 20, 2017 2:01 am

This is no longer reproducible with Raspbian Stretch, released 2017-08-16.

As this provides the first bump in the systemd version since I reported this (was previously held at version 215, now 232) - it seems pretty clear that this was an issue in systemd that is now resolved.

Current version outputs on a working system:

Code: Select all

pi@raspberrypi:~$ uname -a
Linux raspberrypi 4.9.41-v7+ #1023 SMP Tue Aug 8 16:00:15 BST 2017 armv7l GNU/Linux

pi@raspberrypi:~$ systemd --version
systemd 232
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN

pi@raspberrypi:~$ cat /etc/os-release
PRETTY_NAME="Raspbian GNU/Linux 9 (stretch)"
NAME="Raspbian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
ID=raspbian
ID_LIKE=debian
HOME_URL="http://www.raspbian.org/"
SUPPORT_URL="http://www.raspbian.org/RaspbianForums"
BUG_REPORT_URL="http://www.raspbian.org/RaspbianBugs"

jojopi
Posts: 3060
Joined: Tue Oct 11, 2011 8:38 pm

Re: Raspbian system-wide hang after using “reset” and logging-of

Sun Aug 20, 2017 4:19 am

Very interesting.

The specific thing that "reset" does that triggers the problem is "stty -clocal". You can wake systemd back up after the hang by running from another terminal "sudo stty -F /dev/ttyAMA0 clocal" (or your device name).
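
To see which state the port is in before and after "reset", the output of stty -a can be checked for the clocal flag. The following is just a sketch: clocal_state is a name I made up, and it only parses the text that stty prints, where a leading "-" means the flag is off.

```shell
# Hypothetical helper: read `stty -a` output on stdin and report the CLOCAL
# setting. "clocal" means modem status lines are ignored; "-clocal" means a
# blocking open() may wait for carrier detect.
clocal_state() {
    # Put each setting on its own line so the flag can be matched exactly.
    tokens=$(tr -s ' ;\t' '\n\n\n')
    if printf '%s\n' "$tokens" | grep -qx -- '-clocal'; then
        echo "carrier-sensitive"
    elif printf '%s\n' "$tokens" | grep -qx 'clocal'; then
        echo "local"
    else
        echo "unknown"
    fi
}

# Example use from another terminal (substitute your device name):
#   sudo stty -F /dev/ttyAMA0 -a | clocal_state
```

If this prints "carrier-sensitive" after running "reset" on the serial console, the port is in the state that leads to the hang at logout, and the stty ... clocal command above should put it back to "local".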

strace shows that after logout, systemd pid1 normally runs:

Code: Select all

open("/dev/ttyAMA0", O_RDWR|O_NOCTTY|O_LARGEFILE|O_CLOEXEC) = 15
ioctl(15, TCGETS, {B115200 opost isig icanon echo ...}) = 0
ioctl(15, TIOCVHANGUP, 0)               = 0
close(15)                               = 0

But when the terminal does not have clocal set, the open never completes and systemd hangs. That agrees with the terminal documentation (info libc):

Code: Select all

 -- Macro: tcflag_t CLOCAL
     If this bit is set, it indicates that the terminal is connected
     "locally" and that the modem status lines (such as carrier detect)
     should be ignored.

     On many systems if this bit is not set and you call 'open' without
     the 'O_NONBLOCK' flag set, 'open' blocks until a modem connection
     is established.

(Of course, there is no way to assert carrier detect on the Pi's serial port.)

I have not tested stretch, but I presume that systemd now correctly opens terminals in non-blocking mode.

Incidentally, when reboot also hangs, you can still reboot using Alt+SysRq-b on the console or Break-b on serial.
