Hi
I have a zero Buster running RPI Cam Control and doing some GPIO reading with a python script. I noticed it crashes a couple of times a week, so I have activated the watchdog. From comparing logs at the times of the crashes It is now apparent that the crashes occur after the following line in syslog, every time:
Aug 20 09:17:01 pingpong CRON[28309]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
which is followed by;
Aug 20 09:17:55 pingpong systemd-modules-load[68]: Module 'bcm2835_wdt' is builtin
Aug 20 09:17:55 pingpong systemd-modules-load[68]: Module 'bcm2835_wdt' is builtin
Aug 20 09:17:55 pingpong fake-hwclock[76]: Tue 20 Aug 09:17:01 UTC 2019
which is the system booting again. The time (approx 45 seconds) can be partially explained by the 15 seconds watchdog delay, but what happens for the remaining 30 I have no clue.
What does catch my attention is the fake-hwclock line, as in the /etc/cron.hourly/ is the fake-hwclock script. Running sudo fake-hwclock save does not crash the pi.
I hope somebody recognises this and can tell me what the cause of this problem is, thanks in advance,
Efjee
Re: Pi zero crashes after cron.hourly
What else is in cron.hourly?
You can summon run-parts in test mode, it will list the files it wants to execute.
You can summon run-parts in test mode, it will list the files it wants to execute.
"S'il n'y a pas de solution, c'est qu'il n'y a pas de problème." Les Shadoks, J. Rouxel
Re: Pi zero crashes after cron.hourly
Hi
fake-hwclock
is the only thing in there.
Testing it runs fine.
Efjee
fake-hwclock
is the only thing in there.
Testing it runs fine.
Efjee
Re: Pi zero crashes after cron.hourly
It crashed again around midnight.
Aug 21 00:24:00 pingpong rngd[311]: stats: Time spent starving for entropy: (min=0; avg=0.000; max=0)us
Aug 21 00:39:01 pingpong CRON[15922]: (root) CMD ( [ -x /usr/lib/php/sessionclean ] && if [ ! -d /run/systemd/system ]; then /usr/lib/php/sessionclean; fi)
Aug 21 00:39:27 pingpong systemd[1]: Starting Clean php session files...
Aug 21 00:39:29 pingpong systemd[1]: phpsessionclean.service: Succeeded.
Aug 21 00:39:29 pingpong systemd[1]: Started Clean php session files.
Aug 21 00:25:04 pingpong systemd-modules-load[87]: Module 'bcm2835_wdt' is builtin
Aug 21 00:25:04 pingpong systemd-modules-load[87]: Module 'bcm2835_wdt' is builtin
Aug 21 00:25:04 pingpong fake-hwclock[71]: Wed 21 Aug 00:17:01 UTC 2019
Notice how the time jumps back at boot. I have a python script running which keeps an eye on things, which notices it too. (it writes the current date/time to a file 'last_alive' every 25 seconds)
0 2019-08-21 00:25:21.907731 System up
0 2019-08-21 00:25:21.935923 Sucessfully read last_alive: 21 08 2019 00:57
0 2019-08-21 00:25:22.152511 Last alive is in the future, power failure or watchdog?!: 21 08 2019 00:57
0 2019-08-21 00:25:22.176343 Unexpected system up time, system crash or reboot?
34 2019-08-21 00:59:46.301065 time change detected; old time, seconds: 2019-08-21 00:25:21.926313, 2064.373767
150 2019-08-21 02:55:24.167758 planned shutdown
0 2019-08-21 02:56:10.377447 System up
0 2019-08-21 02:56:10.393866 Sucessfully read last_alive: 21 08 2019 02:54
0 2019-08-21 02:56:10.527730 Sucessfully read restart_invoked: 21 08 2019 02:55
0 2019-08-21 02:56:10.530850 Succesfully went through shutdown/startup loop
4 2019-08-21 03:01:08.303060 time change detected; old time, seconds: 2019-08-21 02:56:10.391818, 297.910351
4 2019-08-21 03:01:08.308959 Restart time within expectations, actual restart took (in seconds) 368.308617
Aug 21 00:24:00 pingpong rngd[311]: stats: Time spent starving for entropy: (min=0; avg=0.000; max=0)us
Aug 21 00:39:01 pingpong CRON[15922]: (root) CMD ( [ -x /usr/lib/php/sessionclean ] && if [ ! -d /run/systemd/system ]; then /usr/lib/php/sessionclean; fi)
Aug 21 00:39:27 pingpong systemd[1]: Starting Clean php session files...
Aug 21 00:39:29 pingpong systemd[1]: phpsessionclean.service: Succeeded.
Aug 21 00:39:29 pingpong systemd[1]: Started Clean php session files.
Aug 21 00:25:04 pingpong systemd-modules-load[87]: Module 'bcm2835_wdt' is builtin
Aug 21 00:25:04 pingpong systemd-modules-load[87]: Module 'bcm2835_wdt' is builtin
Aug 21 00:25:04 pingpong fake-hwclock[71]: Wed 21 Aug 00:17:01 UTC 2019
Notice how the time jumps back at boot. I have a python script running which keeps an eye on things, which notices it too. (it writes the current date/time to a file 'last_alive' every 25 seconds)
0 2019-08-21 00:25:21.907731 System up
0 2019-08-21 00:25:21.935923 Sucessfully read last_alive: 21 08 2019 00:57
0 2019-08-21 00:25:22.152511 Last alive is in the future, power failure or watchdog?!: 21 08 2019 00:57
0 2019-08-21 00:25:22.176343 Unexpected system up time, system crash or reboot?
34 2019-08-21 00:59:46.301065 time change detected; old time, seconds: 2019-08-21 00:25:21.926313, 2064.373767
150 2019-08-21 02:55:24.167758 planned shutdown
0 2019-08-21 02:56:10.377447 System up
0 2019-08-21 02:56:10.393866 Sucessfully read last_alive: 21 08 2019 02:54
0 2019-08-21 02:56:10.527730 Sucessfully read restart_invoked: 21 08 2019 02:55
0 2019-08-21 02:56:10.530850 Succesfully went through shutdown/startup loop
4 2019-08-21 03:01:08.303060 time change detected; old time, seconds: 2019-08-21 02:56:10.391818, 297.910351
4 2019-08-21 03:01:08.308959 Restart time within expectations, actual restart took (in seconds) 368.308617
Re: Pi zero crashes after cron.hourly
Then something else is causing it to crash at that time.
The "time going backwards" thing is a normal consequence of the Pi stopping abnormally. It doesn't have a battery powered Real Time Clock like a PC does, so when it turns off it loses the current time and date. It will get the current time/date from the internet time servers during the boot, but initially it works to the last time it "remembers" - which is stored at 17 minutes past each hour from the cron.hourly routines (that's the "fake-hwclock" program) and on a clean shutdown/reboot.
As your time "jumped back" to 00:17:01, I will assume the cron.hourly for fake-hwclock worked.
So something else after that is causing the issue.
The "time going backwards" thing is a normal consequence of the Pi stopping abnormally. It doesn't have a battery powered Real Time Clock like a PC does, so when it turns off it loses the current time and date. It will get the current time/date from the internet time servers during the boot, but initially it works to the last time it "remembers" - which is stored at 17 minutes past each hour from the cron.hourly routines (that's the "fake-hwclock" program) and on a clean shutdown/reboot.
As your time "jumped back" to 00:17:01, I will assume the cron.hourly for fake-hwclock worked.
So something else after that is causing the issue.
Unreadable squiggle
Re: Pi zero crashes after cron.hourly
That's what I'm thinking. But how do I find out? There is nothing in the logs anywhere after that..
Re: Pi zero crashes after cron.hourly
Well, your old time was 2019-08-21 00:25:21.926313, so it looks like something that runs at about 25 minutes past midnight - or is triggered earlier, but causes the crash at about 25 past.
It does sound like something that runs from a cron or crontab file. Identifying it could be tricky.
Can you modify your timestamping program to log what was running at the time as well, using "ps -efl" and log that to a file. Maybe that will catch the problem.
Ideally log to severl files, one per minute between 00:15 and 00:30 should be sufficient.
It does sound like something that runs from a cron or crontab file. Identifying it could be tricky.
Can you modify your timestamping program to log what was running at the time as well, using "ps -efl" and log that to a file. Maybe that will catch the problem.
Ideally log to severl files, one per minute between 00:15 and 00:30 should be sufficient.
Unreadable squiggle
Re: Pi zero crashes after cron.hourly
The thing is, it doesn't occur on the same time of day. It does always occur after the cron.hourly job though.
I'll keep copies of ps -efl, that's a good idea. They're not that big and I'll save them timestamped.
You'll hear from me after the next crash.
Thanks
Efjee
I'll keep copies of ps -efl, that's a good idea. They're not that big and I'll save them timestamped.
You'll hear from me after the next crash.
Thanks
Efjee
Re: Pi zero crashes after cron.hourly
But not immediately after. That job runs every hour, so yes, something that crashes the Pi will always happen at sometime after the most recent run of cron.hourly.
I suspect if you disabled cron.hourly in /etc/crontab you will still get the crashes.
Unreadable squiggle
Re: Pi zero crashes after cron.hourly
Good point! Hadn't thought of it that way. It's been stable and faithfully logging ever since my last post so I'll have to see what happens.
Efjee
Re: Pi zero crashes after cron.hourly
It Crashed again at 19:17 (from the syslog) but in reality it was about 19:28:xx
So now I have quite a lot of logs to compare, what would be the best thing to look for?
syslog:
Aug 24 19:14:20 pingpong rngd[262]: stats: Time spent starving for entropy: (min=0; avg=0.000; max=0)us
Aug 24 19:17:01 pingpong CRON[16715]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Aug 24 19:17:21 pingpong systemd-modules-load[68]: Module 'bcm2835_wdt' is builtin
Aug 24 19:17:21 pingpong systemd-modules-load[68]: Module 'bcm2835_wdt' is builtin
Aug 24 19:17:21 pingpong fake-hwclock[73]: Sat 24 Aug 19:17:01 UTC 2019
Aug 24 19:17:21 pingpong systemd-fsck[93]: e2fsck 1.44.5 (15-Dec-2018)
The last dump made by my script. One is made every 25 seconds. It attempted one more at 19:28:06 but it's blank. I think it's safe to say it crashed within 25 seconds after that.
So now I have quite a lot of logs to compare, what would be the best thing to look for?
syslog:
Aug 24 19:14:20 pingpong rngd[262]: stats: Time spent starving for entropy: (min=0; avg=0.000; max=0)us
Aug 24 19:17:01 pingpong CRON[16715]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Aug 24 19:17:21 pingpong systemd-modules-load[68]: Module 'bcm2835_wdt' is builtin
Aug 24 19:17:21 pingpong systemd-modules-load[68]: Module 'bcm2835_wdt' is builtin
Aug 24 19:17:21 pingpong fake-hwclock[73]: Sat 24 Aug 19:17:01 UTC 2019
Aug 24 19:17:21 pingpong systemd-fsck[93]: e2fsck 1.44.5 (15-Dec-2018)
The last dump made by my script. One is made every 25 seconds. It attempted one more at 19:28:06 but it's blank. I think it's safe to say it crashed within 25 seconds after that.
Code: Select all
24-08-2019 19:27:40 up 193 temp: 55.1 system load: 0.37 0.42 0.34
total used free shared buff/cache available
Mem: 378140 63664 186636 7532 127840 254760
Swap: 102396 0 102396
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
4 S root 1 0 0 80 0 - 8425 do_epo 16:13 ? 00:00:08 /sbin/init
1 S root 2 0 0 80 0 - 0 kthrea 16:13 ? 00:00:00 [kthreadd]
1 I root 4 2 0 60 -20 - 0 worker 16:13 ? 00:00:03 [kworker/0:0H-mmc_complete]
1 I root 6 2 0 60 -20 - 0 rescue 16:13 ? 00:00:00 [mm_percpu_wq]
1 S root 7 2 0 80 0 - 0 smpboo 16:13 ? 00:00:44 [ksoftirqd/0]
5 S root 8 2 0 80 0 - 0 devtmp 16:13 ? 00:00:00 [kdevtmpfs]
1 I root 9 2 0 60 -20 - 0 rescue 16:13 ? 00:00:00 [netns]
1 S root 11 2 0 80 0 - 0 watchd 16:13 ? 00:00:00 [khungtaskd]
1 S root 12 2 0 80 0 - 0 oom_re 16:13 ? 00:00:00 [oom_reaper]
1 I root 13 2 0 60 -20 - 0 rescue 16:13 ? 00:00:00 [writeback]
1 S root 14 2 0 80 0 - 0 kcompa 16:13 ? 00:00:00 [kcompactd0]
1 I root 15 2 0 60 -20 - 0 rescue 16:13 ? 00:00:00 [crypto]
1 I root 16 2 0 60 -20 - 0 rescue 16:13 ? 00:00:00 [kblockd]
1 S root 17 2 0 -40 - - 0 kthrea 16:13 ? 00:00:00 [watchdogd]
1 I root 18 2 0 60 -20 - 0 rescue 16:13 ? 00:00:00 [rpciod]
1 I root 19 2 0 60 -20 - 0 worker 16:13 ? 00:00:00 [kworker/u3:0-hci0]
1 I root 20 2 0 60 -20 - 0 rescue 16:13 ? 00:00:00 [xprtiod]
1 S root 23 2 0 80 0 - 0 kswapd 16:13 ? 00:00:00 [kswapd0]
1 I root 24 2 0 60 -20 - 0 rescue 16:13 ? 00:00:00 [nfsiod]
1 I root 35 2 0 60 -20 - 0 rescue 16:13 ? 00:00:00 [kthrotld]
1 I root 36 2 0 60 -20 - 0 rescue 16:13 ? 00:00:00 [iscsi_eh]
1 I root 37 2 0 60 -20 - 0 rescue 16:13 ? 00:00:00 [dwc_otg]
1 I root 38 2 0 60 -20 - 0 rescue 16:13 ? 00:00:00 [DWC Notificatio]
1 S root 39 2 0 61 -19 - 0 down_i 16:13 ? 00:01:35 [vchiq-slot/0]
1 S root 40 2 0 61 -19 - 0 down_i 16:13 ? 00:00:04 [vchiq-recy/0]
1 S root 41 2 0 60 -20 - 0 down_i 16:13 ? 00:00:00 [vchiq-sync/0]
1 S root 42 2 0 80 0 - 0 vchiq_ 16:13 ? 00:00:00 [vchiq-keep/0]
1 S root 43 2 0 70 -10 - 0 down_i 16:13 ? 00:00:00 [SMIO]
1 S root 44 2 0 9 - - 0 irq_th 16:13 ? 00:00:00 [irq/86-mmc1]
1 I root 47 2 0 60 -20 - 0 rescue 16:13 ? 00:00:00 [mmc_complete]
1 S root 51 2 0 80 0 - 0 kjourn 16:13 ? 00:00:00 [jbd2/mmcblk0p2-]
1 I root 52 2 0 60 -20 - 0 rescue 16:13 ? 00:00:00 [ext4-rsv-conver]
1 I root 53 2 0 60 -20 - 0 rescue 16:13 ? 00:00:00 [ipv6_addrconf]
4 S root 68 1 0 80 0 - 6481 do_epo 16:14 ? 00:00:03 /lib/systemd/systemd-journald
4 S root 108 1 0 80 0 - 4535 do_epo 16:14 ? 00:00:01 /lib/systemd/systemd-udevd
1 S root 119 2 0 70 -10 - 0 vc_sm_ 16:14 ? 00:00:00 [SMIO]
1 I root 127 2 0 60 -20 - 0 rescue 16:14 ? 00:00:00 [mmal-vchiq]
1 I root 129 2 0 60 -20 - 0 rescue 16:14 ? 00:00:00 [mmal-vchiq]
1 I root 135 2 0 60 -20 - 0 rescue 16:14 ? 00:00:00 [mmal-vchiq]
1 I root 136 2 0 60 -20 - 0 rescue 16:14 ? 00:00:00 [mmal-vchiq]
1 I root 137 2 0 60 -20 - 0 rescue 16:14 ? 00:00:00 [mmal-vchiq]
1 I root 160 2 0 60 -20 - 0 rescue 16:14 ? 00:00:00 [cfg80211]
1 I root 164 2 0 60 -20 - 0 rescue 16:14 ? 00:00:00 [brcmf_wq/mmc1:0]
1 S root 165 2 0 80 0 - 0 brcmf_ 16:14 ? 00:00:00 [brcmf_wdog/mmc1]
1 I root 184 2 0 80 0 - 0 worker 16:14 ? 00:00:01 [kworker/u2:2-events_unbound]
4 S systemd+ 202 1 0 80 0 - 5588 do_epo 16:14 ? 00:00:01 /lib/systemd/systemd-timesyncd
4 S root 248 1 0 80 0 - 6151 poll_s 16:14 ? 00:00:00 /usr/sbin/rsyslogd -n -iNONE
4 S root 251 1 0 80 0 - 6904 do_epo 16:14 ? 00:00:00 /usr/sbin/cupsd -l
4 S root 252 1 0 99 19 - 919 poll_s 16:14 ? 00:00:00 /usr/sbin/alsactl -E HOME=/run/alsa -s -n 19 -c rdaemon
1 S root 262 1 0 80 0 - 6910 hrtime 16:14 ? 00:00:03 /usr/sbin/rngd -r /dev/hwrng
4 S root 270 1 0 80 0 - 1989 hrtime 16:14 ? 00:00:00 /usr/sbin/cron -f
4 S avahi 276 1 0 80 0 - 1439 poll_s 16:14 ? 00:00:00 avahi-daemon: running [pingpong.local]
4 S root 282 1 0 80 0 - 3245 do_epo 16:14 ? 00:00:00 /lib/systemd/systemd-logind
4 S nobody 288 1 0 80 0 - 1076 poll_s 16:14 ? 00:00:00 /usr/sbin/thd --triggers /etc/triggerhappy/triggers.d/ --socket /run/thd.socket --user nobody --deviceglob /dev/input/event*
4 S message+ 292 1 0 80 0 - 1636 do_epo 16:14 ? 00:00:01 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
1 S avahi 296 276 0 80 0 - 1439 unix_s 16:14 ? 00:00:00 avahi-daemon: chroot helper
4 S root 309 1 0 80 0 - 2671 poll_s 16:14 ? 00:00:00 /sbin/wpa_supplicant -u -s -O /run/wpa_supplicant
4 S root 330 1 0 80 0 - 10075 poll_s 16:14 ? 00:00:00 /usr/sbin/cups-browsed
5 S root 344 1 0 80 0 - 2769 poll_s 16:14 ? 00:00:00 wpa_supplicant -B -c/etc/wpa_supplicant/wpa_supplicant.conf -iwlan0 -Dnl80211,wext
1 I root 368 2 0 60 -20 - 0 worker 16:14 ? 00:00:00 [kworker/u3:1-hci0]
1 S root 369 1 0 80 0 - 531 poll_s 16:14 ? 00:00:00 /usr/bin/hciattach /dev/serial1 bcm43xx 3000000 flow - b8:27:eb:14:9a:0c
4 S root 372 1 0 80 0 - 2380 poll_s 16:14 ? 00:00:00 /usr/lib/bluetooth/bluetoothd
5 S root 427 1 0 80 0 - 720 do_epo 16:14 ? 00:00:00 /sbin/dhcpcd -q -w
4 S root 443 1 0 80 0 - 1389 inet_c 16:14 ? 00:00:00 /usr/sbin/vsftpd /etc/vsftpd.conf
4 S root 446 1 2 80 0 - 53781 poll_s 16:14 ? 00:05:41 /usr/bin/python3 /usr/bin/fail2ban-server -xf start
4 S root 448 1 0 80 0 - 2667 poll_s 16:14 ? 00:00:00 /usr/sbin/sshd -D
4 S root 449 1 0 80 0 - 4239 do_wai 16:14 ? 00:00:16 /usr/bin/python /home/pi/control/control.py
4 S root 450 1 0 80 0 - 3136 poll_s 16:14 ? 00:01:23 /usr/bin/python /home/pi/control/smart.py
1 S root 455 1 0 80 0 - 47886 poll_s 16:14 ? 00:00:02 /usr/sbin/apache2 -k start
5 S www-data 456 455 0 80 0 - 47910 inet_c 16:14 ? 00:00:00 /usr/sbin/apache2 -k start
5 S www-data 457 455 0 80 0 - 47896 inet_c 16:14 ? 00:00:00 /usr/sbin/apache2 -k start
5 S www-data 458 455 0 80 0 - 47910 inet_c 16:14 ? 00:00:00 /usr/sbin/apache2 -k start
5 S www-data 459 455 0 80 0 - 47904 inet_c 16:14 ? 00:00:00 /usr/sbin/apache2 -k start
5 S www-data 460 455 0 80 0 - 47896 inet_c 16:14 ? 00:00:00 /usr/sbin/apache2 -k start
4 S www-data 484 1 0 80 0 - 3635 do_epo 16:14 ? 00:00:00 /lib/systemd/systemd --user
5 S www-data 487 484 0 80 0 - 4200 sys_rt 16:14 ? 00:00:00 (sd-pam)
0 S www-data 505 1 0 80 0 - 890 do_wai 16:14 ? 00:00:00 raspimjpeg
1 S www-data 508 505 6 80 0 - 24026 hrtime 16:14 ? 00:11:38 raspimjpeg
0 S www-data 610 1 0 80 0 - 15004 hrtime 16:14 ? 00:01:26 php /var/www/html/schedule.php
4 S root 619 1 0 80 0 - 1073 poll_s 16:14 tty1 00:00:00 /sbin/agetty -o -p -- \u --noclear tty1 linux
5 S root 627 1 0 58 - - 480 hrtime 16:14 ? 00:00:05 /usr/sbin/watchdog
5 S www-data 1081 455 0 80 0 - 47904 inet_c 16:18 ? 00:00:00 /usr/sbin/apache2 -k start
1 I root 15852 2 0 80 0 - 0 worker 19:07 ? 00:00:00 [kworker/0:0-events_power_efficient]
1 I root 16819 2 0 60 -20 - 0 worker 19:18 ? 00:00:00 [kworker/0:2H]
1 I root 16893 2 0 80 0 - 0 worker 19:18 ? 00:00:00 [kworker/0:1-events]
1 I root 17066 2 0 80 0 - 0 worker 19:20 ? 00:00:00 [kworker/u2:0-flush-179:0]
1 I root 17297 2 0 60 -20 - 0 worker 19:23 ? 00:00:00 [kworker/0:1H]
1 I root 17431 2 0 80 0 - 0 worker 19:25 ? 00:00:00 [kworker/0:2-events_power_efficient]
4 S root 17523 448 0 80 0 - 3042 poll_s 19:26 ? 00:00:00 sshd: root [priv]
5 S sshd 17525 17523 0 80 0 - 2693 poll_s 19:26 ? 00:00:00 sshd: root [net]
4 S root 17526 448 0 80 0 - 3042 poll_s 19:26 ? 00:00:00 sshd: unknown [priv]
5 S sshd 17527 17526 0 80 0 - 2693 poll_s 19:26 ? 00:00:00 sshd: unknown [net]
1 I root 17530 2 0 80 0 - 0 worker 19:26 ? 00:00:00 [kworker/u2:1]
0 S root 17630 449 0 80 0 - 481 do_wai 19:27 ? 00:00:00 sh -c ps -efl >> /home/pi/control/efl/20190824_192740_193_efl.log
0 R root 17631 17630 0 80 0 - 2443 - 19:27 ? 00:00:00 ps -efl
Last edited by Florisjan on Mon Aug 26, 2019 11:04 am, edited 1 time in total.
Re: Pi zero crashes after cron.hourly
Fail2ban and raspimjpeg work the most.
I’d suspect one of these.
I’d suspect one of these.
"S'il n'y a pas de solution, c'est qu'il n'y a pas de problème." Les Shadoks, J. Rouxel
Re: Pi zero crashes after cron.hourly
I was hesitant to do an upgrade, because I won't be near the pi for another few days, but did it anyway because these crashed won't do it a lot of good either. Ever since it hasn't crashed. Fingers crossed.