hrsmjjames
Posts: 13
Joined: Fri Jan 12, 2018 4:38 pm

After several days my Pi's stop being able to launch new processes nor allow SSH connections

Mon Mar 12, 2018 1:57 pm

Hi all,

We are running Raspbian Stretch, and our Mono-based application runs fine. However, we've found that after several days (it can be 2 days, it can be 7) our Pis just hang.

Kernel: Linux raspberrypi 4.9.41-v7+ #1023 SMP Tue Aug 8 16:00:15 BST 2017 armv7l GNU/Linux

Our symptoms:
  • The devices ping
  • Unable to make SSH connections (it prompts for a login, but when you hit Enter after your username it comes back with "server disconnected")
  • We are unable to launch new processes via our UI
Does anyone have any recommendations on what to look for?
I can't see anything obvious in syslog.

Could this be a slow open-file (file descriptor) leak?
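
One rough way to test the leak theory is to log per-process descriptor counts over a few days and see whether any process keeps climbing. The sketch below assumes a Linux /proc filesystem; the function names and the `proc_root` parameter are invented for illustration and are not taken from the application in question.

```python
import os

def count_open_fds(pid, proc_root="/proc"):
    """Return how many file descriptors pid has open, or None if unreadable."""
    try:
        return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))
    except OSError:
        # The process exited, or we lack permission to inspect it.
        return None

def top_fd_users(limit=5, proc_root="/proc"):
    """Return up to `limit` (fd_count, pid) pairs, largest count first."""
    counts = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        n = count_open_fds(int(entry), proc_root)
        if n is not None:
            counts.append((n, int(entry)))
    return sorted(counts, reverse=True)[:limit]

if __name__ == "__main__" and os.path.isdir("/proc"):
    for n, pid in top_fd_users():
        print(f"pid {pid}: {n} open descriptors")
```

Run as root from cron and appended to a file, a steadily growing count for one pid would support the leak theory; flat counts would point elsewhere.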

Any help would be appreciated.
Last edited by hrsmjjames on Mon Mar 12, 2018 2:02 pm, edited 1 time in total.

hrsmjjames
Posts: 13
Joined: Fri Jan 12, 2018 4:38 pm

Re: After several days my Pi's stop being able to launch new processes nor allow SSH connections

Mon Mar 12, 2018 1:58 pm

I forgot to add: it sounds very similar to viewtopic.php?t=156287, but we don't use a camera, nor do we have the library mentioned at the end of that post installed.

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 26716
Joined: Sat Jul 30, 2011 7:41 pm

Re: After several days my Pi's stop being able to launch new processes nor allow SSH connections

Mon Mar 12, 2018 2:30 pm

Do you have a keyboard/screen on any of the problem devices? Can you run dmesg to see what has been going on?
Principal Software Engineer at Raspberry Pi (Trading) Ltd.
Contrary to popular belief, humorous signatures are allowed.
I've been saying "Mucho" to my Spanish friend a lot more lately. It means a lot to him.

hrsmjjames
Posts: 13
Joined: Fri Jan 12, 2018 4:38 pm

Re: After several days my Pi's stop being able to launch new processes nor allow SSH connections

Mon Mar 12, 2018 4:20 pm

Alas, we don't; it appears to happen on our remote devices. After we get the devices manually rebooted, syslog shows only the periodic jobs running every minute, then the reboot and the system starting back up.

We are trying to recreate the issue locally with a terminal open, a keyboard and mouse attached, and even a serial console, to see if anything kernel-related happens, but so far we are struggling to reproduce it.

From one of our syslogs from last week we have:

Mar  1 06:25:02 raspberrypi liblogging-stdlog:  [origin software="rsyslogd" swVersion="8.24.0" x-pid="311" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
Mar  1 06:25:03 raspberrypi liblogging-stdlog:  [origin software="rsyslogd" swVersion="8.24.0" x-pid="311" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
Mar  1 06:26:01 raspberrypi CRON[7930]: (root) CMD (monitorMSite >/dev/null 2>&1)
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@Mar  1 06:17:03 raspberrypi kernel: [    0.000000] Booting Linux on physical CPU 0x0
Mar  1 06:17:03 raspberrypi kernel: [    0.000000] Linux version 4.9.41-v7+ (dc4@dc4-XPS13-9333) (gcc version 4.9.3 (crosstool-NG crosstool-ng-1.22.0-88-g8460611) ) #1023 SMP Tue Aug 8 16:00:15 BST 2017

monitorMSite is a process that runs every minute, ensures our app is running, and writes an "I'm alive" message to a file for us.

scruss
Posts: 3256
Joined: Sat Jun 09, 2012 12:25 pm
Location: Toronto, ON
Contact: Website

Re: After several days my Pi's stop being able to launch new processes nor allow SSH connections

Mon Mar 12, 2018 4:35 pm

Running out of memory/disk space? If you're logging your keepalive to /tmp, remember it's in RAM.
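
A quick way to rule memory and disk in or out is to log both periodically. This is only a sketch, assuming a Linux /proc/meminfo; the helper names are made up for illustration, and whether /tmp is RAM-backed depends on how the image is configured.

```python
import os
import shutil

def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: value} dict (values mostly in kB)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info

def percent_used(path):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total

if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        mem = parse_meminfo(f.read())
    print("MemAvailable kB:", mem.get("MemAvailable"))
    for path in ("/", "/tmp"):
        print(f"{path}: {percent_used(path):.1f}% used")
```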
‘Remember the Golden Rule of Selling: “Do not resort to violence.”’ — McGlashan.
Pronouns: he/him

hrsmjjames
Posts: 13
Joined: Fri Jan 12, 2018 4:38 pm

Re: After several days my Pi's stop being able to launch new processes nor allow SSH connections

Mon Mar 12, 2018 5:09 pm

Memory definitely isn't the issue, as our keepalive checks this and simply overwrites the file.

I did notice that my last syslog excerpt was missing additional log entries that were in the rolled-over file.

Mar  1 06:24:03 raspberrypi systemd[1]: Stopped User Manager for UID 1001.
Mar  1 06:24:03 raspberrypi systemd[1]: Removed slice User Slice of administrator.
Mar  1 06:25:02 raspberrypi CRON[7761]: (root) CMD (monitorMSite >/dev/null 2>&1)
Mar  1 06:25:02 raspberrypi CRON[7764]: (root) CMD (test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily ))

I'm now checking the logs across all the devices to see if there is a pattern, i.e. does it always happen when cron.daily runs after x days?
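
For comparing many devices, something like the following could pull out the last entry written before each boot. It is a sketch only: it assumes the kernel's "Booting Linux on physical CPU" line marks each boot, as in the excerpt earlier in the thread, strips the NUL bytes (displayed as ^@) that appear there, and uses /var/log/syslog as an assumed default path. The function name is hypothetical.

```python
import os
import sys

BOOT_MARKER = "Booting Linux on physical CPU"

def last_lines_before_boot(lines):
    """Return the last non-empty log line seen before each boot marker."""
    results = []
    previous = None
    for raw in lines:
        # Drop NUL padding left behind by an unclean shutdown.
        line = raw.replace("\x00", "").strip()
        if not line:
            continue
        if BOOT_MARKER in line:
            if previous is not None:
                results.append(previous)
            previous = None
        else:
            previous = line
    return results

if __name__ == "__main__":
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/syslog"
    if os.path.exists(path):
        with open(path, errors="replace") as f:
            for entry in last_lines_before_boot(f):
                print(entry)
```

If the last entry before every hang is the cron.daily line, that would point at one of the daily jobs; if it is just whichever minute job ran last, the timing is probably unrelated to cron.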

scruss
Posts: 3256
Joined: Sat Jun 09, 2012 12:25 pm
Location: Toronto, ON
Contact: Website

Re: After several days my Pi's stop being able to launch new processes nor allow SSH connections

Mon Mar 12, 2018 5:49 pm

Is your keepalive cron job every minute taking longer than a minute to run sometimes?
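
If overlapping runs turn out to be a concern, one common guard is a non-blocking file lock taken at the start of the job. The sketch below is not how monitorMSite works (the thread doesn't say); the lock path and function name are hypothetical.

```python
import fcntl
import sys

LOCK_PATH = "/tmp/monitor.lock"  # hypothetical location for the lock file

def try_lock(path):
    """Return an open file holding an exclusive lock, or None if it is already held."""
    handle = open(path, "w")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle  # keep this object alive for as long as the job runs

if __name__ == "__main__":
    lock = try_lock(LOCK_PATH)
    if lock is None:
        sys.exit("previous run still active, skipping this one")
    # ... the keepalive work would go here ...
```

The lock is released automatically when the process exits, so a crashed run does not block later ones.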

hrsmjjames
Posts: 13
Joined: Fri Jan 12, 2018 4:38 pm

Re: After several days my Pi's stop being able to launch new processes nor allow SSH connections

Mon Mar 12, 2018 6:04 pm

The keepalive is more of an "I'm alive" message; it doesn't take any hardware actions such as rebooting.

hrsmjjames
Posts: 13
Joined: Fri Jan 12, 2018 4:38 pm

Re: After several days my Pi's stop being able to launch new processes nor allow SSH connections

Tue Mar 13, 2018 4:34 pm

It appears to happen after the devices have been running for 10 days.
Again, it doesn't happen to all the devices, just a number of them out in the field.

hrsmjjames
Posts: 13
Joined: Fri Jan 12, 2018 4:38 pm

Re: After several days my Pi's stop being able to launch new processes nor allow SSH connections

Wed Mar 14, 2018 9:54 am

OK so I've reproduced this in house.


dmesg shows:

mmc0: card never left busy state
mmc0: error -110 whilst initialising SD Card
EXT4-fs error (device mmcblk0p2): previous I/O error to superblock detected

We are using Kingston Industrial SD cards, so I know the cards are up to the task.

Also, simply rebooting these brings them back to life
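
To check other devices for the same messages, the kernel log can be filtered for SD and filesystem errors. A minimal sketch follows; the patterns are guesses based only on the three lines above, and the function name is made up. Once the card has dropped off the bus, new processes probably cannot be started, so this is mainly useful in a session that is already running or against saved dmesg output.

```python
import re
import shutil
import subprocess

# Patterns derived from the messages quoted above; extend as needed.
PATTERNS = [
    re.compile(r"mmc\d+: .*(error|never left busy)"),
    re.compile(r"EXT4-fs error"),
    re.compile(r"I/O error"),
]

def storage_errors(dmesg_text):
    """Return the lines of dmesg output that look like SD card or ext4 errors."""
    return [
        line
        for line in dmesg_text.splitlines()
        if any(p.search(line) for p in PATTERNS)
    ]

if __name__ == "__main__" and shutil.which("dmesg"):
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=False)
    for line in storage_errors(result.stdout):
        print(line)
```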

Can anyone advise further?
Attachment: IMG_0477_resize.jpg (201.72 KiB)

jpristel
Posts: 1
Joined: Tue May 19, 2020 5:58 pm

Re: After several days my Pi's stop being able to launch new processes nor allow SSH connections

Tue May 19, 2020 6:00 pm

Did you ever find a solution for this? I'm seeing a similar issue.
