ubee
Posts: 4
Joined: Thu Oct 22, 2015 3:50 pm

Headless heating control system hangs

Thu Oct 22, 2015 4:04 pm

I have implemented a heating control system on a Pi running the Domoticz home automation server, a few web pages and a control loop implemented as a Python script. I'm running this system headless, and my problem is that the system hangs after a week (or so) of flawless operation. The Apache web server does not respond and I can't log in remotely with ssh (PuTTY). I have implemented a basic watchdog function that monitors the Domoticz process and the Python script. Besides this, the watchdog checks network connectivity by pinging my local router. The watchdog is implemented as a shell script executed as a cron job every 10 minutes.

The thing is, when the system stops responding, the watchdog still kicks in every 10 minutes and everything looks OK. The Domoticz and Python processes are up and running, and I have network connectivity. So the system seems to run quite OK, but I can't log in or open any web page. If I try to log in with PuTTY, I can see the sequence where SSH is validating keys etc., but it looks like the system fails to start a shell session. Nothing happens after the init sequence. No prompt, nothing.
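The symptom described here (processes alive, but no page is served) suggests the watchdog could also test whether the web server actually answers, not only whether it exists. A minimal sketch in Python, assuming the Pi serves some page over plain HTTP; the function name `http_responds` and the timeout are illustrative choices, not part of the original setup:

```python
import http.client
import urllib.error
import urllib.request

# Bypass any configured proxy: this is a check of a local server.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def http_responds(url, timeout=10.0):
    """Return True if the server at url sends back an HTTP response in time."""
    try:
        with _opener.open(url, timeout=timeout) as response:
            response.read(1)  # make sure some data really arrives
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status (4xx/5xx): it is still alive.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Refused, timed out, reset or garbled reply: treat as not responding.
        return False
```

Calling `http_responds("http://localhost/")` from the watchdog and logging a `False` result would show whether Apache had stopped answering even though its process was still listed.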

Does anyone have a clue what could be the reason behind this strange behaviour? I can't see anything strange in the log-files under /var/log.

stderr
Posts: 2178
Joined: Sat Dec 01, 2012 11:29 pm

Re: Headless heating control system hangs

Thu Oct 22, 2015 11:15 pm

> The thing is, when the system stops responding, the watchdog still kicks
> in every 10 minutes and everything looks OK
>
You know this from the log files? The watchdog is on the RPi, right? If it's a "watchdog", it should be able to test if each process that matters is running and restart the ones that are not, logging failures.
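A watchdog of that kind could look roughly like the sketch below. It assumes `pgrep` is available on the system; the names `is_running` and `check_and_restart`, and whatever restart command is passed in, are hypothetical and would need to match the real setup:

```python
import logging
import subprocess

log = logging.getLogger("watchdog")


def is_running(pattern):
    """Return True if pgrep finds a process whose command line matches pattern."""
    return subprocess.call(["pgrep", "-f", pattern],
                           stdout=subprocess.DEVNULL) == 0


def check_and_restart(name, pattern, restart_cmd,
                      is_running_fn=is_running, run_fn=subprocess.call):
    """Check one process; restart it and log the failure if it is missing.

    restart_cmd is an argument list supplied by the caller (an assumption:
    the real restart command depends on how the service was installed).
    Returns "ok", "restarted" or "restart_failed".
    """
    if is_running_fn(pattern):
        return "ok"
    log.warning("%s is not running, restarting", name)
    if run_fn(restart_cmd) == 0:
        log.info("%s restarted", name)
        return "restarted"
    log.error("restart of %s failed", name)
    return "restart_failed"
```

The check and the restart are passed in as functions so the logic can be tried out without touching real services.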

Is this a 512Meg board? Are you running all this under the command line only or are you running X too? What if you are running out of memory, I'm wondering. Does your system really need the full Apache or could a more modest web server do things in a lighter way?

I would just stay logged in over ssh all the time. You certainly need to have a way to at least try to find out what is going on at those times when you cannot restart ssh or anything else. I've certainly seen cases where I can ping a machine running Linux but can't ssh into it because of lack of memory. Consider using something like top to log your situation over time. But just ssh'ing in and leaving top running might be enough.
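As an alternative to leaving top running, memory could be logged over time from a cron job. This is only a sketch: it reads the standard Linux `/proc/meminfo` file, and the function names, the log format and the way "used" is computed (total minus free, buffers and cache) are assumptions made for illustration:

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict of field name -> integer value.

    Most fields are in kB; a few (such as HugePages_Total) are plain counts.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def used_percent(info):
    """Percent of RAM in use, not counting free memory, buffers and cache."""
    total = info["MemTotal"]
    reclaimable = (info.get("MemFree", 0) + info.get("Buffers", 0)
                   + info.get("Cached", 0))
    return 100.0 * (total - reclaimable) / total


def log_memory(log_path, meminfo_path="/proc/meminfo"):
    """Append one timestamped memory line to log_path and return it."""
    with open(meminfo_path) as f:
        info = parse_meminfo(f.read())
    line = "%s used=%.1f%% free_kB=%d swap_free_kB=%d\n" % (
        time.strftime("%Y-%m-%d %H:%M:%S"),
        used_percent(info),
        info.get("MemFree", 0),
        info.get("SwapFree", 0),
    )
    with open(log_path, "a") as out:
        out.write(line)
    return line
```

Running `log_memory` every few minutes leaves a record on the card that can be read after a reboot, even if nobody could log in while the problem was happening.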

cleverca22
Posts: 483
Joined: Sat Aug 18, 2012 2:33 pm

Re: Headless heating control system hangs

Fri Oct 23, 2015 8:14 am

another option is to use "setterm -blank 0 -powerdown 0" in rc.local to disable the power saving features (they don't save power anyways)

then the screen will work when you plug in a tv, even if it has locked up fully

ubee
Posts: 4
Joined: Thu Oct 22, 2015 3:50 pm

Re: Headless heating control system hangs

Fri Oct 23, 2015 12:35 pm

Thanks for your input. Yes, I monitor my "watchdog" by means of a log file, so I know it is running and the monitored processes as well. I have 512 MB of memory on this Pi, so you might be on to something in pointing to memory shortage as a potential root cause of my problem.

Actually, Domoticz monitors internal memory usage. I attach a screen dump of the graph. I had a period of no access to the system between Nov 16th and 20th. I had a connectivity problem on the 20th, which triggered a reboot, and the system recovered from the problem. But the memory peak actually occurred before the system stopped responding by the 20th, and that puzzles me. I don't know if 70% of the memory used is an alarming figure. I might replace the HW and use an RPi 2 instead, which would give me more margin.
[Attachment: Capture.JPG, Domoticz memory usage graph]

stderr
Posts: 2178
Joined: Sat Dec 01, 2012 11:29 pm

Re: Headless heating control system hangs

Fri Oct 23, 2015 7:04 pm

> you might touch on something referring to memory shortage
>
It seems like the sort of behaviour. I don't have the RPi2 but what I read seems to say it can't run the old OS directly. If that's true, then you couldn't just swap the card and test without any effort. Pity.

> But the memory peak actually occurred before the system stopped
> responding by the 20th, and that puzzles me.
>
What if the system is killing processes because it is running low on memory and the effect of those killed processes takes some time to matter to your operations? I mean sometimes things are killed and they restart and that's fine, but some things aren't so good at doing that or they might try a limited number of times.

If you are running without any swap, which on cheap cards from the discount place (or wherever) is probably a good idea, the system will kill processes when memory runs out because it has no other option. Switching to a board with a gig of RAM and/or lightening the software a bit would seem sensible. I would also just stay logged in over ssh to the RPi so you hopefully can keep access when it otherwise won't let you log back in.
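One way to test this theory would be to search the kernel log for out-of-memory kills. The sketch below assumes the usual Linux kernel wording, where a kill is reported with a line containing "Killed process <pid> (<name>)"; the exact text varies between kernel versions, and where the kernel log ends up (for example a file under /var/log or the output of `dmesg`) depends on the system, so both are assumptions to verify:

```python
import re

# Matches the kernel's report of a process it has killed. The wording is an
# assumption based on common kernel versions and may need adjusting.
OOM_KILL = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(lines):
    """Return a list of (pid, process name, full line) for each OOM kill found."""
    kills = []
    for line in lines:
        match = OOM_KILL.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2),
                          line.rstrip("\n")))
    return kills
```

Passing an open log file to `find_oom_kills` (a file object iterates over its lines) would show which processes were killed and when, which could then be compared with the memory peak in the graph.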

ubee
Posts: 4
Joined: Thu Oct 22, 2015 3:50 pm

Re: Headless heating control system hangs

Sat Oct 24, 2015 12:48 pm

Sounds like a reasonable theory. After the peak on Nov 16th, Raspbian starts to kill processes, and then I get stuck. I will add a new cron job that reboots the system once a day. I guess this is easier than nailing down what actually consumes memory over time. Or replacing the board... ;)
