TimG
Posts: 267
Joined: Tue Apr 03, 2012 12:15 am
Location: Switzerland

Watchdog "ping" function

Mon Dec 02, 2013 3:59 pm

I've been using watchdog (http://linux.die.net/man/8/watchdog) to automatically reset my RPi if it gets stuck. It runs a daemon which monitors CPU load, free memory, network status and other things, and triggers a reboot if certain conditions are met.

Unfortunately there seems to be a problem with version 5.12 in the repositories. The ping function always fails, sending the RPi into an endless cycle of reboots. If you try it, be sure to disable the automatic restart of the watchdog daemon first:
- in /etc/default/watchdog set run_watchdog=0
- then run the watchdog manually with sudo /usr/sbin/watchdog -v and monitor the output with tail -f /var/log/daemon.log
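The first step can also be scripted. This is only a minimal sketch, assuming the file contains a plain `run_watchdog=<0|1>` line as in the stock defaults file; the function name `set_run_watchdog` is made up for illustration:

```shell
#!/bin/sh
# set_run_watchdog FILE VALUE
# Hypothetical helper: rewrite the run_watchdog= line in FILE to VALUE (0 or 1).
# Assumes the line starts at column 0, as in the stock /etc/default/watchdog.
set_run_watchdog() {
    file=$1
    value=$2
    case $value in
        0|1) ;;
        *) echo "value must be 0 or 1" >&2; return 2 ;;
    esac
    # -i.bak keeps a backup copy next to the original file.
    sed -i.bak "s/^run_watchdog=.*/run_watchdog=$value/" "$file"
}
```

Called as root with `set_run_watchdog /etc/default/watchdog 0`, it leaves a `.bak` copy behind so the change is easy to undo.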

The latest version (v5.13 available here: http://sourceforge.net/projects/watchdo ... t/download) doesn't seem to have this problem, although it needs to be compiled. Luckily it's not difficult, and there are instructions in the included INSTALL file. I've also made a deb, which you can find here: http://tgiles.icern.ch/binaries/watchdo ... _armhf.deb. Install in the usual way, with sudo dpkg -i watchdog_5.13-1_armhf.deb

Hope that's useful to somebody.

TimG
Posts: 267
Joined: Tue Apr 03, 2012 12:15 am
Location: Switzerland

Re: Watchdog "ping" function

Mon Dec 02, 2013 6:18 pm

Update: although the ping function seems OK over wired Ethernet, it's less reliable over WiFi. I'm still looking for a way to fix it. Meanwhile, beware of infinite reboot loops.

TimG
Posts: 267
Joined: Tue Apr 03, 2012 12:15 am
Location: Switzerland

Re: Watchdog "ping" function

Tue Dec 03, 2013 11:29 pm

I think I've sussed it. The installation package creates two sets of startup scripts: one for the main watchdog program and one for another binary in the same package called wd_keepalive. (The naming is slightly confusing: "watchdog" can refer to either the hardware timer which resets the CPU, or the software daemon which controls the reset timer.) Anyway, to cut a long story short, wd_keepalive conflicts with the watchdog daemon, so its startup scripts need to be removed. Once this is done the ping test runs correctly.

Here's the complete installation recipe:
  • $ wget http://tgiles.icern.ch/binaries/watchdo ... _armhf.deb
  • $ sudo dpkg -i watchdog_5.13-1_armhf.deb # Install Watchdog daemon
  • $ sudo update-rc.d -f wd_keepalive remove # Remove conflicting init scripts
  • $ sudo nano /etc/default/watchdog # This file should contain the following:
  • [code]
# Start watchdog at boot time? 0 or 1
run_watchdog=1
# Load module before starting watchdog
watchdog_module="bcm2708_wdog"
# Specify additional watchdog options here (see manpage).
[/code]
  • $ sudo nano /etc/watchdog.conf # This is my configuration; alter to taste
  • [code]
ping = 192.168.0.1
ping-count = 3
max-load-1 = 24
min-memory = 1024
watchdog-device = /dev/watchdog
interval = 5
realtime = yes
priority = 1
[/code]
  • $ sudo /etc/init.d/watchdog restart # Start the watchdog daemon

For those who want the gruesome details: wd_keepalive is supposed to prevent the watchdog (hardware) from timing out during boot. After boot-up, watchdog (software) is meant to kill wd_keepalive and take over. Problems with the startup scripts mean that this isn't happening: wd_keepalive continues to run and conflicts with watchdog (software), causing the watchdog (hardware) to time out prematurely. In fact wd_keepalive isn't needed, because the watchdog (hardware) isn't enabled until watchdog (software) loads. There, clear as mud.
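To see whether the conflict described above is present on a running system, something like the following could be used. It is a rough sketch only: the function names `is_running` and `report_conflict` are invented here, and it assumes `pgrep` (from procps) is installed and that the two processes are named exactly `watchdog` and `wd_keepalive`.

```shell
#!/bin/sh
# is_running NAME
# Succeeds if a process whose name is exactly NAME exists (pgrep -x = exact match).
is_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

# report_conflict
# Hypothetical check: prints a one-line diagnosis and returns 1 if both
# daemons are running at the same time, 0 otherwise.
report_conflict() {
    if is_running wd_keepalive && is_running watchdog; then
        echo "conflict: wd_keepalive and watchdog are both running"
        return 1
    fi
    echo "ok: no wd_keepalive/watchdog conflict"
    return 0
}
```

Running `report_conflict` after boot should print the "ok" line once the wd_keepalive init scripts have been removed.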

bwilly
Posts: 1
Joined: Tue Feb 25, 2014 5:21 pm

Re: Watchdog "ping" function

Tue Feb 25, 2014 5:37 pm

This worked. Thank you. I will re-post the important line and keywords, as it took a week or more of frustration before I came across this post.

The Raspberry Pi watchdog ping test does not work if you follow most of the configuration guides out there. You must also add the following line to the defaults file.

File: /etc/default/watchdog
Add line: watchdog_module="bcm2708_wdog"

That one change, in addition to normal config setup, solved my problem.
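A quick way to confirm the line is in place is to grep for it. This is a minimal sketch; the function name `has_wdog_module` is made up, and it assumes the setting sits on its own uncommented line:

```shell
#!/bin/sh
# has_wdog_module FILE
# Succeeds if FILE sets watchdog_module to bcm2708_wdog (quoted or unquoted).
# A commented-out line such as '#watchdog_module=...' does not count.
has_wdog_module() {
    grep -Eq '^[[:space:]]*watchdog_module="?bcm2708_wdog"?[[:space:]]*$' "$1"
}
```

For example, `has_wdog_module /etc/default/watchdog && echo "module line present"`.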

greavette
Posts: 83
Joined: Thu Oct 04, 2012 4:25 pm

Re: Watchdog "ping" function

Wed Jun 18, 2014 2:45 am

Hello,

I'm hoping that someone is still watching this thread. I've tried downloading the deb file from this post but it's not available any longer (http://linux.die.net/man/8/watchdog).

I've tried downloading the watchdog version from the jessie repo but now I'm getting a segmentation fault when I try to start /etc/init.d/watchdog.

Could the deb file be made available again so I can try installing it again?

Thank you.

dansrasppi
Posts: 1
Joined: Thu Jun 19, 2014 1:42 am

Re: Watchdog "ping" function

Thu Jun 19, 2014 1:47 am

I was able to get a working, non-segfaulting deb from:

http://mirrordirector.raspbian.org/rasp ... _armhf.deb

So far, ping seems to work fine.

greavette
Posts: 83
Joined: Thu Oct 04, 2012 4:25 pm

Re: Watchdog "ping" function

Fri Jun 20, 2014 9:49 pm

Thanks for the reply dansrasppi. Not sure what I did wrong previously but the link you provided is working well. I'm not getting the seg fault errors anymore and watchdog is running well.

I've done some testing by pointing the ping test at a computer to see whether my Pi reboots when that watched computer is shut down. It brings up a few questions for me now:

  • Can I watch more than one computer? Ideally I would watch my router, but watching two computers would be safer in my opinion. That way I could reboot one of the two watched computers without rebooting my Pi. I've tried adding two 'ping' entries to my watchdog.conf file, but if I reboot one of the watched computers my Pi still reboots. Is what I want to accomplish possible?
  • What is the best ping setting so that my Pi doesn't reboot if a watched computer is simply rebooted? What interval would minimize Pi reboots in that case?

Thanks!
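
One possible approach to the first question, offered as an untested sketch rather than a confirmed answer: the behaviour described above suggests that every `ping` entry has to answer, so "reboot only if all hosts are down" would need a custom check. The watchdog.conf man page documents a `test-binary` option that runs a user-supplied program and treats a non-zero exit as a failure. The function name `any_host_alive` and the `PING_CMD` hook below are invented for illustration.

```shell
#!/bin/sh
# any_host_alive HOST...
# Succeeds as soon as one host answers a single ping; fails only if none do.
# PING_CMD is a hook for testing; the default sends one ping and waits
# up to 2 seconds for the reply.
any_host_alive() {
    for host in "$@"; do
        if ${PING_CMD:-ping -c 1 -W 2} "$host" >/dev/null 2>&1; then
            return 0
        fi
    done
    return 1
}
```

A script containing this function and ending with a call such as `any_host_alive 192.168.0.1 192.168.0.2` (the second address is a placeholder) could then be referenced from /etc/watchdog.conf with a `test-binary = ` line in place of the `ping = ` entries. As with the ping test itself, it is worth trying this with run_watchdog=0 first to avoid a reboot loop.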
