paulv
Posts: 558
Joined: Tue Jan 15, 2013 12:10 pm
Location: Netherlands

(Is this) The correct ways to install the watchdog package?

Mon May 09, 2016 9:52 am

This post was concocted several years ago to show various ways to protect applications or the kernel from hang-ups. Such protection is required when you run a server application, a security camera or network-related devices.

For one of my applications, I needed a watchdog, so I turned to my own post (this one) that I did a number of years ago.
Fortunately, things have progressed since then, and a lot of information has been uncovered or excavated by many contributors. Instead of asking in the title whether the methods I used were right, we can now be more specific, which is why I adjusted the title of the post somewhat.

Because this post is still getting quite a few hits, I decided to redo this part completely. I hesitated earlier, because most if not all of the subsequent replies to this post will be made irrelevant. I did not want to detract from the work others did to help with this subject, but the whole description was becoming extremely confusing, even to me :shock:

So, here is a quick and concise summary of the various ways to use the watchdog functionality.
After all the trouble some of us went through to master the watchdog, it basically distilled down to three different methods.

These three methods cannot be combined, because each of them claims the /dev/watchdog device for itself.

The watchdog device is already activated at boot time for all three methods.
I tried Method 1 and Method 2, which are RPi specific, on an RPi Model 3B+ running Stretch, and on the RPi Model 4 running Buster. Both methods work fine on either RPi.

Method 1

The easy shell method is as follows:
With a little script, you can add protection for kernel and user-space program hang-ups.
You start that process by sending a period "." to /dev/watchdog. This will kick-off what I would call a keep-alive session. You, or your program now needs to continue to send a "." to the /dev/watchdog within a 15 second period. If you don't, the RPi will reboot automatically. You can send the character "V" to the device to cancel this process.

You can use the following command to test this out - watch out however, the RPi will reboot in 15 seconds if this is all you do! :

Code: Select all

sudo sh -c "echo '.' >> /dev/watchdog"
Every time you resend this command within a 15 second window, the watchdog counter will be reset. If you stop doing this or wait for more than 15 seconds, the timer expires and the RPi gets rebooted.
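The same keep-alive idea can be sketched in Python. This is only a minimal sketch, assuming Python 3 and assuming the driver honours the "V" character as described above; the function name keep_alive and its parameters are made up for this illustration. Unlike the shell command, it keeps the device open between writes instead of reopening it every time.

```python
import time


def keep_alive(device="/dev/watchdog", interval=10.0, kicks=3, sleep=time.sleep):
    """Write a keep-alive character `kicks` times, then cancel with 'V'.

    `device`, `interval`, `kicks` and `sleep` are illustrative parameters;
    the interval must stay below the 15 second hardware limit.
    """
    # Unbuffered binary mode, so every write reaches the device immediately.
    with open(device, "wb", buffering=0) as wd:
        for _ in range(kicks):
            wd.write(b".")   # reset the watchdog counter
            sleep(interval)  # wait less than the 15 second window
        wd.write(b"V")       # ask the driver to stop the countdown
```

Calling keep_alive() as root would kick the watchdog three times and then disarm it. A real program would loop for as long as it considers itself healthy.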

Creating and activating the following little script (from user sparky777) will protect the RPi against kernel hang-ups.

Code: Select all

#!/bin/bash
echo " Starting user level protection"
while :
   do
      sudo sh -c "echo '.' >> /dev/watchdog"
      sleep 14
   done
When this script gets installed by init.d or systemd at boot time, it most likely runs as root so there is no need to do the "sudo sh -c" trick, you can simply use "echo . >> /dev/watchdog" instead.
I took the easy way and installed it with cron. Just add
@reboot /home/pi/name-of-program
and reboot to install.

When this script runs, there is now protection for kernel related issues. This can be tested with the so called fork bomb.
Make sure the script runs.
Simply type the following sequence at a prompt and then hit return to launch the fork-bomb.

Code: Select all

: (  ){ : | : &  }; : 
The RPi will reboot in about 15 seconds.


Method 2
The second method with the same functionality can be obtained by using systemd.

To let systemd use the watchdog, and to turn it on, you need to edit the systemd configuration file.

Code: Select all

sudo nano /etc/systemd/system.conf 
and change the following line:
#RuntimeWatchdogSec=
to:
RuntimeWatchdogSec=10s
Fifteen seconds is the maximum the BCM hardware allows.
I also suggest you activate the shutdown period protection by removing the '#' in front of the next line.
ShutdownWatchdogSec=10min
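To double-check the edit, the two settings can be read back with a few lines of Python. This is a rough sketch, assuming Python 3 and assuming the options sit in the [Manager] section of system.conf; the helper names and the small unit table are made up for this illustration and cover only a few of the time suffixes systemd accepts.

```python
import configparser

BCM_MAX_SECONDS = 15  # hardware limit quoted in the post

# Only a few of the unit suffixes systemd understands; enough for this sketch.
UNITS = {"": 1, "s": 1, "sec": 1, "min": 60, "h": 3600}


def span_to_seconds(value):
    """Convert a simple time span such as '10', '10s' or '10min' to seconds."""
    value = value.strip()
    number = value.rstrip("abcdefghijklmnopqrstuvwxyz")
    unit = value[len(number):]
    return int(number) * UNITS[unit]


def watchdog_settings(conf_text):
    """Return the uncommented watchdog options from system.conf text, in seconds."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the CamelCase option names as written
    parser.read_string(conf_text)
    manager = parser["Manager"] if parser.has_section("Manager") else {}
    wanted = ("RuntimeWatchdogSec", "ShutdownWatchdogSec")
    return {
        key: span_to_seconds(manager[key])
        for key in wanted
        if key in manager and manager[key].strip()
    }
```

Feeding it the contents of /etc/systemd/system.conf and comparing the RuntimeWatchdogSec value with BCM_MAX_SECONDS shows quickly whether the chosen timeout fits the hardware.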

After a reboot, this will activate and reserve the watchdog device for systemd use. You can check the activation with :

Code: Select all

dmesg | grep watchdog
It should report something like this on an RPi 3B+ with Stretch:
[ 0.784298] bcm2835-wdt 3f100000.watchdog: Broadcom BCM2835 watchdog timer
[ 1.696537] systemd[1]: Hardware watchdog 'Broadcom BCM2835 Watchdog timer', version 0
[ 1.696628] systemd[1]: Set hardware watchdog to 10s.
systemd will now ping the hardware watchdog automatically every 10/2 = 5 seconds. If systemd fails to do so for 10 seconds, for example because the kernel has locked up, the RPi reboots.
This means that there is a default protection for kernel related issues. This can be tested with the so called fork bomb, see above.
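The dmesg check above can also be scripted. The sketch below, assuming Python 3, looks for the "Set hardware watchdog to" line shown in the sample output; the function name is made up for this illustration, and the exact wording of the log line may differ between systemd versions.

```python
import re

# Matches the confirmation line from the sample output in the post.
PATTERN = re.compile(r"Set hardware watchdog to (\w+)")


def systemd_watchdog_timeout(log_text):
    """Return the timeout systemd reported (e.g. '10s'), or None if absent."""
    match = PATTERN.search(log_text)
    return match.group(1) if match else None
```

If only the bcm2835-wdt driver line is present, the function returns None, which suggests that systemd did not take over the watchdog.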

If you want the user-space application protection capability, you have to use the systemd API within your program to do that. This is covered in a later post.

Method 3


The third method is not RPi specific and uses a rather large and sophisticated daemon package (pretty much legacy now) that lets you set many different conditions that will trigger a reboot of the RPi. After installation you can use

Code: Select all

man watchdog
for more information, or go here: https://linux.die.net/man/8/watchdog

The package needs to be installed first.

Code: Select all

sudo apt-get install watchdog
Because this is a widespread legacy package, I'm not going to cover it in depth here.
To set some of the parameters the watchdog daemon should watch :

Code: Select all

sudo nano /etc/watchdog.conf
For the fork bomb test I took away the "#" marks from the following lines:
# This is an optional test by pinging my router
ping=192.168.1.1
max-load-1 = 24
min-memory = 1
watchdog-device = /dev/watchdog
watchdog-timeout = 15
The last line is very important and RPi specific. If this line is not added, you get a rather cryptic error (run sudo systemctl status watchdog.service) :
cannot set timeout 60 (errno = 22 = 'Invalid argument')
This happens because the daemon falls back on the 60 second default timeout used on other Linux systems, while the RPi watchdog timer on the SoC only handles a maximum of 15 seconds. Unfortunately, this is a bug that the Foundation missed; the limit should have been handled in the kernel, or added by default in the watchdog.conf file.
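A quick way to catch this mistake before starting the daemon is to check the configuration file for the timeout line. The following is a small sketch, assuming Python 3 and the simple "key = value" layout shown above; the function names are made up for this illustration, and the default of 60 is taken from the error message. Repeated keys (several ping lines, for example) are not handled; the last one wins.

```python
HW_MAX_SECONDS = 15   # RPi limit quoted in the post
DAEMON_DEFAULT = 60   # value seen in the "cannot set timeout 60" error


def parse_watchdog_conf(text):
    """Collect the uncommented 'key = value' pairs from watchdog.conf text."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def effective_timeout(settings):
    """Timeout the daemon would request, falling back on its default."""
    return int(settings.get("watchdog-timeout", DAEMON_DEFAULT))


def timeout_ok(settings):
    """True when the requested timeout fits the RPi hardware limit."""
    return effective_timeout(settings) <= HW_MAX_SECONDS
```

Running timeout_ok(parse_watchdog_conf(open("/etc/watchdog.conf").read())) should return True once the watchdog-timeout = 15 line is in place.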


In a follow-up post I will show how to add extra support for your own (Python) application by using the systemd API and framework.

Enjoy!
Last edited by paulv on Sun Jul 07, 2019 9:56 am, edited 25 times in total.

paulv
Posts: 558
Joined: Tue Jan 15, 2013 12:10 pm
Location: Netherlands

Re: Is this the correct way to install the BCM watchdog?

Thu May 12, 2016 8:50 am

[updated]
This section has been made irrelevant by updates.
Last edited by paulv on Tue Jul 02, 2019 2:13 pm, edited 12 times in total.

paulv
Posts: 558
Joined: Tue Jan 15, 2013 12:10 pm
Location: Netherlands

Re: Is this the correct way to install the BCM watchdog?

Fri May 13, 2016 7:59 am

This is a follow-up to the first post.

This post has been made irrelevant by updates
Last edited by paulv on Tue Jul 02, 2019 2:14 pm, edited 4 times in total.

paulv
Posts: 558
Joined: Tue Jan 15, 2013 12:10 pm
Location: Netherlands

Re: Is this the correct way to install the BCM watchdog?

Fri May 13, 2016 2:25 pm

If you want to use the systemd method of using a software watchdog to add control to your own application program, you can use the following method to implement that.

As I showed in the first post, you use the hardware BCM watchdog system to reboot the RPi when the kernel gets unresponsive, or when systemd is no longer operational.

A higher level of control can be added by a software watchdog. Systemd provides that, plus an interface (API) to implement that.
The combination of the two provides the Supervisor chain (in systemd speak).

OK, so what do you need to do?
There are two steps.

1. You need to provide a service configuration file for systemd to instruct it what to do.
2. You need to add a few things to your own application to make it all work in this environment.

In essence, you are going to ask systemd to initiate a software watchdog, and your application needs to "ping" it at regular intervals. If the application fails to do that, systemd will take action and can ultimately reboot the RPi.
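How often should the application ping? systemd passes the configured timeout to the service in the WATCHDOG_USEC environment variable (in microseconds), and a common choice is to ping at half that interval. Here is a minimal sketch, assuming Python 3; the function name kick_interval is made up for this illustration.

```python
import os


def kick_interval(env=os.environ):
    """Return the ping interval in seconds, or None when no watchdog is requested."""
    raw = env.get("WATCHDOG_USEC")
    if not raw or int(raw) <= 0:
        return None
    # Microseconds to seconds, then half of it to leave some headroom.
    return int(raw) / 1_000_000 / 2
```

With WatchdogSec=10s in the service file, this gives a ping every 5 seconds.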

I wrote a service file that will let you test a number of elements.

Code: Select all

# This service installs a python test program that allows us to test the
# systemd software watchdog. This watchdog can be used to protect from hangups.
# On top of that, when the service crashes, it is automatically restarted.
# If it crashes too many times, it will be forced to fail, or you can let systemd reboot
#

[Unit]
Description=Installing Python test script for a systemd s/w watchdog
Requires=basic.target
After=multi-user.target

[Service]
Type=notify
WatchdogSec=10s
ExecStart=/usr/bin/python /home/pi/systemd-test.py
Restart=always

# The number of times the service is restarted within a time period can be set
# If that condition is met, the RPi can be rebooted
#
StartLimitBurst=4
StartLimitInterval=180s
# actions can be none|reboot|reboot-force|reboot-immediate
StartLimitAction=none

# The following are defined in the /etc/systemd/system.conf file and are
# global for all services
#
#DefaultTimeoutStartSec=90s
#DefaultTimeoutStopSec=90s
#
# They can also be set on a per-service basis here:
# if they are not defined here, they fall back to the system.conf values
TimeoutStartSec=2s
TimeoutStopSec=2s

[Install]
WantedBy=multi-user.target
Details can be found in the systemd.service(5) man page.
I also wrote a Python script that lets you play with this system and experiment to your heart's delight.

Code: Select all

#!/usr/bin/python2.7
#-------------------------------------------------------------------------------
# Name:        systemd daemon & watchdog test file
# Purpose:
#
# Author:      paulv
#
# Created:     07-05-2016
# Copyright:   (c) paulv 2016
# Licence:     <your licence>
#-------------------------------------------------------------------------------

import sys
import os
from time import sleep
import signal
import subprocess
import socket

init = True

def sd_notify(unset_environment, s_cmd):

    """
    Notify service manager about start-up completion and to kick the watchdog.

    https://github.com/kirelagin/pysystemd-daemon/blob/master/sddaemon/__init__.py

    This is a reimplementation of systemd's reference sd_notify().
    sd_notify() should be used to notify the systemd manager about the
    completion of the initialization of the application program.
    It is also used to send watchdog ping information.

    """
    global init

    sock = None

    try:
        if not s_cmd:
            sys.stderr.write("error : missing s_cmd\n")
            return(1)

        s_adr = os.environ.get('NOTIFY_SOCKET', None)
        if init : # report this only one time
            sys.stderr.write("Notify socket = " + str(s_adr) + "\n")
            # this will normally return : /run/systemd/notify
            init = False

        if not s_adr:
            sys.stderr.write("error : missing socket\n")
            return(1)

        sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        # sendto() returns the number of bytes sent; send only once and
        # treat a return value of 0 as a failure
        if sock.sendto(s_cmd, s_adr) == 0:
            sys.stderr.write("error : incorrect sock.sendto  return value\n")
            return(1)
    except Exception as e:
        # report the failure instead of silently returning OK
        sys.stderr.write("error : sd_notify() failed : " + str(e) + "\n")
        return(1)
    finally:
        # terminate the socket connection
        if sock:
            sock.close()
        if unset_environment:
            if 'NOTIFY_SOCKET' in os.environ:
                del os.environ['NOTIFY_SOCKET']
    return(0) # OK


def sig_handler (signum=None, frame = None):
    """
    This function will catch the most important system signals, but NOT a shutdown!
    During testing, you can use this code to see what termination methods are used or filter
    some out.

    This handler catches the following signals from the OS:
        SIGHUP = (1) SSH Terminal logout
        SIGINT = (2) Ctrl-C
        SIGQUIT = (3) ctrl-\
        IOerror = (5) when terminating the SSH connection (input/output error)
        SIGTERM = (15) Daemon terminate (daemon --stop): is coming from daemon manager
    However, it cannot catch SIGKILL = (9), the kill -9 or the shutdown procedure
    """

    try:
        print "\nSignal handler called with signal : {0}".format(signum)
        if signum == 1 :
            sys.stderr.write("Sighandler: ignoring SIGHUP signal : " + str(signum) + "\n")
            return # ignore SSH logout termination
        sys.stderr.write("terminating : python test script\n")
        sys.exit(1)

    except Exception as e: # IOerror 005 when terminating the SSH connection
        sys.stderr.write("Unexpected Exception in sig_handler() : "+ str(e) + "\n")
        subprocess.call(['logger "Unexpected Exception in sig_handler()"'], shell=True)
        return

def main():

    # setup a catch for the following termination signals: (signal.SIGINT = ctrl-c)
    for sig in (signal.SIGTERM, signal.SIGINT, signal.SIGHUP, signal.SIGQUIT):
        signal.signal(sig, sig_handler)

    # get the timeout period from the systemd-test.service file
    wd_usec = os.environ.get('WATCHDOG_USEC', None)
    # the environment value is a string, so compare against "0", not 0
    if not wd_usec or wd_usec == "0":
        sys.stderr.write("terminating : incorrect watchdog interval sequence\n")
        exit(1)

    wd_usec = int(wd_usec)
    # use half the time-out value in seconds for the kick-the-dog routine to
    # account for Linux housekeeping chores
    wd_kick = wd_usec / 1000000 / 2
    sys.stderr.write("watchdog kick interval = " + str(wd_kick) + "\n")

    try:
        sys.stderr.write("starting : python daemon watchdog and fail test script started\n")
        # notify systemd that we've started
        retval = sd_notify(0, "READY=1")
        if retval != 0:
            sys.stderr.write("terminating : fatal sd_notify() error for script start\n")
            exit(1)

        # after the init, ping the watchdog and check for errors
        retval = sd_notify(0, "WATCHDOG=1")
        if retval != 0:
            sys.stderr.write("terminating : fatal sd_notify() error for watchdog ping\n")
            exit(1)

        ctr = 0 # setup a counter to initiate a watchdog fail
        while True :
            if ctr > 5 :
                sys.stderr.write("forcing watchdog fail, restarting service\n")
                sleep(20)

            sleep(wd_kick)
            sys.stderr.write("kicking the watchdog : ctr = " + str(ctr) + "\n")
            sd_notify(0, "WATCHDOG=1")
            ctr += 1


    except KeyboardInterrupt:
        print "\nTerminating by Ctrl-C"
        exit(0)


if __name__ == '__main__':
    main()
The comments should give you an idea of what is needed. In a nutshell, the application needs to signal systemd that it has finished the initialization. At regular intervals, the software watchdog is updated. There is a fail condition in the code above that will mimic a hung application.
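The listing above targets Python 2.7. For readers on Python 3, the notification itself can be sketched as follows. This is a minimal sketch rather than a drop-in replacement: the optional env parameter is made up to ease testing, and the handling of a leading "@" (abstract socket names) is an addition that the listing above does not have.

```python
import os
import socket


def sd_notify(message, env=os.environ):
    """Send one datagram to the socket named in NOTIFY_SOCKET.

    Returns True when something was sent, False when the variable is
    missing (for example when not started by systemd) or nothing went out.
    """
    address = env.get("NOTIFY_SOCKET")
    if not address:
        return False
    if address.startswith("@"):
        # A leading '@' denotes a Linux abstract socket name.
        address = "\0" + address[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sent = sock.sendto(message.encode("ascii"), address)
    return sent > 0
```

A service would call sd_notify("READY=1") once after its initialization and sd_notify("WATCHDOG=1") at regular intervals afterwards.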

Here is how you install and test this all.
Open an editor:

Code: Select all

nano systemd-test.service
Copy and paste the service code above into the editor. Save the file and close the editor. Copy this file into the systemd structure with :

Code: Select all

sudo cp systemd-test.service /etc/systemd/system
Open an editor again:

Code: Select all

nano systemd-test.py
Copy and paste the Python code above into the editor. Save the file and close the editor. Make the python script executable :

Code: Select all

chmod +x systemd-test.py
Run the service script in the systemd environment :

Code: Select all

sudo systemctl start systemd-test
Watch what is going on with

Code: Select all

tail -f /var/log/syslog
After 4 failures and automatic restarts of the python script, systemd declares it a failed state. You can also let the RPi reboot when this happens and all you need to do is to change StartLimitAction=none to StartLimitAction=reboot in the systemd-test.service file.
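The interplay of StartLimitBurst and StartLimitInterval can be pictured with a small model. This is a simplified illustration written for this post, assuming Python 3, and not systemd's actual code: it allows up to "burst" starts inside one time window and refuses further starts until the window has passed.

```python
class StartLimit:
    """Toy model of a start rate limit: `burst` starts per `interval` seconds."""

    def __init__(self, burst, interval):
        self.burst = burst
        self.interval = interval
        self.window_start = None
        self.count = 0

    def allow(self, now):
        """Return True if a (re)start at time `now` (in seconds) is permitted."""
        if self.window_start is None or now - self.window_start > self.interval:
            # First start ever, or the old window has expired: open a new one.
            self.window_start = now
            self.count = 1
            return True
        if self.count < self.burst:
            self.count += 1
            return True
        return False  # limit hit: this is where StartLimitAction would apply
```

With StartLimit(4, 180) and a restart roughly every 40 seconds, the first four starts are allowed and the fifth is refused, which matches the failed state described above.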

If you would like to test the application within the boot process, run this :

Code: Select all

sudo systemctl enable systemd-test
After a reboot, you can again watch it all by using the above tail command again.
If you decide to change the Python script, you can do that while the system is running. At the next restart, the new code is automatically loaded and executed. If you want to change parameters in the .service file, you can do that too, but you need to activate and reload those changes. You do that with

Code: Select all

sudo systemctl daemon-reload
and then

Code: Select all

sudo systemctl restart systemd-test
I had great fun discovering all the possibilities systemd now offers me to add better control to my own scripts.

Please chime in if you have improvements or suggestions!

Enjoy!
Last edited by paulv on Tue Jul 02, 2019 2:17 pm, edited 1 time in total.

paulv
Posts: 558
Joined: Tue Jan 15, 2013 12:10 pm
Location: Netherlands

Re: Is this the correct way to install the watchdog package?

Wed May 18, 2016 7:55 pm

I have not received any comments or replies to this post, nor on a request on stackoverflow.
http://stackoverflow.com/questions/3714 ... th-systemd

I filed a bug report hoping that will lead to a solution or clarification.
https://bugs.launchpad.net/raspbian/+bug/1582707

sskn
Posts: 1
Joined: Sun Jun 05, 2016 5:14 am

Re: Is this the correct way to install the watchdog package?

Sun Jun 05, 2016 5:20 am

Thanks a lot!
I successfully activated the watchdog timer the simple way (2nd post).
I was confused because I couldn't install and activate the watchdog package using the method for the old version of Raspbian.

ejolson
Posts: 3415
Joined: Tue Mar 18, 2014 11:47 am

Re: Is this the correct way to install the watchdog package?

Mon Jun 06, 2016 8:08 am

paulv wrote:I have not received any comments or replies to this post, nor on a request on stackoverflow.
http://stackoverflow.com/questions/3714 ... th-systemd

I filed a bug report hoping that will lead to a solution or clarification.
https://bugs.launchpad.net/raspbian/+bug/1582707
This is a very interesting post. While reading the first part one starts to sympathize with the graybeards who claim systemd is trying to take over the world and ruin it. The second post then points out how nicely systemd directly interfaces with the watchdog hardware.

Eric Raymond said, "...it looks like Plan 9 failed simply because it fell short of being a compelling enough improvement on Unix to displace its ancestor." At this point in time systemd has replaced the System V init daemon in almost all major Linux distributions. Hopefully the resulting improvement will be worth the hardships caused.

torekk
Posts: 12
Joined: Mon May 16, 2016 1:07 am
Contact: ICQ

Re: Is this the correct way to install the watchdog package?

Tue Jun 07, 2016 6:09 pm

Thank you for this.

However, I noticed that when I shut down the Pi manually, it doesn't reboot itself after 10 seconds. Or is that what the Shutdown setting is for, seeing as that's set to 10 minutes?

Anyways that's actually what I wanted, watchdog to reboot the Pi once it freezes, but not if I shut it down manually.

onefastt997
Posts: 1
Joined: Sat Sep 03, 2016 1:51 am

Re: Is this the correct way to install the watchdog package?

Sat Sep 03, 2016 1:53 am

Thanks for this. It doesn't power back up when I do sudo poweroff but it definitely recovers from the fork bombs.

Samweis
Posts: 1
Joined: Mon Sep 26, 2016 8:07 am

Re: Is this the correct way to install the watchdog package?

Mon Sep 26, 2016 8:21 am

Thanks for this!

For whatever reason the file /etc/systemd/system/watchdog.service.d/local.conf did not work for me,
but

Code: Select all

ln -s /lib/systemd/system/watchdog.service /etc/systemd/system/multi-user.target.wants/
did the trick.

paulv
Posts: 558
Joined: Tue Jan 15, 2013 12:10 pm
Location: Netherlands

Re: Is this the correct way to install the watchdog package?

Mon Sep 26, 2016 5:21 pm

Hi Samweis,

This is most likely the appropriate way of getting the watchdog to behave with systemd.
I found that out using another package, so this is probably the way to go.

Thanks for contributing!

Paul

shalvan
Posts: 4
Joined: Sat Oct 01, 2016 3:11 pm

Re: Is this the correct way to install the watchdog package?

Sun Oct 02, 2016 5:25 am

Hello there,

I have a problem with watchdog on a Raspberry Pi B+ 512 MB with a fresh Jessie install.
I did everything like you did, but when I run the command:

Code: Select all

 sudo systemctl start watchdog
nothing happens

When I was using Wheezy, watchdog worked like a charm, and now it doesn't :(

paulv
Posts: 558
Joined: Tue Jan 15, 2013 12:10 pm
Location: Netherlands

Re: Is this the correct way to install the watchdog package?

Sun Oct 02, 2016 12:50 pm

shalvan,

You're not very specific.
What does the status report tell you? Can you post that?

shalvan
Posts: 4
Joined: Sat Oct 01, 2016 3:11 pm

Re: Is this the correct way to install the watchdog package?

Sun Oct 02, 2016 1:02 pm

this is my status:

Code: Select all

$ sudo systemctl status watchdog
● watchdog.service - watchdog daemon
   Loaded: loaded (/lib/systemd/system/watchdog.service; enabled)
  Drop-In: /etc/systemd/system/watchdog.service.d
           └─local.conf
   Active: inactive (dead)
when i start with :

Code: Select all

sudo systemctl start watchdog
All I see is a blinking cursor one line below my command and nothing else happens. It is as if the command is trying to do something, but nothing comes of it.

paulv
Posts: 558
Joined: Tue Jan 15, 2013 12:10 pm
Location: Netherlands

Re: Is this the correct way to install the watchdog package?

Sun Oct 02, 2016 1:14 pm

Are you on the same software version that I used to report my post?
Linux raspi-svr 4.4.9+ #884 Fri May 6 17:25:37 BST 2016 armv6l GNU/Linux
If you are on a newer version, you can thank the powers that be for yet another change of the goalposts, and I can't help you at this moment. Maybe somebody else already knows what changed this time...

Note that nobody from the Foundation chimed in to help with my post or clarify what is going on. :(

Good luck!

shalvan
Posts: 4
Joined: Sat Oct 01, 2016 3:11 pm

Re: Is this the correct way to install the watchdog package?

Sun Oct 02, 2016 1:23 pm

My version is:

Code: Select all

Linux RasPi 4.4.17+ #902 Mon Aug 15 12:17:32 BST 2016 armv6l GNU/Linux
but I have been experiencing this problem since February, which is when I switched to Jessie.

gkoper
Posts: 13
Joined: Sun Mar 10, 2013 2:53 pm

Re: Is this the correct way to install the watchdog package?

Tue Oct 11, 2016 5:19 pm

Dear paulv

Thanks for your extensive description of the results of your research. Impressive!

I did some tests on a more recent version of Raspbian, the version is Jessie Lite

Code: Select all

Linux TestPi2 4.4.21+ #911 Thu Sep 15 14:17:52 BST 2016 armv6l GNU/Linux
As you already surmised, things would be different for later versions and indeed they are!

First of all, the watchdog device is already loaded from the start, although there is no entry in the /boot/config.txt file.

I had a preference for the second method using the systemd. The entries could indeed be set in system.conf file and after booting one could find the watchdog records in the syslog. So far so good. However, neither the simple test involving poweroff nor a fork bomb would make the watchdog bark and cause a reboot! I did find out, that the /dev/watchdog was allocated after setting a nonzero RuntimeWatchdogSec in system.conf though: apparently something had happened! Since I did not have a clue how to proceed, I left this issue alone.

I briefly turned to the first option, using the watchdog package, but I stopped as it would not serve my purpose. I would like to monitor a program I wrote to continuously send weather reports to wunderground.com. It already runs for weeks without problems. However, sometimes Raspbian causes a problem that hangs the program, or there is a hardware issue. Both are typically resolved by rebooting, but so far that had to be done manually.

I found out that the method (http://binerry.de/post/28263824530/rasp ... hdog-timer) described by Binerry actually can be used still despite being 4 years old. The only requirement is to have free access to /dev/watchdog so that means that in system.conf the parameter RuntimeWatchdogSec should be set to zero.

Considering your experience in getting more information on Raspbian issues, together with my own ("this may be resolved in a later version ..."), I think that in the short run the Binerry method is the preferred one. Nevertheless, I would like to know how to deal properly with the "second" method involving systemd. This is the standard Linux method of dealing with the watchdog, but maybe it is still under construction? I hope this invites some useful comments.

gkoper

paulv
Posts: 558
Joined: Tue Jan 15, 2013 12:10 pm
Location: Netherlands

Re: Is this the correct way to install the watchdog package?

Tue Oct 11, 2016 7:02 pm

Hi gkoper,

Thanks for contributing to this post.
My own experience, after several years of working with the RPi's, is that the RPi environment is still very much a work in progress (no, that's a bit too generous, let's call it process). As other users regularly comment, the goal posts seem to get moved just about every update. What works with one release may not work with the next, and the forums are full of users that are left dangling.

Alas, back to the topic. I have found, through another project of mine, that the watchdog is initiated automatically by systemd as soon as you invoke the following in a service setup, as an example:

Code: Select all

[Service]
# Add restart options
Restart=always
RestartSec=5
StartLimitBurst=4
StartLimitInterval=180s

# Add optional rebooting options:
StartLimitAction=reboot
This makes sense, but unfortunately, this is not mentioned anywhere, at least not as far as I could find, but that's not much of a guarantee either.

I guess we have to wait until the fog clears, or when enough users start to complain to raise the issue.

If that sounds skeptical, it is caused by the growing frustration of working with the RPi's for several years now and contributing many posts and designs that continuously need tweaking, fixing and updating. As a result, I switched platforms and interest levels and so the Pi has taken a back-seat.

Good luck!
Last edited by paulv on Mon Jan 09, 2017 10:48 am, edited 1 time in total.

gkoper
Posts: 13
Joined: Sun Mar 10, 2013 2:53 pm

Re: Is this the correct way to install the watchdog package?

Wed Oct 12, 2016 7:43 am

Thanks paulv

I will have a look later.

Regarding any support, please note that upon startup Raspbian says:

Code: Select all

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
In other words, no promises made.

gkoper

no3rpi
Posts: 13
Joined: Fri Mar 31, 2017 11:44 am

Re: Is this the correct way to install the watchdog package?

Fri Mar 31, 2017 12:12 pm

paulv I want to thank you for investigating and solving this really important issue with watchdog ( at least for me ).

I am using an RPi 3 as a private email server and CCTV monitor, headless in a remote location. Unfortunately, in the 2 years until now I have had several incidents where it stopped responding to any network traffic (ssh, tcp, udp ...) and had to be hard reset by hand, despite using monit and other redundant solutions that should allow me to access it remotely.

Today after last incident I implemented your solution from second post and I want to confirm that until now it seems to work ok for this raspberry pi 3:

Code: Select all

uname -a:
Linux rpi3 4.4.50-v7+ #970 SMP Mon Feb 20 19:18:29 GMT 2017 armv7l GNU/Linux
/opt/vc/bin/vcgencmd version:
Mar  3 2017 13:43:37 
Copyright (c) 2012 Broadcom
version 9ae30f71c7ef4239e9d5b56346c0842f3ef56736 (clean) (release)
I edited:

Code: Select all

/etc/systemd/system.conf with:
RuntimeWatchdogSec=10
ShutdownWatchdogSec=10min
reboot and tested: cat /var/log/syslog | grep watchdog

Code: Select all

Mar 30 11:17:05 rpi3 kernel: [    8.096998] bcm2835-wdt 3f100000.watchdog: Broadcom BCM2835 watchdog timer
Mar 31 10:06:52 rpi3 kernel: [    6.056467] bcm2835-wdt 3f100000.watchdog: Broadcom BCM2835 watchdog timer
Mar 31 10:06:52 rpi3 systemd[1]: Hardware watchdog 'Broadcom BCM2835 Watchdog timer', version 0
Mar 31 10:06:52 rpi3 systemd[1]: Set hardware watchdog to 10s.
Mar 31 10:06:53 rpi3 systemd[1]: Hardware watchdog 'Broadcom BCM2835 Watchdog timer', version 0
Mar 31 10:06:53 rpi3 systemd[1]: Set hardware watchdog to 10s.
Mar 31 10:06:53 rpi3 kernel: [    6.797667] bcm2835-wdt 3f100000.watchdog: Broadcom BCM2835 watchdog timer
Tested with your fork bomb script and the rpi3 restarts OK;
for the moment I have not tested whether it will be restarted automatically after "shutdown -h now".

I hope the watchdog solution will solve the hang-up problem from now on...

thank you.

paulv
Posts: 558
Joined: Tue Jan 15, 2013 12:10 pm
Location: Netherlands

Re: Is this the correct way to install the watchdog package?

Fri May 19, 2017 3:10 pm

This post was made irrelevant by updates
Last edited by paulv on Tue Jul 02, 2019 2:18 pm, edited 7 times in total.

no3rpi
Posts: 13
Joined: Fri Mar 31, 2017 11:44 am

Re: Is this the correct way to install the watchdog package?

Fri May 19, 2017 3:49 pm

Yes it is true and really bad news.

Code: Select all

uname -a
Linux rpi3 4.9.24-v7+ #993 SMP Wed Apr 26 18:01:23 BST 2017 armv7l GNU/Linux

Code: Select all

cat /var/log/syslog | grep watchdog
nothing to display from log... $#&!

no3rpi
Posts: 13
Joined: Fri Mar 31, 2017 11:44 am

Re: Is this the correct way to install the watchdog package?

Mon May 22, 2017 1:41 pm

@paulv
Editing a previous post is not a good idea if you add new or updated info, because there is no notification for users following this thread.
At least please add a final message saying which post was updated if you discovered or changed something important.

By chance I re-read this whole thread today and found that I also need to have the watchdog activated in config.txt, besides the /etc/systemd/system.conf file that was all I had used until now...

now I can see in log:

Code: Select all

cat syslog | grep watchdog
May 22 16:20:58 rpi3 kernel: [    0.813864] bcm2835-wdt 3f100000.watchdog: Broadcom BCM2835 watchdog timer
thank you.

brunohpg
Posts: 1
Joined: Sat Jun 03, 2017 7:22 pm

Re: Is this the correct way to install the watchdog package?

Sat Jun 03, 2017 11:12 pm

Hello,

Thanks @paulv for your research and for posting the results.
I was able to start the watchdog successfully. But now I have some (new) problems.

First, I did all the tests in the version:

Code: Select all

Linux raspberrypi 4.4.50-v7+ #970 SMP Mon Feb 20 19:18:29 GMT 2017 armv7l GNU/Linux
Almost everything worked as expected. Messages were logged and the fork bomb triggered a reboot. I thought: Great! I have the solution!

So I enabled the systemd watchdog on:

Code: Select all

Linux raspberrypi 4.9.24-v7+ #993 SMP Wed Apr 26 18:01:23 BST 2017 armv7l GNU/Linux
Following these steps:

1º Configure /boot/config.txt by adding:

Code: Select all

# activating the hardware watchdog
dtparam=watchdog=on
2º Configure /etc/systemd/system.conf by changing:

Code: Select all

RuntimeWatchdogSec=10
ShutdownWatchdogSec=5min
Only that! It works. At least partially.
But not all things worked as expected.

The messages are no longer logged. There is only one message in the log:

Code: Select all

Jun  3 18:48:46 raspberrypi kernel: [    0.831289] bcm2835-wdt 3f100000.watchdog: Broadcom BCM2835 watchdog timer
The system still reboots after a fork bomb, but it is slow to react.
When the fork bomb is run by root, the system reboots (properly).
When the fork bomb is run by the pi user, the watchdog does not catch it. I get a lot of messages like:

Code: Select all

./forkbomb.sh: fork: Cannot allocate memory
./forkbomb.sh: fork: Cannot allocate memory
./forkbomb.sh: fork: Cannot allocate memory
./forkbomb.sh: fork: Cannot allocate memory
./forkbomb.sh: fork: Cannot allocate memory
./forkbomb.sh: fork: Cannot allocate memory
./forkbomb.sh: fork: Cannot allocate memory
./forkbomb.sh: fork: Cannot allocate memory
./forkbomb.sh: fork: Cannot allocate memory
./forkbomb.sh: xmalloc: .././array.c:581: cannot allocate 24 bytes (40960 bytes  


The system restarted after a few minutes, but I'm afraid it will not always work.
These messages also appear when using the watchdog package. However, I believe the watchdog package's memory limit test will reboot the device first.

So I went on to install the watchdog package, with these steps:

1º Enabling watchdog in /boot/config.txt by adding:

Code: Select all

# activating the hardware watchdog
dtparam=watchdog=on
2º Installing the watchdog package:

Code: Select all

sudo apt-get install watchdog
3º Modifying /etc/watchdog.conf (the other lines remain unchanged):

Code: Select all

max-load-1              = 24
min-memory              = 1
watchdog-device = /dev/watchdog
watchdog-timeout=15
4º Linking the service to start on boot (thanks to @Samweis):

Code: Select all

ln -s /lib/systemd/system/watchdog.service /etc/systemd/system/multi-user.target.wants/
5º Rebooting the Raspberry Pi.

That is it!
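
Before rebooting, it can be worth checking that the four settings from step 3 really are active in the file. This is only a sketch: check_watchdog_conf is a hypothetical helper, and the key list is simply the four lines shown above, not a complete list of what the watchdog daemon supports.

```shell
#!/bin/sh
# Hypothetical helper: report which of the four settings from step 3 are
# missing (or still commented out) in the given watchdog.conf-style file.
# Returns 0 when all four are present, 1 otherwise.
check_watchdog_conf() {
    conf=$1
    missing=0
    for key in max-load-1 min-memory watchdog-device watchdog-timeout; do
        # Accept both "key = value" and "key=value" spellings.
        if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$conf"; then
            echo "missing: $key"
            missing=1
        fi
    done
    return "$missing"
}

# Typical use:
#   check_watchdog_conf /etc/watchdog.conf && echo "all four settings present"
```

After the reboot, systemctl is-active watchdog is a quick way to see whether the daemon came up.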

And now I have:

Code: Select all

root@raspberrypi:/home/pi# ps -Af | grep watchdog
root        36     2  0 19:45 ?        00:00:00 [watchdogd]
root       822     1  0 19:45 ?        00:00:00 /usr/sbin/watchdog
root       902   884  0 19:46 pts/1    00:00:00 grep watchdog
And...

Code: Select all

root@raspberrypi:/home/pi# sudo systemctl status watchdog
● watchdog.service - watchdog daemon
   Loaded: loaded (/lib/systemd/system/watchdog.service; enabled)
   Active: active (running) since Sáb 2017-06-03 19:45:26 -03; 3min 30s ago
  Process: 820 ExecStart=/bin/sh -c [ $run_watchdog != 1 ] || exec /usr/sbin/watchdog $watchdog_options (code=exited, status=0/SUCCESS)
  Process: 815 ExecStartPre=/bin/sh -c [ -z "${watchdog_module}" ] || [ "${watchdog_module}" = "none" ] || /sbin/modprobe $watchdog_module (code=exited, status=0/SUCCESS)
 Main PID: 822 (watchdog)
   CGroup: /system.slice/watchdog.service
           └─822 /usr/sbin/watchdog

Jun 03 19:45:26 raspberrypi watchdog[822]: int=1s realtime=yes sync=no soft=no mla=24 mem=1
Jun 03 19:45:26 raspberrypi watchdog[822]: ping: no machine to check
Jun 03 19:45:26 raspberrypi watchdog[822]: file: no file to check
Jun 03 19:45:26 raspberrypi watchdog[822]: pidfile: no server process to check
Jun 03 19:45:26 raspberrypi watchdog[822]: interface: no interface to check
Jun 03 19:45:26 raspberrypi watchdog[822]: temperature: no sensors to check
Jun 03 19:45:26 raspberrypi watchdog[822]: test=none(0) repair=none(0) alive=/dev/watchdog heartbeat=none to=root no_act=no force=no
Jun 03 19:45:26 raspberrypi watchdog[822]: watchdog now set to 15 seconds
Jun 03 19:45:26 raspberrypi watchdog[822]: hardware watchdog identity: Broadcom BCM2835 Watchdog timer
Jun 03 19:45:26 raspberrypi systemd[1]: Started watchdog daemon.
A fork bomb now reboots the device within a few seconds.
I believe it's better than the systemd watchdog. :D

I do not think there's anything really new here. Just a few considerations and a step-by-step guide to starting the watchdog on the newer kernel versions.

I'm using a Raspberry Pi 3.

paulv
Posts: 558
Joined: Tue Jan 15, 2013 12:10 pm
Location: Netherlands

Re: Is this the correct way to install the watchdog package?

Sun Jun 04, 2017 6:57 pm

Hi brunohpg,

Thanks for taking the time to report your findings.
I can confirm your systemd findings; mine are exactly the same.
Something got broken somewhere, although it seems the watchdog package still works.
I hope a fix can be found and communicated.

Let me point out to others reading this post that the two systems are not compatible. The hardware watchdog device is strictly single-user, and it cannot be shared between the native systemd method and the watchdog package. They are different systems.
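
A rough way to spot a system where both have been set up by accident is sketched below. The helper name watchdog_conflict is made up, and the check is deliberately simple: it assumes a non-zero RuntimeWatchdogSec value starts with a digit from 1 to 9, and that the watchdog package was enabled through the symlink described earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when both mechanisms look configured.
#   $1 = systemd config file (normally /etc/systemd/system.conf)
#   $2 = path where the watchdog.service link would live
watchdog_conflict() {
    # An uncommented, non-zero RuntimeWatchdogSec means systemd wants the device...
    grep -Eq '^[[:space:]]*RuntimeWatchdogSec[[:space:]]*=[[:space:]]*[1-9]' "$1" \
        && [ -e "$2" ]   # ...and an existing link means the daemon wants it too.
}

# Typical use:
#   if watchdog_conflict /etc/systemd/system.conf \
#        /etc/systemd/system/multi-user.target.wants/watchdog.service; then
#       echo "systemd and the watchdog package are both set to use /dev/watchdog"
#   fi
```

If it reports a conflict, pick one method and disable the other, since only one of them can hold /dev/watchdog.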
