paulv
Posts: 558
Joined: Tue Jan 15, 2013 12:10 pm
Location: Netherlands

Failsafe(r) use of GPIO pin driving critical applications

Wed Jun 12, 2013 1:56 pm

I'm using my Pi as a web-based thermostat in my second home, on another continent. It is supposed to keep the temperatures in the house above a minimum, and below a maximum (it's in a hot climate!)

Needless to say, when the pi crashes while the HVAC system is heating or cooling, it may continue to do so until (hopefully) a fail safe in the HVAC system itself stops the process. This could cause some serious problems or damage, and I'm sure there are other applications where this is not desired.

Luckily, I stumbled on a comment in another post that looked like just what I was after, and I decided to try to implement that solution.
The basic idea is that when Debian or the CPU of the Pi crashes, a GPIO pin may get stuck.
By using PWM pulses as a kind of heartbeat, a crash of the CPU or Debian should stop the pulses. I cannot find verification of that, but it seems very logical. Anyway, by using the PWM feature of the GPIO drivers to drive a relay through some additional hardware, we can hopefully be sure that the relay will not get stuck in one position because the GPIO pin driving it is stuck.

The original post I found is : http://www.raspberrypi.org/phpBB3/viewtopic.php?f=44&t=45365&p=359936&hilit=ltspice#p359936
Tage, the contributor, put me on the right track, so I started to implement his solution.

I used his schematic as a starting point, while learning LTSpice myself. Unfortunately, I cannot use SMD components in my application, and the p-channel MOSFET he is using, the BSS84, or an equivalent, is not available in a through-hole package. I searched high and low and could only find an alternative in the LP0701N3-G, and I could only locate that at Mouser.com. They are now on order, but my interest was piqued and I did not want to wait.

Instead of using the p-channel MOSFET in what I would call a voltage level design, I changed the circuit to a current level design with a simple PNP transistor, the 2N3906.

The LTSpice results are here:
GPIO-PWM-Failsafe.jpg
(use your magnifier settings to view the picture. The limit on file sizes in posts is a real pain!!!)
Dark blue trace is the output, green is the base of Q2 and light blue is the collector of Q1. (Note that I used a VCC of 4.7 V to test how the circuit behaves at lower voltages.)

The resulting output has dips in it, but that is only because I adjusted the values in LTSpice to the real parts I used; the dips are mostly due to tolerances in the hFE of the 2N3906 and in other components. The parts you may have to adjust to get the best results are C2 (470 nF) and R3 (12 kΩ). Note that the base current of the 2N3906 charges C2 through R3, not really through R2 as in Tage's voltage-level (BSS84) solution. The relay load is also a determining factor for the base current. I use a very tiny relay, the V23026A1001B201, which has a coil resistance of 370 Ω for the 5 V version. I could not find the inductance value, so I just put in 100 mH, but it did not matter much for the simulation.
The scope picture is taken from the real circuit, again using the values that are in the simulation schematic.
Scope.jpg
This is the actual result
Top trace (A) is from the output, the bottom trace (B) is taken from the right side of R1.
(a high voltage on the output (A) means that the relay is powered, a low means it's off)
I happened to use the following pins on the GPIO connector to power my circuit: VCC (5V) is pin 2 on the GPIO connector, GND is pin 9 and the PWM pulses are coming from GPIO-22 on pin 15 straight to R1.

To get this output, I used the following Python code on the Pi:

Code: Select all

#!/usr/bin/env python2.7

import RPi.GPIO as GPIO
from time import sleep

PIN = 22 # GPIO pin 22 = pin 15
GPIO.setmode(GPIO.BCM)
GPIO.setup(PIN, GPIO.OUT)
HVAC = GPIO.PWM(PIN, 100)   # software PWM on GPIO 22 at 100 Hz

HVAC.start(50)   # 50% duty cycle (0 would be a constant low level)
sleep(0.1)       # let the PWM run for 0.1 s
#raw_input('Press return to stop:')   # use raw_input for Python 2
HVAC.stop()

GPIO.cleanup()
During testing, you can uncomment the raw_input line and comment out the sleep call instead.
With a frequency of 100Hz, and a sleep time of 0.1 seconds, you get 10 pulses, which is what is used in the simulation and for the real thing.
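
As a quick sanity check of that arithmetic, here is a minimal sketch. The function names `pulse_count` and `period_ms` are made up for illustration and are not part of RPi.GPIO.

```python
def pulse_count(frequency_hz, duration_s):
    """Number of whole PWM periods that fit in the run time."""
    return int(round(frequency_hz * duration_s))


def period_ms(frequency_hz):
    """Length of one PWM period in milliseconds."""
    return 1000.0 / frequency_hz


print(pulse_count(100, 0.1))   # pulses produced by 100 Hz running for 0.1 s
print(period_ms(100))          # length of one 100 Hz period in ms
```

Changing the sleep time or the PWM frequency in the program changes the pulse count in the same proportion.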

I would be very interested to hear from those that have other ideas and maybe have enhancements.
Last edited by paulv on Thu Sep 19, 2013 10:06 am, edited 2 times in total.

joan
Posts: 14376
Joined: Thu Jul 05, 2012 5:09 pm
Location: UK

Re: Failsafe use of GPIO pin driving critical applications

Wed Jun 12, 2013 2:15 pm

Interesting.

I propose a competition to find the cheapest design to switch off a relay when the Pi crashes. Still powered, just crashed.

Prize - 1 (one) virtual hug.

PiGraham
Posts: 3666
Joined: Fri Jun 07, 2013 12:37 pm
Location: Waterlooville

Re: Failsafe use of GPIO pin driving critical applications

Wed Jun 12, 2013 2:34 pm

You may want to think about something more robust than a simple watchdog. What if your application code crashes? What if the GPIO fails?
You could consider a redundant system where each subsystem must agree on the control state.
In a simple heating-cooling controller you could have two Pis monitoring separate temperature sensors.
If you connect each to control relays, and connect the relays in series then the HVAC only turns on if both devices decide it is too hot, or too cold.
Each can monitor the state of the other so that you can query the health of the control system even if one Pi is dead (if your router & ISP are still alive). You can connect GPIO out of one Pi to input of the other via opto-isolators (to avoid a failing Pi interfering with signals on a healthy Pi).

Monitor the relay switched lines so that faults can be detected. If the control thinks the relay is on check via an input that the switched line is powered.

If NOT turning on the HVAC is the worst case (e.g. pipes freeze if heating fails) you can connect the relays in parallel.
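
The series and parallel wiring choices above boil down to AND versus OR logic on the two controllers' decisions. A minimal sketch, with all function names invented for illustration and no real GPIO access:

```python
def hvac_on_series(pi_a_wants_on, pi_b_wants_on):
    """Relay contacts in series: both controllers must agree before the HVAC runs."""
    return pi_a_wants_on and pi_b_wants_on


def hvac_on_parallel(pi_a_wants_on, pi_b_wants_on):
    """Relay contacts in parallel: either controller alone can run the HVAC."""
    return pi_a_wants_on or pi_b_wants_on


def relay_fault(commanded_on, line_powered):
    """True when the monitored switched line disagrees with the commanded state."""
    return commanded_on != line_powered


# One Pi has crashed with its relay stuck on; the healthy Pi wants the HVAC off.
print(hvac_on_series(True, False))     # series wiring: the HVAC stays off
print(hvac_on_parallel(True, False))   # parallel wiring: the HVAC stays on
print(relay_fault(True, False))        # commanded on but line not powered
```

Which wiring is "safe" depends on whether a stuck-on or a stuck-off HVAC is the worse outcome for the application.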

More elaborate redundant configurations are possible, but minimise single points of failure. Two Pis are very unlikely to fail at the same time, but don't power them from the same PSU or one failure can take them both out. Ideally put each on its own UPS.

RasPi is cheap enough to make a redundant system.

paulv
Posts: 558
Joined: Tue Jan 15, 2013 12:10 pm
Location: Netherlands

Re: Failsafe use of GPIO pin driving critical applications

Wed Jun 12, 2013 8:42 pm

joan wrote:Interesting.

I propose a competition to find the cheapest design to switch off a relay when the Pi crashes. Still powered, just crashed.

Prize - 1 (one) virtual hug.
Hi Joan,
Coming from you that's a big compliment. Tks! :roll:
I will enter the competition (good idea!) to try to score that hug.

As soon as I have my parts I will see what I can do refining the current circuit idea.
I also hope there are smarter people (Tage?) that will enter and can point us to other solutions.

Tks,

Paul

Tage
Posts: 287
Joined: Fri May 24, 2013 2:29 am
Location: St Thomas, Ontario Canada

Re: Failsafe use of GPIO pin driving critical applications

Wed Jun 12, 2013 9:08 pm

paulv, thanks for posting this, with code and everything. I am happy to see the circuit works.

the circuit can certainly be extended to include redundancy if that is important. in some of my projects I have had to include redundancy because verifying that the firmware part is failsafe can be very expensive, and it was easier to implement hardware circuitry that would disconnect if the micro controller totally flipped out, or if the program intentionally was made malicious.
I am not sure how to build a redundant hardware circuitry for this particular project though. using two Pi circuits could certainly be one way.

techpaul
Posts: 1512
Joined: Sat Jul 14, 2012 6:40 pm
Location: Reading, UK
Contact: Website

Re: Failsafe use of GPIO pin driving critical applications

Fri Jun 14, 2013 9:53 am

joan wrote:Interesting.

I propose a competition to find the cheapest design to switch off a relay when the Pi crashes. Still powered, just crashed.

Prize - 1 (one) virtual hug.
Depends how much of the system you want to be failsafe?

Just up to the relay coil? Guarantee the contacts will open/close or that the coil drive will operate?

Then come the system issues: with reliability/redundancy you have to look at the WHOLE system, at what is most likely to fail, and at what must NOT fail in MOST circumstances, what must NEVER fail, etc.

Taking the relay controlling an HVAC system as an example, what is the point of worrying about the relay, which you could guarantee, when you cannot guarantee the wiring from the relay to the HVAC? Is this multiple sets of wires to multiple control points on the HVAC system? Just because a relay turns off does not mean the connections from the relay contacts are guaranteed. Neither is the HVAC: your relay may do its job while the HVAC does not.

If the HVAC system is allowed to fail on mains power failure (supply to building or fuse), you wire your control system power to the same feed. If you run with a UPS, how long must this run for?

I am reminded of a web data centre that had lots of VERY large UPS and generators in the MegaWatt range, to maintain power whatever happened. However one snag occurred, one of the battery banks caught FIRE, so ALL power had to be turned OFF so the fire could be put out. You have to specify what limits. Took them 4 hours to start bringing things back up.

As another poster has hinted, guaranteeing that a relay turns on or off requires at least TWO relays, as one relay may have failed. Often these sorts of problems have to be looked at as a WHOLE system problem, asking what the allowable failures are.

Often aircraft systems are at least double circuits and often actuators. Some critical systems have at least 3 of everything even the mechanical stuff with feedback to monitor faults, where at least two guarantee normal operation. The wiring is usually run in different directions to minimise faults as well.
Just another techie on the net - For GPIO boards see http:///www.facebook.com/pcservicesreading
or http://www.pcserviceselectronics.co.uk/pi/

joan
Posts: 14376
Joined: Thu Jul 05, 2012 5:09 pm
Location: UK

Re: Failsafe use of GPIO pin driving critical applications

Fri Jun 14, 2013 10:12 am

techpaul wrote: ...Often aircraft systems are at least double circuits and often actuators. Some critical systems have at least 3 of everything even the mechanical stuff with feedback to monitor faults, where at least two guarantee normal operation. The wiring is usually run in different directions to minimise faults as well.
Strangely enough I did a code review of the 7J7 (it used 3 lanes, I think C, Ada, and assembler (probably 68040)).

techpaul
Posts: 1512
Joined: Sat Jul 14, 2012 6:40 pm
Location: Reading, UK
Contact: Website

Re: Failsafe use of GPIO pin driving critical applications

Fri Jun 14, 2013 10:34 am

joan wrote:
techpaul wrote: ...Often aircraft systems are at least double circuits and often actuators. Some critical systems have at least 3 of everything even the mechanical stuff with feedback to monitor faults, where at least two guarantee normal operation. The wiring is usually run in different directions to minimise faults as well.
Strangely enough I did a code review of the 7J7 (it used 3 lanes, I think C, Ada, and assembler (probably 68040)).
BTW if you just wanted to have hardware that guaranteed that a coil drive signal did not get stuck, the simplest solution is a mix of hardware and software, preferably two threads (to avoid a stuck program in endless loop setting GPIO).

If you want the relay to drop out if anything fails on the Pi side (ignoring any relay or relay wiring failures), have two GPIOs each drive a nearly identical retriggerable monostable, so each GPIO must be toggled off and on to maintain an output. The software must regularly go and toggle each GPIO, preferably with each thread toggling a separate GPIO.

The outputs of the monostables drive two FETs wired in series, so both monostables have to be kept triggered to keep the FETs ON, and both FETs have to be ON to drive the relay.

So any hardware or software failure is very likely to turn relay off. EXCEPT a FET shorting source to drain. Overspeccing the FETs and using TWO commutating diodes on the relay coil reduces this risk.

Each path must use its own components and not components in the same package.
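
To see how the two-monostable arrangement behaves when one thread hangs, here is a pure-software model. It is only a sketch: the class and function names are invented, the 500 ms timeout and 100 ms toggle rate are assumed values, and real monostables would be hardware, not Python.

```python
class RetriggerableMonostable(object):
    """Software model of a retriggerable one-shot with a fixed timeout."""

    def __init__(self, timeout_ms):
        self.timeout_ms = timeout_ms
        self.last_trigger = None      # time of the most recent falling edge
        self.last_level = False

    def update(self, level, now_ms):
        """Feed the current GPIO level; a falling edge restarts the timer."""
        if self.last_level and not level:
            self.last_trigger = now_ms
        self.last_level = level

    def output(self, now_ms):
        """High only while the last trigger is younger than the timeout."""
        return (self.last_trigger is not None
                and now_ms - self.last_trigger < self.timeout_ms)


def relay_energised(mono_a, mono_b, now_ms):
    """Two FETs in series: the coil is driven only if both one-shots are high."""
    return mono_a.output(now_ms) and mono_b.output(now_ms)


mono_a = RetriggerableMonostable(500)   # assumed 500 ms timeout
mono_b = RetriggerableMonostable(500)
states = []
for step in range(20):                  # 2 s simulated in 100 ms steps
    now_ms = step * 100
    level = (step % 2 == 0)             # each thread toggles its GPIO every step
    mono_a.update(level, now_ms)
    if step < 10:                       # thread B "hangs" after 1 s
        mono_b.update(level, now_ms)
    states.append(relay_energised(mono_a, mono_b, now_ms))

print(states[9])    # both threads still toggling
print(states[19])   # thread B hung long enough for its one-shot to time out
```

The model ignores the failure mentioned above that the series FETs cannot cover, a FET shorted source to drain.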

paulv
Posts: 558
Joined: Tue Jan 15, 2013 12:10 pm
Location: Netherlands

Re: Failsafe use of GPIO pin driving critical applications

Fri Jun 14, 2013 12:27 pm

I don't think we need to use airplane type redundancy and fail safe systems for an HVAC system. I do not want to downplay anything said, but if it's that critical, I would not use a Pi (or two or three) in the first place. BTW, I looked at some commercial thermostat systems of reputable companies, and I could not find any of the fail safe systems offered here. Granted, the majority probably don't use software to drive the thermostat... ;) Any HVAC system itself has a number of fail safe systems built in (and that design is certified), so I think it's relatively safe to rely on those.

However, I'd rather be safe than sorry, so I'll do what is reasonable to design a system that lets me sleep well at night. My hope is that others can learn from our joint experiences, and chime in, so please keep the juices flowing and the comments coming.

Having said all that, a few points are well taken, and when I get my ordered parts I will build and test a circuit that uses two software threads to drive two GPIO channels driving two separate mono-stables to drive the HVAC relays.

This will hopefully be useful for those in the forums that want to use a Pi controlling something more serious.

So far, my web-based thermostat system has been running without the hardware fail safe circuit for about 5 months 24X7, and I only experienced 3 WIFI crop-outs, no Pi, Debian or Python crashes (yet), and it's a fairly complicated setup with a lot of bells and whistles for a web-based thermostat. I'm now in the process of making it more reliable, and in the process share my findings.

I'll report back soon.

Paul

techpaul
Posts: 1512
Joined: Sat Jul 14, 2012 6:40 pm
Location: Reading, UK
Contact: Website

Re: Failsafe use of GPIO pin driving critical applications

Fri Jun 14, 2013 12:36 pm

paulv wrote:I don't think we need to use airplane type redundancy and fail safe systems for an HVAC system. I do not want to downplay anything said, but if it's that critical, I would not use a Pi (or two or three) in the first place. BTW, I looked at some commercial thermostat systems of reputable companies, and I could not find any of the fail safe systems offered here. Granted, the majority probably don't use software to drive the thermostat... ;) Any HVAC system itself has a number of fail safe systems built in (and that design is certified), so I think it's relatively safe to rely on those.
That is a reasonable assumption and their systems work on a single failure and that in MOST cases it is not the end of the world if the system stays on, higher energy bill but not a safety problem.

Often they rely on buildings having people in them noticing something is wrong.
Having said all that, a few points are well taken, and when I get my ordered parts I will build and test a circuit that uses two software threads to drive two GPIO channels driving two separate mono-stables to drive the HVAC relays.
Send a private message if you want to discuss this.

PiGraham
Posts: 3666
Joined: Fri Jun 07, 2013 12:37 pm
Location: Waterlooville

Re: Failsafe use of GPIO pin driving critical applications

Sat Jun 15, 2013 10:09 am

paulv wrote:I don't think we need to use airplane type redundancy and fail safe systems for an HVAC system. I do not want to downplay anything said, but if it's that critical, I would not use a Pi (or two or three) in the first place.
This is very true. Of course any safety critical system should not be implemented on hobby electronics with homebrew software!
paulv wrote: BTW, I looked at some commercial thermostat systems of reputable companies, and I could not find any of the fail safe systems offered here. Granted, the majority probably don't use software to drive the thermostat... ;) Any HVAC system itself has a number of fail safe systems built in (and that design is certified), so I think it's relatively safe to rely on those.
You should realise that commercial control systems are designed for reliability and extensively tested. I'm sure they do use software to control HVAC systems, and to manage engines and so on, but they engineer such system to be as reliable as possible within a budget.

If you want reliable control with home-brew kit you can't hope to achieve the inherent reliability levels of commercial equipment. What you can do, fairly easily and cheaply, is use some techniques from safety-critical systems to protect against lower levels of inherent reliability. You might cover a lot of failure modes with just a second Pi, a couple of relays and a little extra monitoring / fault reporting.

Since you can't employ a large team of designers, programmers and quality control engineers, and can't subject hundreds of systems to exhaustive testing, your solution is much more likely to fail than a commercial system.

Just doubling up can cover all sorts of failure modes.

It doesn't seem worth building an extra circuit to protect a control line against one assumed failure mode of halted PWM on a GPIO.
paulv wrote:This will hopefully be useful for those in the forums that want to use a Pi controlling something more serious.
I suggest people do not use Raspberry Pis to control anything more serious. Potential property damage and large fuel bills are one thing; a risk to personal safety is unacceptable.
For something no more serious it's a good topic to cover here, so thanks for posting.

paulv wrote:So far, my web-based thermostat system has been running without the hardware fail safe circuit for about 5 months 24X7, and I only experienced 3 WIFI crop-outs, no Pi, Debian or Python crashes (yet), and it's a fairly complicated setup with a lot of bells and whistles for a web-based thermostat. I'm now in the process of making it more reliable, and in the process share my findings.

I'll report back soon.

Paul
It sounds like a good and reasonably reliable system you have built. Good work.

paulv
Posts: 558
Joined: Tue Jan 15, 2013 12:10 pm
Location: Netherlands

Re: Failsafe use of GPIO pin driving critical applications

Sun Jun 16, 2013 11:50 am

Below is a suggested revision for the circuit, to improve the reliability of this particular solution.
(Is there a better way to put a picture/diagram on this forum (not a link)?)

Using a software thread or during an init, the PWM is started on one GPIO channel, similar to the earlier solution, but now using a MOSFET (M1) again. The relay I use is rated at 5 V, and there may be too much of a voltage drop when you use transistors, so I now use MOSFETs again.

The additional and second MOSFET (M2) is connected to another GPIO channel, and is driven by the program that determines the status of the HVAC system.

The relay will be switched on and off by the HVAC-driven channel (M2); however, if the CPU core gets stuck, the time-out driven M1 will kick in, and the relay is de-energised.

Because MOSFETs switch faster than a transistor, I now use a Schottky diode for fly-back protection.

LTSpice did not have the parts for the Schottky diode, relay and MOSFETs, so in the comments I placed the parts that I will use.

NOTE: I did not build the complete circuit yet, my missing parts will arrive tomorrow. As a caution, I do not know for sure if the 3V3 GPIO channel output can drive M2 in sufficient saturation, but in principle, Spice agrees.

So, with this simple circuit, I hope to have eliminated the possible situation where a GPIO output channel can stay high when the CPU core hangs/crashes/gets stuck. Of course, this is a proof of concept, and there are many other ways to implement this.

The added benefit of using a separate thread in Python to "gate" the functioning of the relay is that I could now also use this thread to limit the amount of time the system is continuously heating or cooling, and do not have to build this timing loop into the main code, where it would add complexity and sources of errors. (Too many ideas, and so little time...)
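
A run-time limit of that kind could look roughly like the sketch below. It is only an illustration: the class name `RunTimeLimiter` and the one-hour limit are assumptions, and the caller supplies the time so the logic can be tried without a Pi.

```python
class RunTimeLimiter(object):
    """Refuses to keep the relay on longer than max_on_s in one stretch."""

    def __init__(self, max_on_s):
        self.max_on_s = max_on_s
        self.on_since = None          # start time of the current run, if any

    def allow(self, demand_on, now_s):
        """Return True if the relay may be energised at time now_s."""
        if not demand_on:
            self.on_since = None      # demand dropped: reset the timer
            return False
        if self.on_since is None:
            self.on_since = now_s     # a new run starts here
        return now_s - self.on_since <= self.max_on_s


limiter = RunTimeLimiter(3600)        # assumed limit: one hour per run
print(limiter.allow(True, 0))         # run starts
print(limiter.allow(True, 1800))      # half an hour in, still allowed
print(limiter.allow(True, 3700))      # past the limit, relay must drop
print(limiter.allow(False, 3800))     # thermostat no longer demands heat
print(limiter.allow(True, 3900))      # a fresh run is allowed again
```

As written, a new run may start as soon as the demand has dropped once; a real system might also want a minimum rest period between runs.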

I'd be very interested to see other/better examples accomplishing the same thing.

------------------------

To make the picture about my web-based thermostat system more complete, I already use the BCM watchdog timer to recover from a hung Debian, and have made provisions in my Python code to try to catch all run-time issues.

When I'm physically back in my other house, I will also offload my filesystem from the SD card (see my other post: http://www.raspberrypi.org/phpBB3/viewt ... 59#p351659). Although it's been running without problems for several months now, there is a lot of disk activity, so I want to eliminate that source of potential problems as well.

Next step is to test this all out and see if I need to go further. As an example, we can still drive the reset pin (P6), and if all else fails, we can cycle the power. (I'll pull the power for my system from the HVAC 24 V AC circuit and use a two-stage circuit to feed the Pi with 5 V. The first stage regulates and lowers the voltage level so I can use a high-efficiency DC-DC (buck) converter to avoid generating a lot of waste heat.)

It's nice to see several experts chime in, keep the juices flowing !
Thanks

Explanation of the LTSpice traces:
Blue is the gate of M1
Red is the gate of M2 (the actual driver for the HVAC system)
Green shows the cut-off (falling edge) when the heart-beat system kicks in
Attachments
GPIO-PWM-Failsafe Rev 2.jpg

Gert van Loo
Posts: 2486
Joined: Tue Aug 02, 2011 7:27 am
Contact: Website

Re: Failsafe use of GPIO pin driving critical applications

Mon Jun 17, 2013 6:20 pm

Paulv sent me a PM, asking me to add my two cents' worth.
I just don't have the time to read the whole thread, as I have only so much time per day and a lot of demand for it.
The PWM in the Pi is completely hardware driven. Once set up, it runs independently of any SW.
As always, it is a matter of probabilities. If I had to put my money on something, I would put it all on the HW: it has the highest likelihood of continuing to run while the power is on.
A seriously wild-running program can write anywhere, so it is theoretically possible that it accidentally writes a value to the PWM or to the clock manager which makes it stop. But the probability of that is extremely low! Especially combined with a watchdog timer, you should be able to build a very fail-safe system (which is different from a reliable system).

croston
Posts: 703
Joined: Sat Nov 26, 2011 12:33 pm
Location: Blackpool
Contact: Website

Re: Failsafe use of GPIO pin driving critical applications

Mon Jun 17, 2013 6:41 pm

I also received a PM. I wouldn't use the RPi+Linux for anything where you are concerned about reliability as in this case. I have a cheap and easy hardware+software solution that interfaces with a RPi - I will post technical details here tomorrow when I get chance. It's what I have designed for overnight/weekend water tank temperature control in a brewery tank that switches 26 amps. I can control it from anywhere using a smartphone. It is designed to handle power outages that might take out a RPi (e.g. SD card corruption on a forced power cycle).

paulv
Posts: 558
Joined: Tue Jan 15, 2013 12:10 pm
Location: Netherlands

Re: Failsafe use of GPIO pin driving critical applications

Tue Jun 18, 2013 4:47 am

Well, according to Gert, the SoC-based PWM is hardware based, so you cannot use that as a CPU heartbeat.
(We're talking here about the PWM that is available on channel 18 of the GPIO connector, and is labeled PCM_CLK.)

I was using the software version of the PWM, from the RPi.GPIO library, but got a little nervous about using that. I have not gotten confirmation if that is implemented in software only, so I decided to run a little test.

I used the Python program as listed above in this post, just turning on the PWM on GPIO channel 22, pin 15.
I then opened another session and used a "forkbomb" to "crash" Debian. I use "crash", because the CPU continues to run, but is running out of fork resources, the system cannot start new processes and gets unresponsive. However, the PWM pulses continued to come out of the Pi. Whoa! I think that is because the process that got it started continued to run, despite the fact that the Pi got totally unresponsive for any newer tasks. So this is also no good for a failsafe method.

Back to the drawing board. I decided to write my own little program to create pulses to mimic a heartbeat.

Code: Select all

#!/usr/bin/env python2.7

import RPi.GPIO as GPIO
from time import sleep

HVAC = 22 # GPIO channel 22 = pin 15
GPIO.setmode(GPIO.BCM)
GPIO.setup(HVAC, GPIO.OUT)

try:
    while True:
        GPIO.output(HVAC, GPIO.HIGH) # channel high
        sleep(0.01) # 10ms
        GPIO.output(HVAC, GPIO.LOW) # falling edge triggers monostable
        sleep(0.1) # 100ms
except KeyboardInterrupt:
    pass                 # CTRL+C ends the loop
finally:
    GPIO.cleanup()       # always release the GPIO channel on exit
On my digital scope, I could see the pulses, and I can report that the timing of the sleep function is very precise, even with these small numbers. BTW, techpaul, another forum contributor, recommended using a falling edge to trigger a monostable (single-shot), because it is inherently safer.

In any case, when I tried the fork bomb again, the first time the GPIO pin got stuck HIGH. Whoops, no good! After a few more tries however, the pulses continued to come out while the Pi virtually crashed. This must be due to the same issue: once a process starts, it will continue, even if there are no more resources to start new child processes. Again this is no good!

So, if the Pi gets unresponsive when no more tasks/child processes can be added, let's play along with that game.
If we use a separate thread to pulse the channel, it should stop with the forkbomb.
Proof of concept:

Code: Select all

#!/usr/bin/env python2.7

from time import sleep
from threading import Thread
import RPi.GPIO as GPIO

HVAC = 22 # GPIO channel 22 = pin 15
GPIO.setmode(GPIO.BCM)
GPIO.setup(HVAC, GPIO.OUT)

class ThreadClass(Thread):
  def run(self):
    GPIO.output(HVAC, GPIO.HIGH) # channel high
    sleep (0.01) # 10ms
    GPIO.output(HVAC, GPIO.LOW) # falling edge triggers monostable

# main program
try:
    while True:
        # pulse the monostable from a fresh thread on every pass
        t = ThreadClass()
        t.setDaemon(True)
        t.start()
        sleep(.1)

except KeyboardInterrupt:
    pass                 # CTRL+C ends the loop
finally:
    GPIO.cleanup()       # always release the GPIO channel on exit

And yes, this works! I have tried it several times, and every time the pulse went away, the GPIO channel went to LOW.
BTW, the threading added no visible timing changes.

So now we can feed this pulse into the circuit I used above, or you can use another method (coming from techpaul) to create a monostable, for example by using a 74XX123 or a 555 to create a single-shot pulse based on a trigger. If you really want to be safe, use two separate GPIO channels and two different threads to drive them, and of course two FETs in series to drive the relay. If one chain fails, the relay will be de-energised.

Now that we seem to have a solution, let's make this a bit more realistic. In my own situation, a thermostat, my main loop takes about 30 seconds minimum, and every 5 minutes it takes about a minute. This is because I do some analysis, report data to the web site, prepare graphs, etc. In any case, when the system needs to cool or heat, this is normally a process of minutes. So if I modify my program to send out a pulse every 30 seconds when heating or cooling is required, and I use a monostable (I must now use a 555 or similar to create these long periods) that is set for, say, a minute and 30 seconds, the HVAC will turn off after that period if the pulses are no longer coming. The 1 1/2 minutes allow for the longer loop time and also for Debian housekeeping and other delays introduced outside of my program. This is good enough for me.
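
The timing margin described above can be checked with a small model of a retriggerable 90-second timeout. This is only a sketch: `first_dropout` is a made-up helper, and the pulse times are example values, not measurements.

```python
def first_dropout(pulse_times_s, timeout_s=90):
    """Time at which the monostable output first drops, given the pulse times.

    The output drops timeout_s after a pulse if no new pulse arrived in time,
    so it always drops timeout_s after the final pulse.
    """
    for earlier, later in zip(pulse_times_s, pulse_times_s[1:]):
        if later - earlier >= timeout_s:
            return earlier + timeout_s
    return pulse_times_s[-1] + timeout_s


# Normal 30 s loops with one 60 s loop in between, then a crash after t = 150 s.
print(first_dropout([0, 30, 60, 120, 150]))   # output holds until 90 s after the last pulse

# A loop that stalls for 120 s exceeds the margin and drops the relay early.
print(first_dropout([0, 30, 150]))
```

With these numbers the worst case is that the HVAC keeps running for 90 seconds after the last pulse before it is switched off.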

I cannot find a 555 in my stash, so I can't build the monostable circuit yet, but as soon as I have, I will publish the results here.

I must say that I learned a lot, and several smart forum members sent me private messages with lots of good inputs that I tried to incorporate here. This is a lot of fun, and a great learning experience, all because of this little computer...

Back to you folks: Inputs? Comments? Better ideas?

paulv
Posts: 558
Joined: Tue Jan 15, 2013 12:10 pm
Location: Netherlands

Re: Failsafe use of GPIO pin driving critical applications

Tue Jun 18, 2013 1:25 pm

Here is the updated circuit I have in mind:
(still not built and tested yet)
Thanks techpaul for spotting the error.
Attachments
failsafeGPIO.png
Last edited by paulv on Tue Jun 18, 2013 3:33 pm, edited 1 time in total.

PiGraham
Posts: 3666
Joined: Fri Jun 07, 2013 12:37 pm
Location: Waterlooville

Re: Failsafe use of GPIO pin driving critical applications

Tue Jun 18, 2013 1:52 pm

I'm not sure you gain anything in reliability by putting another circuit in series. What are the relative probabilities of the various failure modes? How reliable is your 555 circuit (it could fail)? Is the additional probability of failure due to the 555 circuit outweighed by the probability of the set of failure modes that will stop the PWM? What proportion of possible failures can you cover with this device?

techpaul
Posts: 1512
Joined: Sat Jul 14, 2012 6:40 pm
Location: Reading, UK
Contact: Website

Re: Failsafe use of GPIO pin driving critical applications

Tue Jun 18, 2013 2:32 pm

paulv wrote:Here is the circuit I have in mind:
(not built and tested yet)
That will give a SINGLE pulse of around 90 ms (time on is 1.1 RC). To be able to RE-trigger the monostable and keep the output high, you need to add a transistor/FET across the capacitor; an example is at http://www.doctronics.co.uk/555.htm#retriggering
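
For reference, the 1.1 RC rule can be turned into a small calculator. The component values below are assumptions chosen for illustration (they are not read from the posted schematic), and the function names are made up.

```python
def monostable_period_s(r_ohm, c_farad):
    """555 monostable pulse width: t = 1.1 * R * C (seconds)."""
    return 1.1 * r_ohm * c_farad


def resistor_for_period(period_s, c_farad):
    """Resistor value in ohms that gives the wanted pulse width with a given C."""
    return period_s / (1.1 * c_farad)


# Assumed example: 82 kOhm with 1 uF lands near the 90 ms mentioned above.
print(monostable_period_s(82e3, 1e-6))

# Assumed example: stretching the timeout to 90 s with a 100 uF capacitor.
print(resistor_for_period(90, 100e-6))
```

Periods this long need a large capacitor and resistor, so leakage and component tolerance will probably shift the real timeout noticeably; treat the result as a starting point to verify on the bench.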

paulv
Posts: 558
Joined: Tue Jan 15, 2013 12:10 pm
Location: Netherlands

Re: Failsafe use of GPIO pin driving critical applications

Tue Jun 18, 2013 4:00 pm

PiGraham wrote:I'm not sure you gain anything in reliability by putting another circuit in series. What are the relative probabilities of the various failure modes? How reliable is your 555 circuit (it could fail)? Is the additional probability of failure due to the 555 circuit outweighed by the probability of the set of failure modes that will stop the PWM? What proportion of possible failures can you cover with this device?
PiGraham, I think you may be missing the point I'm trying to make in this post again. The probability of a software related crash either in my application code (very likely), in Debian (probable) or in the Pi components and peripherals (very probable) in my opinion are significantly larger than a possible failure in a few added, well proven components. I hope you can agree with me that with the addition of only a few parts, even a remotely possible hang-up of a GPIO channel driving something important can be avoided. That to me increases the reliability and safety of my thermostat system and that's really what I am trying to solve.

PiGraham
Posts: 3666
Joined: Fri Jun 07, 2013 12:37 pm
Location: Waterlooville

Re: Failsafe use of GPIO pin driving critical applications

Tue Jun 18, 2013 9:23 pm

Hey, no problem. I wouldn't do it that way but it's your design and your property. Do what gives you peace of mind.

paulv
Posts: 558
Joined: Tue Jan 15, 2013 12:10 pm
Location: Netherlands

Re: Failsafe use of GPIO pin driving critical applications

Wed Jun 19, 2013 3:34 am

Graham, how would you do it then? I have searched around somewhat but could not find any other practical solutions. If you have a better method, or know of one, can you share that with us? According to the number of times some of the diagrams in this post have been looked at, there seems to be a lot of interest.

aTao
Posts: 1087
Joined: Wed Dec 12, 2012 10:41 am
Location: Howlin Eigg

Re: Failsafe use of GPIO pin driving critical applications

Wed Jun 19, 2013 6:13 am

Surely the simplest, cheapest and most robust solution is to use mechanical thermostats that are connected to override the RPi.
>)))'><'(((<

PiGraham
Posts: 3666
Joined: Fri Jun 07, 2013 12:37 pm
Location: Waterlooville

Re: Failsafe use of GPIO pin driving critical applications

Wed Jun 19, 2013 7:21 am

paulv wrote:Graham, how would you do it then? I have searched around somewhat but could not find any other practical solutions. If you have a better method, or know of one, can you share that with us? According to the number of times some of the diagrams in this post have been looked at, there seems to be a lot of interest.
It depends!

For the sake of general discussion, not as criticism of you, I offer some more thoughts on the topic:

I agree with you that one of the more likely failure modes would be in my code, since only I have seen it, and it's hard to test for things I haven't already foreseen and designed for.
Next most likely is hardware I built, for similar reasons.
Next is software other individuals wrote, such as GPIO libraries. These may not have had the benefit of code review and thorough independent testing, but they have been used by others for a range of uses.
Next is the OS. That has been coded and had some oversight and diverse testing by lots of people and has thousands or millions of hours of runtime behind it.
Next is the Pi hardware.
Next, purely because of the simplicity of it, is the wiring to connect the Pi to the HVAC. I assume this is cable and relays and isn't very sensitive to static electricity, RF interference, voltage spikes and the like.

My top priority would be diagnostic reporting. If something fails I need to know promptly so that it can be fixed before serious damage occurs. I think two Pis is a good idea, each monitoring the system, with independent power (maybe a UPiS or other battery add-on).
Assuming we have a hard-line internet connection I'd hook both Pis to that, but I'd have a 3G dongle as well for backup.
If mains power fails, I have power and comms to report the fact. If it's the middle of winter and the pipes are about to freeze because the power is out (utility, fusebox, etc.), I get to know in time to save the day.

Alternatively I'd have another system regularly poll the site and alert if it can't communicate.
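
A minimal sketch of such an external poller, assuming the site exposes some TCP port the watching system can reach. The function names, the miss threshold and the `alert` callback are all hypothetical, not something from this thread. The reachability check is passed in as a function, so the alert logic can be tried without a network.

```python
import socket
import time


def tcp_reachable(host, port, timeout_s=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def poll_site(check, alert, max_misses=3, interval_s=60.0, rounds=None,
              sleep=time.sleep):
    """Call check() every interval_s; alert after max_misses failures in a row.

    rounds=None polls forever; a number limits the loop (handy for testing).
    Returns the number of alerts raised.
    """
    misses = 0
    alerts = 0
    done = 0
    while rounds is None or done < rounds:
        if check():
            misses = 0
        else:
            misses += 1
            if misses == max_misses:   # alert once per outage, not on every poll
                alert(f"site unreachable for {misses} consecutive polls")
                alerts += 1
        done += 1
        sleep(interval_s)
    return alerts


# Demonstration with a fake check: up twice, then down for good.
responses = iter([True, True, False, False, False, False])
messages = []
poll_site(lambda: next(responses), messages.append, rounds=6,
          sleep=lambda _s: None)
print(messages)
```

In real use the check would be something like `lambda: tcp_reachable(host, port)` with whatever host and port the site actually exposes, and `alert` would send a mail or text message rather than append to a list.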

I'd run the same monitoring and control software on both Pis because life is too short to do this twice. That leaves a door open to logical errors in my software design, but it protects somewhat against race conditions, deadlocks, stack corruption, unusual signal timing and so on.

Where the Pis have to control the HVAC I would make voltage-isolated connections (opto-isolators / relays) in parallel or series, depending on whether open-circuit or closed-circuit faults are more severe. E.g. for low-temperature frost protection I'd put relays in parallel.
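
The parallel-versus-series choice can be read as a small truth table. The sketch below models each relay contact as a boolean (True means closed), which is an idealisation; the function names are made up for illustration.

```python
def parallel(contact_a, contact_b):
    """Two relay contacts in parallel: the circuit conducts if either is closed."""
    return contact_a or contact_b


def series(contact_a, contact_b):
    """Two relay contacts in series: the circuit conducts only if both are closed."""
    return contact_a and contact_b


# One Pi has died with its relay stuck open (False); the other wants heat.
stuck_open, healthy_wants_heat = False, True
print("parallel, heat demanded:", parallel(stuck_open, healthy_wants_heat))
print("series,   heat demanded:", series(stuck_open, healthy_wants_heat))

# One relay has stuck closed (True); the healthy Pi wants the load off.
stuck_closed, healthy_wants_off = True, False
print("parallel, off demanded: ", parallel(stuck_closed, healthy_wants_off))
print("series,   off demanded: ", series(stuck_closed, healthy_wants_off))
```

Parallel wiring tolerates a relay that fails open (the heat can still be switched on), while series wiring tolerates a relay that fails closed (the load can still be switched off). Neither arrangement covers both faults at once.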

If there is comms control (RS232, TCP/IP) I can't avoid a single point of failure, but I might make one Pi monitor what the other Pi is doing by reading the Rx and Tx exchange with the HVAC. If the two disagree by a large margin, that gives me a reportable fault condition.
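
A minimal sketch of the disagreement check, assuming each Pi can produce a numeric value (for example a setpoint or measured temperature) at the same sample times. Requiring the disagreement to persist for a few samples before reporting is my own assumption, added to avoid flagging a single glitch; the margin and sample data are invented.

```python
def persistent_disagreement(samples_a, samples_b, margin, persist=3):
    """Return the index at which |a - b| > margin has held for `persist`
    consecutive samples, or None if that never happens."""
    run = 0
    for index, (a, b) in enumerate(zip(samples_a, samples_b)):
        run = run + 1 if abs(a - b) > margin else 0
        if run >= persist:
            return index
    return None


# Invented readings in degrees C: one glitch, then a lasting divergence.
pi_a = [21.0, 21.1, 21.0, 21.2, 21.1, 21.0]
pi_b = [21.2, 25.0, 21.1, 26.0, 26.5, 27.0]

fault_at = persistent_disagreement(pi_a, pi_b, margin=2.0)
print("reportable fault at sample:", fault_at)
```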

I can't say how far to go. It depends how safe you want to be. What would a failure cost you? What would you pay to reduce that risk by what margin? I think I might buy a second Pi and a 3G dongle for the extra peace of mind. If I have the control code, adding some additional monitoring isn't too difficult.

Generally, the more parts there are, the more things there are to go wrong, so be careful not to add single points of failure. If your protection circuit can bring down the system when it fails, you may not have improved things.

Watch out for silent failures. If a protection system can fail without it being detected, then you may be without an assumed level of protection for years. When the system it was supposed to protect eventually fails, there is no protection left to catch it.

Watch activity, not single states. If you want to monitor that a control line is on when it is supposed to be on, also check that it is off when it's supposed to be off. That way you pick up a relay contact that sticks, a wire that falls off or a GPIO output transistor that blows.
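
A minimal sketch of checking both states, assuming there is some feedback signal to read back (for example a spare relay contact wired to a GPIO input, which is my assumption and not part of the circuits in this thread). The function name and fault texts are invented.

```python
def check_line(commanded_on, feedback_on):
    """Compare what was commanded with what is read back, in both states."""
    if commanded_on and not feedback_on:
        return "FAULT: commanded ON but reads OFF"
    if not commanded_on and feedback_on:
        return "FAULT: commanded OFF but reads ON"
    return "OK"


# Walk through all four combinations of command and feedback.
for commanded in (True, False):
    for feedback in (True, False):
        print(f"commanded={commanded!s:5} feedback={feedback!s:5} ->",
              check_line(commanded, feedback))
```

A check that only looks at the ON state would never notice the second fault, which is the stuck-contact case.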

paulv
Posts: 558
Joined: Tue Jan 15, 2013 12:10 pm
Location: Netherlands

Re: Failsafe use of GPIO pin driving critical applications

Wed Jun 19, 2013 7:43 am

Graham,

That is a very good summary of what needs to be considered for the overall reliability of a system.
With this, it's easier for those looking in this forum for ideas or help to tick off the points, make trade-offs and do a system design up front. Most of us are hobbyists (I am one) with only limited industrial design experience, learning from others, which is what the Pi and this forum are all about.

Notice that I changed the title of the post to Failsafe(r)...

techpaul
Posts: 1512
Joined: Sat Jul 14, 2012 6:40 pm
Location: Reading, UK
Contact: Website

Re: Failsafe(r) use of GPIO pin driving critical application

Wed Jun 19, 2013 8:50 am

I would agree that in general you need multiple systems and multiple UPSs for a real failsafe.

In this case as this is one HVAC system, which will have inbuilt modes that put the unit safe when power fails (no doubt always off), powering one or more Pis from the same mains feed as the HVAC is adequate as neither can do anything to the HVAC during power fail. Running the Pi(s) from completely separate mains feeds would be pointless and introduce more points of failure.

Adding a UPS to report mains failure and its time would be useful, but it depends on whether you are trying to make a complete failsafe(r) system or just certain aspects.
