I'm testing a few Raspberry Pis for 'production' environments. Example uses are remote control of heating systems (two already deployed) and access control (someone presents a token and the Pi decides whether to allow access).
My requirement is that the systems have to be 100% rock-solid reliable. This morning, for some unknown reason, the Pi I have here controlling my heating system just stopped responding. I couldn't reach the webpage where I control my heating, nor could I SSH into the box. When I went to look at the Pi, I could see that the blue light on the Edimax Wi-Fi dongle was continuously lit, instead of flashing as it normally does. After killing the power and restarting, I could get into the box. From the syslog, I could see that the Pi rebooted at 5am this morning. Further investigation of the syslog has not revealed any obvious reason why it rebooted, nor why it appeared to have 'locked up' after the reboot.
Perhaps significantly, I updated the firmware yesterday on my home-heating-controlling Pi, but after the update everything appeared to be fine. The second Pi, which controls the heating at a client site, has not yet been updated. There have been 'reliability' problems with that system too, but some of those were certainly caused by the client being careless with the cables, the placement of the Pi, etc.
So I guess I have a couple of questions:
In general, how reliable are you all finding the Raspberry Pi in 'high-availability', 'long-uptime' environments?
Specifically, do you have any suggestions on how better to track down the problem that showed up this morning?
Thanks for any and all help!
Kind regards,
YoungJules
