davidcoton wrote: ↑Sun Aug 27, 2017 4:00 pm
You've taken on a complex project, and you're doing it to a high standard. I wouldn't be surprised if it has been six months full time to get where you are with hardware and software. Sit down, take a cup of tea/coffee, and just look at what you have achieved. Then prioritize what's left. Decide what must be done before deployment, what is a nice to have, and what is definitely not necessary now (though it might be something for next year). Do the essentials, cherry pick the easy bits of the middle group, and sideline the rest. Test and deploy! (Sorry, that is too simple. Test, debug and deploy!)
Thank you for the kind words - I did take a break and went for a cold beer outside (glorious weather here!), which helped clear my thoughts and I came up with exactly the kind of priority list you suggested.
1) Connectivity. The biggest problem I'm having is that the OpenVPN client on the AirLink RV50 is severely out of date. In fact it is five years out of date, at version 2.1. This is outrageously bad considering that this is a current model of a $700 professional device, which is widely used in mission critical applications such as first responder vehicles. And it's not like I'm running outdated firmware; I'm on version 4.8.1.006, released on July 16th this year. Yet despite spending the best part of two weeks on it, I have been unable to get it to connect to my v2.3 OpenVPN server. Repeated inquiries in the Sierra Wireless support forums about this issue, going back to August 8th, have gone completely unanswered, and they offer no other support options unless you pay them loadsamoney. I need to make a decision on how to deal with this, and I see a few different options:
- Downgrade the server to a compatible version (i.e. 2.1), with all that means in terms of manually building it, lack of updates, and known and serious vulnerabilities.
- Give up on the idea of the router being the VPN endpoint and install OpenVPN on the Pi instead, with all that means in terms of increased CPU usage (presumably the RV50 has encryption hardware to deal with TLS etc.) and difficulty getting other machines behind the router to use the VPN.
- Look at other VPN technologies. The RV50 also claims to support IPSec and GRE, though if its OpenVPN "support" is anything to go by I might just be opening another can of worms.
- Cut my losses on the Sierra Wireless router altogether and replace it with something else, though this is not appealing considering the money spent (£400) and the time and effort invested in the physical mounting of it, as well as in writing the software to communicate with it (for example, I make use of both its SMS gateway and its GPS -> UDP service).
Regardless of how I deal with this, it really must be dealt with before the system can be deployed; without a functioning VPN I will have no way "in" from outside.
2) Time. Or rather the lack of it. In a very real sense. One nice thing about the RV50 is the built-in GPS service, which can easily be configured to send NMEA GPS data over UDP to any listening clients. I have successfully set up gpsd to listen to this, and I get a good fix etc. in gpsmon - but I am unable to get ntpd to read the shared memory provided by gpsd (or maybe it is that gpsd doesn't populate the shm correctly). I have spent weeks on this issue. I have posted questions and detailed descriptions of my problems and the symptoms to the gpsd and ntpd mailing lists, and in typical fashion the responses I received completely disregarded the information I had provided, in terms of configuration files and log output. I have also spent considerable time trying, without luck, to get the new ntpd driver type 46 (JSON) to work. It is a curious shortcoming that the RV50 does not provide an NTP server - most routers I've worked with in the past have included this functionality. But that's beside the point; it is capable of providing highly accurate time data via its GPS -> UDP interface, which should be possible to use to set the clock on the connected Raspberry Pi. I really do not think what I'm trying to do here is too far fetched; the correct time is right there in the NMEA data stream, and it is picked up correctly by gpsd.
Why this obsession with GPS time? Well, it is the only reliable [insert mad laugh here] time source available to this system; anything can happen to the mobile network and the internet connection, and the on-board RTC provided by the Monarco Hat relies on its own button cell battery, which will die at some point, and it will drift over time. An RTC is great and useful, but only if you can periodically re-sync it with a reliable and accurate time source. This being for a boat, I cannot imagine any scenario short of a catastrophic failure (or WWIII) that would make a GPS fix unattainable. Without accurate time, automatic updates will fail, the VPN will fail, and the data logging (which is 75% of what this project is about) will fail. This needs to be fixed before the system can be deployed. I am now working on writing my own parser for the NMEA UDP datagrams, and my own function for updating the system time from this data. Yes, if someone handed me a gun I'd put it to my own forehead and pull the trigger quite happily.
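For illustration, a minimal Python sketch of the "parse the NMEA datagrams and set the clock" approach could look like the following. It is only a sketch under assumptions: the UDP port (11123) and the 0.5 second step threshold are made-up placeholders, the function names are hypothetical, only RMC sentences are handled, and setting the clock needs root (or CAP_SYS_TIME) on Linux. Without a PPS signal, the result is probably only good to within network and serial latency, not to NTP-grade accuracy.

```python
import socket
import time
from datetime import datetime, timezone
from typing import Optional


def nmea_checksum_ok(sentence: str) -> bool:
    """Return True if the XOR checksum after '*' matches the sentence body."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    body, _, given = sentence[1:].partition("*")
    calc = 0
    for ch in body:
        calc ^= ord(ch)
    try:
        return calc == int(given[:2], 16)
    except ValueError:
        return False


def parse_rmc_time(sentence: str) -> Optional[datetime]:
    """Extract the UTC timestamp from an RMC sentence, or None if unusable."""
    if not nmea_checksum_ok(sentence):
        return None
    fields = sentence.strip()[1:].partition("*")[0].split(",")
    # fields[0] is the talker + type (e.g. GPRMC), fields[2] is 'A' for a valid fix.
    if len(fields) < 10 or not fields[0].endswith("RMC") or fields[2] != "A":
        return None
    hhmmss, ddmmyy = fields[1], fields[9]
    try:
        whole = datetime.strptime(ddmmyy + hhmmss[:6], "%d%m%y%H%M%S")
        fraction = float("0" + hhmmss[6:]) if hhmmss[6:] else 0.0
    except ValueError:
        return None
    micro = min(int(round(fraction * 1_000_000)), 999_999)
    return whole.replace(microsecond=micro, tzinfo=timezone.utc)


def set_system_clock(when: datetime) -> None:
    """Step the system clock to 'when' (Unix only, needs root)."""
    time.clock_settime(time.CLOCK_REALTIME, when.timestamp())


def listen(port: int = 11123, max_step: float = 0.5) -> None:
    """Receive NMEA datagrams and step the clock when it is off by > max_step seconds.

    The port number is a placeholder; use whatever the router is configured to send to.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", port))
    while True:
        data, _addr = sock.recvfrom(4096)
        for line in data.decode("ascii", errors="ignore").splitlines():
            fix_time = parse_rmc_time(line)
            if fix_time is None:
                continue
            if abs(fix_time.timestamp() - time.time()) > max_step:
                set_system_clock(fix_time)
```

Calling `listen()` as root would run the loop forever. Only stepping the clock when the offset exceeds a threshold is a design choice to avoid constantly jerking the clock around on every datagram; whether 0.5 seconds is sensible depends on how much jitter the link introduces.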
3) Power. When I set out on this project, I tried to plan from the bottom up, and made it the first priority to ensure that the system would have its own reliable power source, which could be monitored and would notify me of any power related issues well ahead of them becoming a problem. I spent a lot of time researching my options here, but eventually settled on the OpenUPS from mini-box as the best - albeit pricey - alternative. This is a very clever battery management system, which can operate on a wide range of input voltages, and provides full UPS-style monitoring over USB, compatible with NUT and upsd. It really is the bee's knees when it comes to computer attached power management systems for anyone who builds bespoke "always on" systems and needs UPS functionality. I also chose to go with a hefty 17Ah sealed lead acid battery as the failover power source, so that I would have plenty of time to "talk" to the system (and it to me!) in the event of some kind of power failure; no less than 24 hours availability when all other power has been lost.
But. The damn thing wouldn't work properly. I had managed to find a second-hand board on eBay for "only" £50, but it would go into a curious pulsating mode, continuously switching between "on line", "charging" and "on battery" (and I do mean like every ten seconds or so). I battled for a long time with the many configuration parameters but could not get it to stop doing this. Some months later, having spent yet another few days researching alternatives, and having had no response from mini-box to my detailed and polite support requests, I came to the same conclusion I had previously; if you want to build a custom "always on" system backed by its own UPS there is precisely one product on the market which ticks all the boxes, and it is the OpenUPS board from mini-box. Maybe there was just a fault with the second-hand board I had purchased? I swallowed my pride and ordered another board, this time brand new, at a cost of £100. Guess what? I'm having precisely the same issues with the new board. It goes into some kind of mental break-down mode where it refuses to float charge the SLA, while constantly switching status as explained above, spamming the syslog, and spamming me with emails (unless I turn off the notifications), until the battery has been drained to the point where it will start a "bulk" charge. It then charges the battery up quite happily, after which it does apply a float charge, and the system appears to be stable. But if I remove external power for even the briefest moment, this fluctuating behaviour returns, and the cycle repeats. I cannot deploy this system until I have confidence in its power supply. I have just spent another two days trying to get to the bottom of this, re-reading the manual very carefully (I'm using the defaults for a PBSO4 SLA anyway), and logging and looking at days' worth of system metrics. Yet anything resembling a solution still eludes me, and mini-box are as silent as ever.
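When sifting through logged metrics for this kind of behaviour, a small helper that flags "flapping" can make the pattern easier to spot than reading raw logs. The Python sketch below is illustrative only: it assumes the status has already been logged as (timestamp, status string) samples, the function names are hypothetical, and the window and threshold values are arbitrary starting points. The status strings in the usage example follow NUT's `ups.status` style (OL, OB, CHRG, DISCHRG).

```python
from datetime import datetime, timedelta
from typing import List, Sequence, Tuple

Sample = Tuple[datetime, str]


def transitions(samples: Sequence[Sample]) -> List[Tuple[datetime, str, str]]:
    """Return (time, old_status, new_status) for every change between samples."""
    out = []
    prev = None
    for when, status in samples:
        if prev is not None and status != prev:
            out.append((when, prev, status))
        prev = status
    return out


def is_flapping(samples: Sequence[Sample],
                window: timedelta = timedelta(minutes=5),
                threshold: int = 6) -> bool:
    """True if any span of length 'window' contains at least 'threshold' status changes."""
    times = [when for when, _old, _new in transitions(samples)]
    start = 0
    for end, when in enumerate(times):
        # Slide the left edge forward until the span fits inside the window.
        while when - times[start] > window:
            start += 1
        if end - start + 1 >= threshold:
            return True
    return False


# Usage: status changing every ten seconds, as described above.
base = datetime(2017, 8, 27, 12, 0, 0)
states = ["OL", "OL CHRG", "OB DISCHRG"]
log = [(base + timedelta(seconds=10 * i), states[i % 3]) for i in range(12)]
print(is_flapping(log))
```

Feeding it the samples from just before and just after a brief power interruption would show whether the flapping starts at that moment or only once the battery reaches a particular state.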
These are the three "show stopper" issues I am facing, and all three have some important properties in common:
- They are all issues relating to "off the shelf" technologies which should just work - I'm not re-inventing the wheel here!
- The equipment involved is expensive and has been purchased only after careful research, precisely because I hoped it would function well
- No constructive assistance of any kind has been offered by any of the people or companies behind any of the hardware and software involved, despite my repeated requests
It has taken another hour of my time to write up this post - an hour during which I managed to burn one of the frozen pizzas I have been forced to live on lately, due to not having time to cook - so please be gentle with your criticism.