For several weeks I've been chasing an intractable problem with VNC sessions hanging. They generally recover after around 30 seconds. I wonder if it rings any bells with anyone round here?
We have a bunch of headless classroom Pis on their own NATted subnet, which is connected to the school network via a Shorewall firewall. The firewall allows SSH and VNC from the school network to the Pis, and HTTP and NTP the other way. The Pis are controlled using VNC or PuTTY from PCs on the school network.
That had been working fine for 3 years, until a month or so ago. Evidently something has changed, but I'm at a loss to see what. I was making some changes to the firewall for a 3rd NIC, but I'm pretty sure I reverted them when I couldn't achieve what I wanted. The Network Manager can't think of anything that's changed which might cause it. I've taken hundreds of MB of Wireshark captures, but they only seem to deepen the mystery.
Typically, you can see lots of VNC traffic from a Pi to a PC as it sends a screen image, with TCP ACKs coming back from the PC. Suddenly, while there is still unacknowledged data, a FIN ACK packet arrives at the firewall from the PC. The Pi continues to send a few more packets, which Wireshark labels as retransmissions, and which shortly afterwards are rebuffed by an ICMP Host Unreachable, or sometimes Network Unreachable, from the firewall. If I have a ping to the PC running on the firewall, both the echo and echo reply packets cease, yet Wireshark still sees random ARP traffic on the school network. On recovery, the firewall first of all broadcasts a whole bunch of gratuitous ARP packets on the school network for all its school network IP addresses (all but one being NATted Pi addresses).
The FIN ACK, apparently from the VNC client, being the first sign of trouble would suggest a VNC issue, perhaps something running out of buffer space, yet the Host/Network Unreachable would seem to point a finger at the firewall.
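To line each hang up against the captures, it might help to log exactly when the ping from the firewall to the PC stops and restarts. The following is only a minimal sketch of that idea, not something from the setup described here: the function names and the address 192.0.2.10 are hypothetical, and the ping flags assume the Linux iputils version.

```python
import subprocess
import time
from datetime import datetime


def find_outages(samples, min_failures=2):
    """Return (start, end) pairs for runs of at least min_failures failed probes.

    samples is a time-ordered list of (timestamp, ok) tuples.
    end is the timestamp of the first successful probe after the run,
    or None if the run is still in progress when the samples stop.
    """
    outages = []
    run_start = None
    run_len = 0
    for ts, ok in samples:
        if not ok:
            if run_start is None:
                run_start = ts
            run_len += 1
        else:
            if run_start is not None and run_len >= min_failures:
                outages.append((run_start, ts))
            run_start = None
            run_len = 0
    if run_start is not None and run_len >= min_failures:
        outages.append((run_start, None))
    return outages


def probe(host, timeout_s=1):
    """Send one ICMP echo via the system ping (Linux iputils flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def monitor(host, interval_s=1.0):
    """Probe forever, printing a timestamped line whenever reachability changes."""
    last_ok = None
    while True:
        ok = probe(host)
        if ok != last_ok:
            stamp = datetime.now().isoformat(timespec="milliseconds")
            print(f"{stamp} {host} {'reachable' if ok else 'UNREACHABLE'}", flush=True)
            last_ok = ok
        time.sleep(interval_s)
```

Running `monitor("192.0.2.10")` on the firewall (with the PC's real address substituted) would print a line at the start and end of each outage, which could then be matched against the FIN ACK and gratuitous ARP timestamps in Wireshark. `find_outages` does the same grouping offline on a list of recorded probe results.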
Totally baffled, I'm planning to rebuild the firewall after the end of term, but I can't be sure that that will solve the problem. Inspiration, anyone?
Regards - Philip