da7ashark
Posts: 9
Joined: Fri Jan 08, 2021 3:50 am

RaspberryPi 4B drops from network

Mon Sep 13, 2021 6:26 pm

Good afternoon everyone

I am working on a project with some datacenters, and we are using the Raspberry Pi to interface with some custom hardware in order to automate certain tasks and monitor the environment.

Our first datacenter setup has worked consistently, with no hiccups, and has been running for most of this year. We recently installed two new setups that use two more Raspberry Pis. The problem with the new setups is that the Pis lose their connection to the local network and assign themselves a link-local IP. When things are working, the IP looks something like 192.168.50.154; when the connection drops, it becomes something like 169.254.111.96, which tells me the Pi is not getting through to the network behind the nearby switch. This happens either randomly or after a scheduled restart. We have the Pi set up as a hotspot, so when it happens we can connect to it and restart the Pi over SSH. After a restart we almost always get back onto the local network. Something tells me this is a DHCP problem.

The new setups are connected to switches daisy-chained from the first datacenter; I am not sure how important this is.

We are running a web server and use Docker to manage all the software, including the hardware-related code. The custom hardware uses two I2C buses and two SPI buses.

All datacenters, including the one that has been running all year, have the exact same software installed.

Kernel Version:

Code: Select all

Linux raspberrypi 5.10.17-v8+ #1421 SMP PREEMPT Thu May 27 14:01:37 BST 2021 aarch64 GNU/Linux
Service List:

Code: Select all

[ + ]  alsa-utils
 [ + ]  avahi-daemon
 [ - ]  bluetooth
 [ - ]  console-setup.sh
 [ + ]  cron
 [ + ]  dbus
 [ + ]  dhcpcd
 [ + ]  dnsmasq
 [ + ]  docker
 [ + ]  dphys-swapfile
 [ + ]  fake-hwclock
 [ + ]  glances
 [ + ]  hddtemp
 [ + ]  hostapd
 [ - ]  hwclock.sh
 [ - ]  keyboard-setup.sh
 [ + ]  kmod
 [ + ]  lighttpd
 [ + ]  lm-sensors
 [ + ]  netfilter-persistent
 [ + ]  networking
 [ - ]  nfs-common
 [ + ]  procps
 [ + ]  raspi-config
 [ ? ]  rng-tools
 [ - ]  rpcbind
 [ - ]  rsync
 [ + ]  rsyslog
 [ + ]  ssh
 [ - ]  sudo
 [ + ]  triggerhappy
 [ + ]  udev
 [ + ]  vnstat
 [ - ]  x11-common
Here is a log of when the IP is dropped:

Code: Select all

Sep  8 04:01:19 raspberrypi avahi-daemon[446]: Joining mDNS multicast group on interface eth0.IPv6 with address fe80::c72f:369e:f523:ed43.
Sep  8 04:01:19 raspberrypi avahi-daemon[446]: New relevant interface eth0.IPv6 for mDNS.
Sep  8 04:01:19 raspberrypi avahi-daemon[446]: Registering new address record for fe80::c72f:369e:f523:ed43 on eth0.*.
Sep  8 04:01:19 raspberrypi dhcpcd[449]: eth0: soliciting an IPv6 router
Sep  8 04:01:19 raspberrypi dhcpcd[449]: eth0: rebinding lease of 192.168.50.175
Sep  8 04:01:20 raspberrypi systemd[1]: systemd-rfkill.service: Succeeded.
Sep  8 04:01:24 raspberrypi dhcpcd[449]: eth0: probing for an IPv4LL address
Sep  8 04:01:24 raspberrypi dhcpcd[449]: eth0: DHCP lease expired
Sep  8 04:01:24 raspberrypi dhcpcd[449]: eth0: DHCP lease expired
Sep  8 04:01:24 raspberrypi kernel: [   16.107923] vc4-drm gpu: [drm] Cannot find any crtc or sizes
Sep  8 04:01:24 raspberrypi dhcpcd[449]: eth0: soliciting a DHCP lease
Sep  8 04:01:29 raspberrypi dhcpcd[449]: eth0: using IPv4LL address 169.254.40.170
Sep  8 04:01:29 raspberrypi avahi-daemon[446]: Joining mDNS multicast group on interface eth0.IPv4 with address 169.254.40.170.
Sep  8 04:01:29 raspberrypi avahi-daemon[446]: New relevant interface eth0.IPv4 for mDNS.
Sep  8 04:01:29 raspberrypi avahi-daemon[446]: Registering new address record for 169.254.40.170 on eth0.IPv4.
Sep  8 04:01:29 raspberrypi dhcpcd[449]: eth0: adding route to 169.254.0.0/16
Sep  8 04:01:29 raspberrypi dhcpcd[449]: eth0: adding default route


One weird thing is that if the Pi drops off the network, it will reconnect and get an IP when the following command is run:

Code: Select all

 sudo dhcpcd -help
It produces the following

https://imgur.com/a/fa0D6lh

and suddenly we have access to the pi again.

Things we have tried:

adding

Code: Select all

denyinterfaces *veth
to

Code: Select all

/etc/dhcpcd.conf

Power supply is NOT an issue: we logged the power and do not see any under-voltage warnings in the kernel log

Disabling Energy Efficient Ethernet (EEE)

Anyone know what is going on?

We are trying to use crontab to monitor the network, but I would like to know the root cause.

bls
Posts: 1690
Joined: Mon Oct 22, 2018 11:25 pm
Location: Seattle, WA

Re: RaspberryPi 4B drops from network

Mon Sep 13, 2021 6:58 pm

In the log:

Code: Select all

Sep  8 04:01:19 raspberrypi dhcpcd[449]: eth0: rebinding lease of 192.168.50.175
Sep  8 04:01:20 raspberrypi systemd[1]: systemd-rfkill.service: Succeeded.
Sep  8 04:01:24 raspberrypi dhcpcd[449]: eth0: probing for an IPv4LL address
Sep  8 04:01:24 raspberrypi dhcpcd[449]: eth0: DHCP lease expired
For starters:

Why is the DHCP lease expiring in 5 seconds? What is the DHCP lease time set to on the network, and what are other systems getting?

What have you added to /etc/dhcpcd.conf?
Pi tools:
Quickly and easily build customized-just-for-you SSDs/SD Cards: https://github.com/gitbls/sdm
Easily run and manage your network's DHCP/DNS servers on a Pi: https://github.com/gitbls/ndm
Easy and secure strongSwan VPN installer/manager: https://github.com/gitbls/pistrong
Lightweight Virtual VNC Config: https://github.com/gitbls/RPiVNCHowTo

epoch1970
Posts: 6951
Joined: Thu May 05, 2016 9:33 am
Location: France

Re: RaspberryPi 4B drops from network

Mon Sep 13, 2021 7:03 pm

da7ashark wrote:
Mon Sep 13, 2021 6:26 pm
Things we have tried:

adding

Code: Select all

denyinterfaces *veth
to

Code: Select all

/etc/dhcpcd.conf
That one works better written

Code: Select all

denyinterfaces veth*
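
Why the position of the asterisk matters: as far as I know, dhcpcd treats each denyinterfaces pattern as an fnmatch(3)-style glob. Here is a minimal sketch using Python's fnmatch module as a stand-in for that matching. The interface name veth1a2b3c is a made-up example of the kind of name Docker gives its virtual Ethernet pairs, and the denied() helper is only for illustration.

```python
from fnmatch import fnmatch

# Hypothetical interface names; Docker's virtual Ethernet pairs start with "veth".
interfaces = ["eth0", "wlan0", "veth1a2b3c"]

def denied(name: str, pattern: str) -> bool:
    """Return True if the glob pattern would exclude this interface name."""
    return fnmatch(name, pattern)

for pattern in ("*veth", "veth*"):
    matched = [name for name in interfaces if denied(name, pattern)]
    print(f"{pattern!r} matches {matched}")
```

A pattern of *veth only matches names that end in "veth", so Docker's interfaces would stay visible to dhcpcd, while veth* matches names that start with it.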
"S'il n'y a pas de solution, c'est qu'il n'y a pas de problème." Les Shadoks, J. Rouxel

da7ashark
Posts: 9
Joined: Fri Jan 08, 2021 3:50 am

Re: RaspberryPi 4B drops from network

Mon Sep 13, 2021 7:04 pm

In this case, it is soliciting a lease upon a scheduled restart. The lease lasts for about 24h.

At midnight our system compresses a bunch of data and does a fresh restart.

dhcpcd.conf:

Code: Select all

# RaspAP default configuration
denyinterfaces veth*
hostname
clientid
persistent
option rapid_commit
option domain_name_servers, domain_name, domain_search, host_name
option classless_static_routes
option ntp_servers
require dhcp_server_identifier
slaac private
nohook lookup-hostname

# RaspAP wlan0 configuration
interface wlan0
static ip_address=10.3.141.1/24
static routers=10.3.141.1
static domain_name_server=9.9.9.9 1.1.1.1


da7ashark
Posts: 9
Joined: Fri Jan 08, 2021 3:50 am

Re: RaspberryPi 4B drops from network

Mon Sep 13, 2021 7:06 pm

epoch1970 wrote:
Mon Sep 13, 2021 7:03 pm
da7ashark wrote:
Mon Sep 13, 2021 6:26 pm
Things we have tried:

adding

Code: Select all

denyinterfaces *veth
to

Code: Select all

/etc/dhcpcd.conf
That one works better written

Code: Select all

denyinterfaces veth*
That was a silly typo on my end; I posted the conf file in a reply just now.

epoch1970
Posts: 6951
Joined: Thu May 05, 2016 9:33 am
Location: France

Re: RaspberryPi 4B drops from network

Mon Sep 13, 2021 9:52 pm

bls wrote:
Mon Sep 13, 2021 6:58 pm

Why is the DHCP lease expiring in 5 seconds? What is the DHCP lease time set to on the network, and what are other systems getting?
Because of this I guess
https://manpages.debian.org/buster/dhcpcd5/dhcpcd.8.en.html wrote: -y, --reboot seconds
Allow reboot seconds before moving to the discover phase if we have an old lease to use. Allow reboot seconds before starting fallback states from the discover phase. IPv4LL is started when the first reboot timeout is reached. The default is 5 seconds. A setting of 0 seconds causes dhcpcd to skip the reboot phase and go straight into discover.
Something happens before renewal time I suppose. Perhaps the switch drops the ball, or there is a firewall active on the Pi (most likely w/ docker).
Or the dhcp server refuses to offer a lease (?)

A little bit of exploration within the DHCP server and the switch would probably help.
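
The 5 seconds can be read straight off the log in the first post. A rough sketch that computes the gap between the rebind attempt and the IPv4LL fallback follows. The two log lines are copied from that post; the year 2021 is an assumption added only because syslog timestamps do not carry one, and the helper names are made up for the example.

```python
from datetime import datetime

# Two lines copied from the log in the first post.
LOG = [
    "Sep  8 04:01:19 raspberrypi dhcpcd[449]: eth0: rebinding lease of 192.168.50.175",
    "Sep  8 04:01:24 raspberrypi dhcpcd[449]: eth0: probing for an IPv4LL address",
]

def timestamp(line: str, year: int = 2021) -> datetime:
    """Parse the 15-character syslog timestamp at the start of a line."""
    # Syslog omits the year, so an assumed one is supplied to build a full date.
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")

def first_match(lines, text: str) -> str:
    """Return the first line that contains the given text."""
    return next(line for line in lines if text in line)

rebind = timestamp(first_match(LOG, "rebinding lease"))
fallback = timestamp(first_match(LOG, "probing for an IPv4LL address"))
elapsed = (fallback - rebind).total_seconds()
print(f"IPv4LL fallback started {elapsed:.0f} s after the rebind attempt")
```

The gap equals the default reboot timeout from the man page, which suggests the "lease expired" message comes from that timeout and not from the lease time configured on the DHCP server.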
"S'il n'y a pas de solution, c'est qu'il n'y a pas de problème." Les Shadoks, J. Rouxel

da7ashark
Posts: 9
Joined: Fri Jan 08, 2021 3:50 am

Re: RaspberryPi 4B drops from network

Tue Sep 14, 2021 5:53 pm

I could not access our switch; I have to wait for our IT department to get me the credentials, but a problem at the switch seems like the most likely cause. I also noticed that the network drops were most consistent after the scheduled restarts. So maybe there is a window of time during which the Pi is rebinding the lease and the switch is unresponsive, or it has something to do with the DHCP settings on the Pi or the switch.

I was able to reproduce the same logs and problem by hot-swapping Ethernet cables. If I disconnected one cable, waited a minute or so, and then connected a different cable, the Pi would not have an IP I could reach until I ran the command

Code: Select all

sudo dhcpcd

epoch1970
Posts: 6951
Joined: Thu May 05, 2016 9:33 am
Location: France

Re: RaspberryPi 4B drops from network

Tue Sep 14, 2021 6:30 pm

Well that is a bit puzzling.
I can understand a switch would quarantine a port for a good while until it goes online. But the thing is, dhcpcd retries constantly when it cannot get an IP address. The maximum retry delay is 90 secs IIRC. So the machine should get a DHCP IP address, at least after a little while.

The only reason I can see for an interface to get an IPv4LL address and stick with it, is if dhcpcd has died (or exited). On RaspiOS it doesn't exit by default. I have seen Docker overwhelm dhcpcd and cause it to die. But with the appropriate "denyinterfaces" patterns, that is not an option.

da7ashark
Posts: 9
Joined: Fri Jan 08, 2021 3:50 am

Re: RaspberryPi 4B drops from network

Wed Sep 15, 2021 4:05 pm

Yes, the dhcpcd service gets killed and never comes back. As a hotfix, we wrote a little script, run from crontab, that detects when this problem occurs and simply runs 'sudo dhcpcd'; after that everything works fine and the Pi finds the network again. Something tells me there is a link between dhcpcd going down and the scheduled restart: when there is no scheduled restart, the device stays online consistently. Do you have any other ideas about the root cause? I don't have much information on the scheduled restart, but I think it's just a crontab entry that runs the command 'sudo reboot now'.
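
For reference, a cron watchdog of this kind might look roughly like the sketch below. It is not the actual script mentioned above. It assumes the wired interface is eth0, that the ip tool from iproute2 is installed, and that the job runs from root's crontab so dhcpcd can be started without sudo. The function names and the --run flag are made up for the example.

```python
import ipaddress
import re
import subprocess
import sys

IFACE = "eth0"  # assumption: the wired interface seen in the log

def ipv4_addresses(ip_output: str) -> list:
    """Extract IPv4 addresses from `ip -4 -o addr show` output."""
    return re.findall(r"\binet (\d+\.\d+\.\d+\.\d+)/", ip_output)

def needs_recovery(ip_output: str) -> bool:
    """True when the interface has no address, or only IPv4LL (169.254.x.x) ones."""
    addrs = ipv4_addresses(ip_output)
    return all(ipaddress.ip_address(a).is_link_local for a in addrs)

def check_and_recover() -> None:
    """Query the interface and start dhcpcd if it has no routable address."""
    out = subprocess.run(
        ["ip", "-4", "-o", "addr", "show", "dev", IFACE],
        capture_output=True, text=True, check=False,
    ).stdout
    if needs_recovery(out):
        # Same remedy as running `sudo dhcpcd` by hand.
        subprocess.run(["dhcpcd"], check=False)

# Only act when explicitly asked to, e.g. from a cron entry passing --run.
if __name__ == "__main__" and "--run" in sys.argv:
    check_and_recover()
```

This only treats the symptom: it restarts dhcpcd after the fact and says nothing about why the daemon went away in the first place.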

epoch1970
Posts: 6951
Joined: Thu May 05, 2016 9:33 am
Location: France

Re: RaspberryPi 4B drops from network

Wed Sep 15, 2021 5:33 pm

I have no idea. For me dhcpcd is quite reliable. The first time I saw it quit unexpectedly (segv) was when I installed Docker and used software defined networks.
You can look in /var/log/syslog if you haven't done that yet. Adding "debug" in dhcpcd.conf might increase traces a bit.
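
A small sketch of that syslog check: it keeps only dhcpcd's own messages, so the last thing the daemon logged before going quiet is easy to spot. The sample lines are taken from the log in the first post, and the function name is just for illustration. Reading /var/log/syslog itself may need root.

```python
import re

# Matches the "dhcpcd[PID]:" tag that syslog puts in front of each message.
DHCPCD_TAG = re.compile(r"\bdhcpcd\[(\d+)\]: (.*)$")

def dhcpcd_messages(lines):
    """Return (pid, message) pairs for the lines logged by dhcpcd."""
    found = []
    for line in lines:
        match = DHCPCD_TAG.search(line.rstrip("\n"))
        if match:
            found.append((int(match.group(1)), match.group(2)))
    return found

sample = [
    "Sep  8 04:01:29 raspberrypi avahi-daemon[446]: New relevant interface eth0.IPv4 for mDNS.",
    "Sep  8 04:01:29 raspberrypi dhcpcd[449]: eth0: adding route to 169.254.0.0/16",
    "Sep  8 04:01:29 raspberrypi dhcpcd[449]: eth0: adding default route",
]

messages = dhcpcd_messages(sample)
last_pid, last_message = messages[-1]
print(f"last dhcpcd message (pid {last_pid}): {last_message}")
```

To run it against the real log, pass an open file instead of the sample list, for example dhcpcd_messages(open("/var/log/syslog")). A change in PID between entries would show that dhcpcd was restarted in between.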
"S'il n'y a pas de solution, c'est qu'il n'y a pas de problème." Les Shadoks, J. Rouxel
