bls
Posts: 289
Joined: Mon Oct 22, 2018 11:25 pm
Location: Seattle, WA

Need help debugging wlan0 'carrier lost' problem

Sun Feb 17, 2019 12:13 am

[There is another topic on this issue (https://www.raspberrypi.org/forums/view ... t#p1393787), but starting a new thread since this is a different scenario and potentially a different issue]

Configuration
  • Raspberry Pi 3B+
  • Raspbian 2018-11-13-{lite|full} (see problem description)
  • eero WiFi network
PROBLEM: On Raspbian 2018-11-13-stretch-full, WiFi works correctly, but on 2018-11-13-stretch-lite, WiFi network bounces every 6 minutes. A bounce looks like:

Code: Select all

Feb 16 14:51:49 rpi3plus dhcpcd[479]: wlan0: carrier lost
Feb 16 14:51:49 rpi3plus dhcpcd[479]: wlan0: deleting address fe80::ccc:fcd7:e5cf:9cf7
Feb 16 14:51:49 rpi3plus avahi-daemon[328]: Withdrawing address record for fe80::ccc:fcd7:e5cf:9cf7 on wlan0.
Feb 16 14:51:49 rpi3plus avahi-daemon[328]: Leaving mDNS multicast group on interface wlan0.IPv6 with address fe80::ccc:fcd7:e5cf:9cf7.
Feb 16 14:51:49 rpi3plus avahi-daemon[328]: Interface wlan0.IPv6 no longer relevant for mDNS.
Feb 16 14:51:49 rpi3plus dhcpcd[479]: wlan0: deleting default route via 192.168.92.1
Feb 16 14:51:49 rpi3plus dhcpcd[479]: wlan0: deleting route to 192.168.92.0/24
Feb 16 14:51:49 rpi3plus avahi-daemon[328]: Withdrawing address record for 192.168.92.202 on wlan0.
Feb 16 14:51:49 rpi3plus avahi-daemon[328]: Leaving mDNS multicast group on interface wlan0.IPv4 with address 192.168.92.202.
Feb 16 14:51:49 rpi3plus avahi-daemon[328]: Interface wlan0.IPv4 no longer relevant for mDNS.
Feb 16 14:51:52 rpi3plus dhcpcd[479]: wlan0: carrier acquired
Feb 16 14:51:52 rpi3plus dhcpcd[479]: wlan0: IAID eb:98:c7:cb
Feb 16 14:51:52 rpi3plus dhcpcd[479]: wlan0: adding address fe80::ccc:fcd7:e5cf:9cf7
Feb 16 14:51:52 rpi3plus dhcpcd[479]: wlan0: soliciting an IPv6 router
Feb 16 14:51:52 rpi3plus dhcpcd[479]: wlan0: rebinding lease of 192.168.92.202
Feb 16 14:51:52 rpi3plus dhcpcd[479]: wlan0: probing address 192.168.92.202/24
Feb 16 14:51:53 rpi3plus avahi-daemon[328]: Joining mDNS multicast group on interface wlan0.IPv6 with address fe80::ccc:fcd7:e5cf:9cf7.
Feb 16 14:51:53 rpi3plus avahi-daemon[328]: New relevant interface wlan0.IPv6 for mDNS.
Feb 16 14:51:53 rpi3plus avahi-daemon[328]: Registering new address record for fe80::ccc:fcd7:e5cf:9cf7 on wlan0.*.
Feb 16 14:51:58 rpi3plus dhcpcd[479]: wlan0: leased 192.168.92.202 for 86400 seconds
Feb 16 14:51:58 rpi3plus avahi-daemon[328]: Joining mDNS multicast group on interface wlan0.IPv4 with address 192.168.92.202.
Feb 16 14:51:58 rpi3plus dhcpcd[479]: wlan0: adding route to 192.168.92.0/24
Feb 16 14:51:58 rpi3plus dhcpcd[479]: wlan0: adding default route via 192.168.92.1
Feb 16 14:51:58 rpi3plus avahi-daemon[328]: New relevant interface wlan0.IPv4 for mDNS.
Feb 16 14:51:58 rpi3plus avahi-daemon[328]: Registering new address record for 192.168.92.202 on wlan0.IPv4.
Feb 16 14:52:05 rpi3plus dhcpcd[479]: wlan0: no IPv6 Routers available
Repeating for emphasis, using Raspbian 2018-11-13-stretch-full on the same physical Pi3B+ on the same network in exactly the same phyiscal location, the WiFi does NOT bounce. But, using 2018-11-13-stretch-lite, it bounces every 6 minutes. Of course, I greatly prefer using stretch-lite, so I'm hoping to chase this problem into submission ;)

What I've tried
  • Looked through file system for obvious network-related differences. None found.
  • Ensured /etc/wpa_supplicant/wpa_supplicant.conf identical on both systems
  • Validated that the Pi connects to the same eero station and WiFi can carry traffic in both scenarios. But the lite system WiFi bounces.
  • Tried iw dev wlan0 set power_save off. No joy.
  • Looked at output of various network and wifi commands for obvious differences between lite and full. None found.
    • iwconfig
    • iw phy0 channels
    • iw dev
    • iw dev wlan0 info
    • iw dev wlan0 link
    • iw dev wlan0 station dump
    The only difference I found was that the device sometimes connects on a different channel (per iw dev wlan0 info).

What rock have I not turned over in my search? Thanks for your guidance and suggestions.

epoch1970
Posts: 3875
Joined: Thu May 05, 2016 9:33 am
Location: Paris, France

Re: Need help debugging wlan0 'carrier lost' problem

Sun Feb 17, 2019 10:18 am

With the full version, are you running with keyboard and or mouse attached to the Pi?
It could be a random number generation issue.
Running wpa_supplicant in debug mode should give more insight on what is really happening.
"S'il n'y a pas de solution, c'est qu'il n'y a pas de problème." Les Shadoks, J. Rouxel

bls
Posts: 289
Joined: Mon Oct 22, 2018 11:25 pm
Location: Seattle, WA

Re: Need help debugging wlan0 'carrier lost' problem

Tue Feb 19, 2019 12:24 am

As I mentioned, the hardware is exactly the same (kb and mouse plugged into exactly the same Pi hardware). So, if there was some issue with the keyboard or mouse, seems like both lite and full should experience it, shouldn't they?

Thanks for the wpa_supplicant log suggestion. I grabbed logs from both scenarios (lite and full). At the outset, diffing the logs (ignoring the time field) yields mostly equivalence, until just before the carrier drops in the lite scenaro. At that point, I observe the following:

Full system

Code: Select all

Feb 18 11:12:47: CTRL_IFACE monitor attached /tmp/dhcpcd-pi/libdhcpcd-wpa-963.3\x00
Feb 18 11:12:47: wlan0: Control interface command 'STATUS'
Feb 18 11:12:47: EAPOL: EAP Session-Id not available
Feb 18 11:12:47: wlan0: Control interface command 'BSS 0'
Feb 18 11:12:47: wlan0: Control interface command 'BSS 1'
Feb 18 11:12:47: wlan0: Control interface command 'BSS 2'
Feb 18 11:12:47: wlan0: Control interface command 'BSS 3'
Feb 18 11:12:47: wlan0: Control interface command 'BSS 4'
Feb 18 11:12:47: wlan0: Control interface command 'BSS 5'
Feb 18 11:12:47: wlan0: Control interface command 'BSS 6'
Feb 18 11:12:47: wlan0: Control interface command 'BSS 7'
Feb 18 11:12:47: wlan0: Control interface command 'BSS 8'
Feb 18 11:12:47: wlan0: Control interface command 'BSS 9'
Feb 18 11:12:47: wlan0: Control interface command 'BSS 10'
Feb 18 11:12:47: wlan0: Control interface command 'BSS 11'
Feb 18 11:12:47: wlan0: Control interface command 'STATUS'

But on the Lite system, I see

Code: Select all

Feb 17 20:26:23: wlan0: BSS: Remove id 0 BSSID 4c:01:43:69:21:1f SSID 'OurFiNotYourFi' due to wpa_bss_flush_by_age
Feb 17 20:26:23: wlan0: BSS: Remove id 2 BSSID 4c:01:43:65:49:f6 SSID 'myssid' due to wpa_bss_flush_by_age
Feb 17 20:26:23: wlan0: BSS: Remove id 3 BSSID 4c:01:43:65:49:f3 SSID '91b55' due to wpa_bss_flush_by_age
Feb 17 20:26:23: wlan0: BSS: Remove id 4 BSSID 4c:01:43:65:49:ff SSID 'OurFiNotYourFi' due to wpa_bss_flush_by_age
Feb 17 20:26:23: wlan0: BSS: Remove id 5 BSSID 4c:01:43:64:5f:ff SSID 'OurFiNotYourFi' due to wpa_bss_flush_by_age
Feb 17 20:26:23: wlan0: BSS: Remove id 6 BSSID 4c:01:43:64:5f:f6 SSID 'myssid' due to wpa_bss_flush_by_age
Feb 17 20:26:23: wlan0: BSS: Remove id 7 BSSID a0:63:91:cc:43:3b SSID 'Chapman' due to wpa_bss_flush_by_age
Feb 17 20:26:23: wlan0: BSS: Remove id 8 BSSID 44:1c:12:02:4c:01 SSID 'Comcast Box' due to wpa_bss_flush_by_age
Feb 17 20:26:23: wlan0: BSS: Remove id 9 BSSID 00:1f:90:e4:cf:97 SSID '30CL0' due to wpa_bss_flush_by_age
Feb 17 20:26:23: wlan0: BSS: Remove id 10 BSSID 4c:01:43:65:49:f9 SSID '"' due to wpa_bss_flush_by_age
Feb 17 20:26:23: wlan0: BSS: Remove id 11 BSSID 44:1c:12:02:4c:03 SSID 'xfinitywifi' due to wpa_bss_flush_by_age
Feb 17 20:29:00: nl80211: Event message available
Feb 17 20:29:00: nl80211: Drv Event 48 (NL80211_CMD_DISCONNECT) received for wlan0
Feb 17 20:29:00: nl80211: Disconnect event

Thinking that maybe there was a driver or firmware issue, I looked at that. I noticed that the file /lib/formware/brcm/BCM43450C0.hcd was different between the two systems. (Not sure how that could happen since they were both freshly installed and upgraded???). I copied the file from Full to Lite and rebooted. No joy.

I'm not sure where to look next, but thinking something with the driver or dhcpcd(?). I think wpa_supplicant has been exonerated. Thanks for the suggestion...appreciate any further thoughts on this.

bls
Posts: 289
Joined: Mon Oct 22, 2018 11:25 pm
Location: Seattle, WA

Re: Need help debugging wlan0 'carrier lost' problem

Tue Feb 19, 2019 12:58 am

Looks like the file /tmp//tmp/libdhcpcd-wpa might be created by lxpanel? I also notice that the file is own by pi.pi, which to me, perhaps wrongly, indicates that some user, non-system code is creating it. If it's something used by the GUI, it's quite strange that it would cause the WiFi network to think carrier has dropped.


Return to “Networking and servers”