I have a 4TB WD Green drive attached to a Raspberry Pi 4 running Raspbian Buster. The system boots, the drive mounts from fstab, and all is good for a while. Then at some point I get I/O errors, the drive goes offline, and it comes back under a different device name (e.g. it starts as sda and comes back as sdb). This breaks the mount, which is still shown as mounted on the old device name, and any attempt to access it gives an I/O error. The only fix short of rebooting is to stop nfs-server and smbd, remount on the new device, and then restart nfs-server and smbd. After that it works fine until the next time.
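The manual recovery described above can be sketched as a small shell script. This is only an illustration, not the exact commands used: the `run` and `remount_pub` helper names and the `DRY_RUN` switch are made up for the example, the lazy unmount (`umount -l`) is an assumption about how the stale mount gets detached, and `nfs-server` and `smbd` are assumed to be the systemd unit names.

```shell
#!/bin/sh
# Sketch of the manual recovery: stop the file servers, remount, restart.
# Assumptions (not confirmed by the post): systemd units named nfs-server
# and smbd, and a lazy unmount to drop the stale mount.
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

remount_pub() {
    run systemctl stop nfs-server
    run systemctl stop smbd
    run umount -l /pub      # detach the mount still tied to the old device name
    run mount /pub          # uses the fstab entry for /pub to find the drive again
    run systemctl start nfs-server
    run systemctl start smbd
}
```

Setting `DRY_RUN=1` and then calling `remount_pub` prints the six commands in order without touching the system. Run for real, it needs root.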
This is my fstab line:
LABEL=/BIGPUB /pub ext4 defaults,nofail,rw 0 0
And this is what appears in dmesg output:
[ 1.850699] sd 0:0:0:0: [sda] 976746240 4096-byte logical blocks: (4.00 TB/3.64 TiB)
[ 1.851993] sd 0:0:0:0: [sda] Write Protect is off
[ 1.852007] sd 0:0:0:0: [sda] Mode Sense: 47 00 10 08
[ 1.853367] sd 0:0:0:0: [sda] No Caching mode page found
[ 1.853450] sd 0:0:0:0: [sda] Assuming drive cache: write through
[ 1.882558] sda: sda1
[ 1.885503] sd 0:0:0:0: [sda] Attached SCSI disk
[ 13.966929] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
[ 687.723049] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
[70499.477489] print_req_error: I/O error, dev sda, sector 22492864
[70504.466441] sd 1:0:0:0: [sdb] 976746240 4096-byte logical blocks: (4.00 TB/3.64 TiB)
[70504.467307] sd 1:0:0:0: [sdb] Write Protect is off
[70504.467325] sd 1:0:0:0: [sdb] Mode Sense: 47 00 10 08
[70504.468077] sd 1:0:0:0: [sdb] No Caching mode page found
[70504.468092] sd 1:0:0:0: [sdb] Assuming drive cache: write through
[70504.507030] sdb: sdb1
[70504.511107] sd 1:0:0:0: [sdb] Attached SCSI disk
[70505.532417] EXT4-fs error (device sda1): ext4_find_entry:1455: inode #195821829: comm smbd: reading directory lblock 0
[70505.532538] EXT4-fs error (device sda1): ext4_find_entry:1455: inode #195821829: comm smbd: reading directory lblock 0
[70505.535216] EXT4-fs error (device sda1): ext4_find_entry:1455: inode #195821829: comm smbd: reading directory lblock 0
[70505.535268] EXT4-fs error (device sda1): ext4_find_entry:1455: inode #195821829: comm smbd: reading directory lblock 0
[70505.565543] EXT4-fs error (device sda1): ext4_find_entry:1455: inode #195821829: comm smbd: reading directory lblock 0
[70505.565595] EXT4-fs error (device sda1): ext4_find_entry:1455: inode #195821829: comm smbd: reading directory lblock 0
[70505.582608] EXT4-fs error (device sda1): ext4_find_entry:1455: inode #195821829: comm smbd: reading directory lblock 0
[70505.582655] EXT4-fs error (device sda1): ext4_find_entry:1455: inode #195821829: comm smbd: reading directory lblock 0
[70537.539171] EXT4-fs error (device sda1): ext4_find_entry:1455: inode #195821829: comm smbd: reading directory lblock 0
[70537.539226] EXT4-fs error (device sda1): ext4_find_entry:1455: inode #195821829: comm smbd: reading directory lblock 0
[70538.007080] EXT4-fs error (device sda1): ext4_find_entry:1455: inode #195821767: comm smbd: reading directory lblock 0
[70538.007132] EXT4-fs error (device sda1): ext4_find_entry:1455: inode #195821767: comm smbd: reading directory lblock 0
[70538.009140] EXT4-fs error (device sda1): ext4_find_entry:1455: inode #195821767: comm smbd: reading directory lblock 0
[70538.009183] EXT4-fs error (device sda1): ext4_find_entry:1455: inode #195821767: comm smbd: reading directory lblock 0
[70567.167770] EXT4-fs error (device sda1): ext4_find_entry:1455: inode #195821767: comm smbd: reading directory lblock 0
[70567.167826] EXT4-fs error (device sda1): ext4_find_entry:1455: inode #195821767: comm smbd: reading directory lblock 0
[70567.189071] EXT4-fs error (device sda1): ext4_find_entry:1455: inode #195821767: comm smbd: reading directory lblock 0
[70567.189123] EXT4-fs error (device sda1): ext4_find_entry:1455: inode #195821767: comm smbd: reading directory lblock 0
[70577.867979] EXT4-fs warning (device sda1): dx_probe:759: inode #2: lblock 0: comm ls: error -5 reading directory block
[70684.159996] Buffer I/O error on dev sda1, logical block 488144896, lost sync page write
[70684.160009] JBD2: Error -5 detected when updating journal superblock for sda1-8.
[70684.160017] Aborting journal on device sda1-8.
[70684.160029] Buffer I/O error on dev sda1, logical block 488144896, lost sync page write
[70684.160039] JBD2: Error -5 detected when updating journal superblock for sda1-8.
[70709.126662] EXT4-fs (sdb1): recovery complete
[70709.126871] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: (null)
The lines starting at timestamp 70505 are attempts to access the old device (sda1). The last two lines are from my remount on the new device (sdb1).
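A minimal sketch of how this stale state could be detected is to compare the device the mount table reports for /pub with the device the filesystem label currently resolves to. The helper names `is_stale` and `check_pub` are hypothetical. It assumes `findmnt` and `blkid` from util-linux are available and that `blkid -L` can see the drive (it may need root).

```shell
#!/bin/sh
# Sketch: report whether /pub is still attached to a device name that the
# filesystem label no longer resolves to. Helper names are hypothetical.

# Pure comparison, kept separate so it can be exercised without a real disk.
is_stale() {
    mounted_dev=$1   # what the mount table shows, e.g. /dev/sda1
    label_dev=$2     # where the label currently lives, e.g. /dev/sdb1
    [ -n "$mounted_dev" ] && [ -n "$label_dev" ] && [ "$mounted_dev" != "$label_dev" ]
}

check_pub() {
    mounted=$(findmnt -n -o SOURCE /pub)   # device recorded for the mount point
    current=$(blkid -L /BIGPUB)            # device currently carrying the label
    if is_stale "$mounted" "$current"; then
        echo "stale: /pub mounted from $mounted, label now on $current"
    else
        echo "ok: /pub mounted from ${mounted:-nothing}"
    fi
}
```

Calling `check_pub` from cron or a systemd timer would at least flag the moment the drive re-enumerates, which could then be matched against the dmesg timestamps.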
Any clues how I can troubleshoot this?
Thank you,
--Greg