Raspberry Pi 3 B+ experiencing tons of kernel oops when compiling over NFS

Posted: Fri Mar 30, 2018 1:10 pm
by darksky
I have a RPi 3B+ running Arch ARM (armv7h). When I try compiling the kernel package, dmesg fills up with kernel oops messages like the ones below. The compilation eventually freezes, but the system itself stays responsive. When I monitor the temps on the 3B+, they don't exceed 72C, so I don't think it's thermal related. I can also run cpuburn-a53 for hours without instability (temps around 85C). I believe that the software (distro) on the micro SD card is NOT to blame because, if I put the same micro SD card into a RPi3 or RPi2, I can compile without error.

I am compiling on an NFS-mounted partition (/scratch), so my hypothesis is that the problems are related to the network driver.

Any thoughts are welcome.

Code: Select all

[ 2455.534291] INFO: task ld:24879 blocked for more than 120 seconds.
[ 2455.538489]       Tainted: G         C      4.14.31-1-ARCH #1
[ 2455.542688] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2455.550990] ld              D    0 24879  24804 0x00000000
[ 2455.555379] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 2455.559662] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 2455.563990] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 2455.572326] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 2455.580865] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 2455.589272] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 2455.597837] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 2455.606295] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 2455.610675] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 2455.614999] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 2547.695051] nfs: server ease not responding, still trying
[ 2548.735626] nfs: server ease not responding, still trying
[ 2548.768826] nfs: server ease OK
[ 2548.796748] nfs: server ease OK
[ 2701.296329] INFO: task ld:24879 blocked for more than 120 seconds.
[ 2701.300214]       Tainted: G         C      4.14.31-1-ARCH #1
[ 2701.304061] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2701.311642] ld              D    0 24879  24804 0x00000000
[ 2701.315536] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 2701.319458] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 2701.323355] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 2701.330878] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 2701.338447] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 2701.345916] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 2701.353469] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 2701.360953] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 2701.364740] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 2701.368593] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 2772.976750] nfs: server ease not responding, still trying
[ 2774.331264] nfs: server ease OK
[ 2947.057892] INFO: task ld:24879 blocked for more than 120 seconds.
[ 2947.061907]       Tainted: G         C      4.14.31-1-ARCH #1
[ 2947.066031] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2947.074107] ld              D    0 24879  24804 0x00000000
[ 2947.078244] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 2947.081483] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 2947.084348] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 2947.090033] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 2947.095898] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 2947.101751] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 2947.107513] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 2947.113352] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 2947.116350] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 2947.119289] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 2998.258064] nfs: server ease not responding, still trying
[ 2999.352463] nfs: server ease OK
[ 3192.819075] INFO: task ld:24879 blocked for more than 120 seconds.
[ 3192.823185]       Tainted: G         C      4.14.31-1-ARCH #1
[ 3192.827330] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3192.835447] ld              D    0 24879  24804 0x00000000
[ 3192.839604] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 3192.842832] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 3192.845750] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 3192.851476] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 3192.857318] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 3192.863126] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 3192.868837] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 3192.874594] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 3192.877558] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 3192.880466] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 3223.539141] nfs: server ease not responding, still trying
[ 3224.579687] nfs: server ease not responding, still trying
[ 3224.612015] nfs: server ease OK
[ 3224.626000] nfs: server ease OK
[ 3438.580109] INFO: task objcopy:24916 blocked for more than 120 seconds.
[ 3438.583905]       Tainted: G         C      4.14.31-1-ARCH #1
[ 3438.587697] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3438.595231] objcopy         D    0 24916  24912 0x00000000
[ 3438.599109] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 3438.603019] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 3438.606896] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 3438.614435] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 3438.622018] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 3438.629666] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 3438.637259] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 3438.644894] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 3438.648704] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 3438.652599] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 3448.820081] nfs: server ease not responding, still trying
[ 3450.148878] nfs: server ease OK
[ 3674.100906] nfs: server ease not responding, still trying
[ 3675.141506] nfs: server ease not responding, still trying
[ 3675.174279] nfs: server ease OK
[ 3675.202048] nfs: server ease OK
[ 3807.221430] INFO: task objcopy:24916 blocked for more than 120 seconds.
[ 3807.225253]       Tainted: G         C      4.14.31-1-ARCH #1
[ 3807.229007] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3807.236459] objcopy         D    0 24916  24912 0x00000000
[ 3807.240428] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 3807.244393] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 3807.248202] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 3807.255540] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 3807.263030] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 3807.270494] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 3807.277992] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 3807.285364] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 3807.289292] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 3807.293169] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 3899.381659] nfs: server ease not responding, still trying
[ 3900.422241] nfs: server ease not responding, still trying
[ 3900.461112] nfs: server ease OK
[ 3900.474540] nfs: server ease OK
[ 4011.372575] nf_conntrack: default automatic helper assignment has been turned off for security reasons and CT-based  firewall rule not found. Use the iptables CT target to attach helpers instead.
[ 4052.982250] INFO: task as:25088 blocked for more than 120 seconds.
[ 4052.986324]       Tainted: G         C      4.14.31-1-ARCH #1
[ 4052.990389] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4052.998504] as              D    0 25088  25086 0x00000000
[ 4053.002785] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 4053.006065] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 4053.008960] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 4053.014564] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 4053.020330] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 4053.026110] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 4053.031705] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 4053.037527] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 4053.040507] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 4053.043431] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 4134.902727] nfs: server ease not responding, still trying
[ 4135.997194] nfs: server ease OK
[ 4529.145918] nfs: server ease not responding, still trying
[ 4529.145923] nfs: server ease not responding, still trying
[ 4529.145940] nfs: server ease not responding, still trying
[ 4529.145978] nfs: server ease not responding, still trying
[ 4529.146011] nfs: server ease not responding, still trying
[ 4529.146028] nfs: server ease not responding, still trying
[ 4529.146044] nfs: server ease not responding, still trying
[ 4538.105971] nfs: server ease not responding, still trying
[ 4538.109131] nfs: server ease not responding, still trying
[ 4544.506128] INFO: task gcc:2854 blocked for more than 120 seconds.
[ 4544.509193]       Tainted: G         C      4.14.31-1-ARCH #1
[ 4544.512157] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4544.517957] gcc             D    0  2854   2852 0x00000000
[ 4544.520871] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 4544.523830] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 4544.526762] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 4544.530980] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 4544.534883] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 4544.538880] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 4544.542873] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 4544.546949] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 4544.549173] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 4544.551445] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 4571.406855] nfs: server ease OK
[ 4571.406996] nfs: server ease OK
[ 4571.407031] nfs: server ease OK
[ 4571.407691] nfs: server ease OK
[ 4571.407701] nfs: server ease OK
[ 4571.410844] nfs: server ease OK
[ 4571.410877] nfs: server ease OK
[ 4571.411761] nfs: server ease OK
[ 4571.411810] nfs: server ease OK
[ 4790.267644] INFO: task ld:7630 blocked for more than 120 seconds.
[ 4790.270597]       Tainted: G         C      4.14.31-1-ARCH #1
[ 4790.273588] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4790.279563] ld              D    0  7630   7628 0x00000000
[ 4790.282558] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 4790.285531] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 4790.288488] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 4790.294136] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 4790.299855] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 4790.305556] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 4790.311366] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 4790.317112] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 4790.320380] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 4790.323699] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 4790.330500] INFO: task ld:7636 blocked for more than 120 seconds.
[ 4790.334181]       Tainted: G         C      4.14.31-1-ARCH #1
[ 4790.338097] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4790.346223] ld              D    0  7636   7633 0x00000000
[ 4790.350304] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 4790.354463] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 4790.358593] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 4790.366494] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 4790.374744] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 4790.383021] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 4790.391236] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 4790.399371] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 4790.403607] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 4790.407831] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
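
A log like the one above is easier to compare between boards once it is condensed. The following is a minimal sketch in Python, not something from the original post: it counts the "blocked for more than 120 seconds" reports per command and pairs each first "not responding" message with the next "OK" for the same server. The function name `summarize` and the regular expressions are assumptions based only on the message formats visible in this dump.

```python
import re
from collections import Counter

# Patterns for the two message types seen in the dmesg dump above.
HUNG = re.compile(r"\[\s*(\d+\.\d+)\] INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")
NFS_DOWN = re.compile(r"\[\s*(\d+\.\d+)\] nfs: server (\S+) not responding")
NFS_OK = re.compile(r"\[\s*(\d+\.\d+)\] nfs: server (\S+) OK")


def summarize(lines):
    """Count hung-task reports per command and collect NFS outage windows."""
    hung = Counter()
    outages = []      # (server, start, end) tuples, timestamps in seconds
    down_since = {}   # server -> timestamp of the first unanswered message
    for line in lines:
        m = HUNG.search(line)
        if m:
            hung[m.group(2)] += 1
            continue
        m = NFS_DOWN.search(line)
        if m:
            # Only the first "not responding" of an outage sets the start time.
            down_since.setdefault(m.group(2), float(m.group(1)))
            continue
        m = NFS_OK.search(line)
        if m and m.group(2) in down_since:
            start = down_since.pop(m.group(2))
            outages.append((m.group(2), start, float(m.group(1))))
    return hung, outages


# A few lines copied from the log above, used as a small demonstration.
SAMPLE = [
    "[ 2455.534291] INFO: task ld:24879 blocked for more than 120 seconds.",
    "[ 2547.695051] nfs: server ease not responding, still trying",
    "[ 2548.735626] nfs: server ease not responding, still trying",
    "[ 2548.768826] nfs: server ease OK",
    "[ 2548.796748] nfs: server ease OK",
]

if __name__ == "__main__":
    hung, outages = summarize(SAMPLE)
    for name, count in hung.most_common():
        print(f"{name}: {count} hung-task report(s)")
    for server, start, end in outages:
        print(f"{server}: unresponsive for {end - start:.2f} s starting at {start:.2f}")
```

To run it against a real log, pass the lines of a saved dmesg capture to `summarize` instead of `SAMPLE`. Repeated "OK" lines after an outage has already been closed are ignored.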

Re: Raspberry Pi 3 B+ experiencing tons of kernel oops when compiling over NFS

Posted: Fri Mar 30, 2018 1:59 pm
by PeterO
Can you run "rpi-update" on Arch? I believe the very latest kernel/firmware have updates that might help you.
PeterO

Re: Raspberry Pi 3 B+ experiencing tons of kernel oops when compiling over NFS

Posted: Fri Mar 30, 2018 2:34 pm
by darksky
Yes, the system is up-to-date. I compiled 4.14.31 myself (distro is currently on 4.14.30). Here are the relevant versions of the firmware packages I have:

Code: Select all

% pacman -Qs firmware
local/firmware-raspberrypi 3-1
    Additional firmware for Raspberry Pi
local/linux-firmware 20180314.4c0bf11-1
    Firmware files for Linux
local/raspberrypi-firmware 20180207-1
    Firmware tools, libraries, and headers for Raspberry Pi
EDIT: I just updated raspberrypi-firmware to a build based on this commit (https://github.com/raspberrypi/firmware ... 98ec11a6e7). Let's see if that helps.

Re: Raspberry Pi 3 B+ experiencing tons of kernel oops when compiling over NFS

Posted: Fri Mar 30, 2018 3:24 pm
by darksky
Nope... even with the latest firmware:

Code: Select all

[ 2578.410282] INFO: task grep:25002 blocked for more than 120 seconds.
[ 2578.410441]       Tainted: G         C      4.14.31-1-ARCH #1
[ 2578.410572] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2578.410748] grep            D    0 25002  25000 0x00000000
[ 2578.410923] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 2578.411106] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 2578.411281] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 2578.411471] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 2578.411683] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 2578.411929] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 2578.416061] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 2578.424238] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 2578.428524] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 2578.432736] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 2624.491682] nfs: server ease not responding, still trying
[ 2625.532267] nfs: server ease not responding, still trying
[ 2625.564173] nfs: server ease OK
[ 2625.591905] nfs: server ease OK

Re: Raspberry Pi 3 B+ experiencing tons of kernel oops when compiling over NFS

Posted: Fri Mar 30, 2018 8:16 pm
by darksky
An easy way to trigger this bug (if you don't want to try compiling the kernel package) is to simply use `dd` to write out from `/dev/zero` to the NFS mount. For example on my RPi3 B+:

Code: Select all

# mount ease:/scratch /scratch-nfs
% dd if=/dev/zero of=/scratch-nfs/fill bs=4M count=1000 status=progress
964689920 bytes (965 MB, 920 MiB) copied, 149 s, 6.5 MB/s

<<< it froze up after about 965 MB written >>>
<<< In dmesg I get another server not responding error >>>

[ 5112.824818] nfs: server ease not responding, still trying
[ 5149.707808] nfs: server ease OK
Now, if I move the micro SD card to a RPi 2 I have lying around (same network cable, same power supply) and repeat the commands, everything works as expected. I think that helps rule out the NFS server, network hardware, etc. as the culprit.

Code: Select all

# mount ease:/scratch /scratch-nfs
% dd if=/dev/zero of=/scratch-nfs/fill bs=4M count=1000 status=progress
4194304000 bytes (4.2 GB, 3.9 GiB) copied, 346 s, 12.1 MB/s
1000+0 records in
1000+0 records out
4194304000 bytes (4.2 GB, 3.9 GiB) copied, 357.595 s, 11.7 MB/s
dd if=/dev/zero of=/scratch-nfs/fill bs=4M count=1000 status=progress  0.00s user 24.47s system 5% cpu 8:03.99 total
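
The `dd` runs above only show where the write stopped. A rough Python sketch along the following lines could show how long each block takes, which makes a stall visible as soon as it starts. It is an illustration under assumptions, not part of the original test: the function name `timed_write`, the 5-second stall threshold, and the per-block `fsync` are all choices made here. The `fsync` is used because the traces earlier in the thread show tasks blocked in `nfs_file_fsync`.

```python
import os
import tempfile
import time


def timed_write(path, block_size=4 * 1024 * 1024, count=16, stall_seconds=5.0):
    """Write `count` zero-filled blocks to `path`, fsync after each block,
    and return (per-block durations, indices of blocks slower than stall_seconds)."""
    block = bytes(block_size)  # zero-filled, like reading from /dev/zero
    durations = []
    with open(path, "wb") as f:
        for _ in range(count):
            start = time.monotonic()
            f.write(block)
            f.flush()
            os.fsync(f.fileno())  # force the block out to the server
            durations.append(time.monotonic() - start)
    stalls = [i for i, d in enumerate(durations) if d > stall_seconds]
    return durations, stalls


if __name__ == "__main__":
    # Harmless local demonstration: four 1 MiB blocks in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "fill")
        durations, stalls = timed_write(target, block_size=1024 * 1024, count=4)
        total = sum(durations)
        print(f"wrote {os.path.getsize(target)} bytes in {total:.3f} s")
        print(f"slowest block: {max(durations):.3f} s, stalled blocks: {stalls}")
```

To mirror the test above, point it at the NFS mount, for example `timed_write("/scratch-nfs/fill", count=1000)`. Syncing after every block is slower than plain `dd`, so the throughput figures are not directly comparable with the ones shown here.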
I opened up #2482 on GitHub with all this info, but feel free to continue the discussion here:

Re: Raspberry Pi 3 B+ experiencing tons of kernel oops when compiling over NFS

Posted: Thu Apr 11, 2019 8:20 am
by oliverlj
Hello,

I don't know if my problem is related, but I set up my root folder on an NFS share. I've set up swap on this NFS share as well. I use an application that eats a lot of memory (970 MB of RAM). If I run sudo apt update, my Raspberry Pi hangs (no log writing, no SSH access).

I have moved the swap to an SD card and I no longer see the problem.

I see nothing in the kernel log (4.19-32). How can I debug my issue?

Re: Raspberry Pi 3 B+ experiencing tons of kernel oops when compiling over NFS

Posted: Thu Apr 11, 2019 10:54 am
by jamesh
I believe swap over NFS requires some more steps over and above a simple mount.

viewtopic.php?t=7720

TBH swap/NFS doesn't sound like a good idea to me anyway.

Re: Raspberry Pi 3 B+ experiencing tons of kernel oops when compiling over NFS

Posted: Thu Apr 11, 2019 7:25 pm
by oliverlj
Thank you, jamesh, for your answer.

I did mount the swap as a loop device, but the result is the same: the Pi hangs.

Re: Raspberry Pi 3 B+ experiencing tons of kernel oops when compiling over NFS

Posted: Fri Apr 12, 2019 2:38 am
by swampdog
My memory may not be correct on this but I *think* I had this problem ages ago. In case it helps..

Code: Select all

[email protected]:~ $ dfh
192.168.1.20:/mnt/nfsd/pi05  63G   45G   16G   75%  /
mmcblk0p1                    43M   22M   21M   52%  /boot
mmcblk0p2                    3.6G  3.1G  382M  89%  /swap

[email protected]:~/usr/src/gcc $ lc -h /swap/dphys 
-rw------- 1 root root 1.9G Aug 31  2018 /swap/dphys

[email protected]:~ $ cat /etc/fstab | grep ^/
/dev/mmcblk0p1  /boot           vfat    defaults          0       2
/dev/mmcblk0p2	/swap		ext4	defaults	0 1

[email protected]:~ $ cat /boot/cmdline.txt
dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 root=/dev/nfs smsc95xx.turbo_mode=N nfsroot=192.168.1.20:/mnt/nfsd/pi05,tcp,vers=3 ip=::::pi05.swampdog::dhcp rootfstype=nfs elevator=deadline rw rootwait
192.168.1.20..

Code: Select all

[[email protected] ~]$ cat /etc/exports | grep pi05
/mnt/nfsd/pi05	pi05.swampdog(rw,sync,no_root_squash,no_subtree_check)

[[email protected] ~]$ cat /etc/redhat-release 
CentOS release 6.9 (Final)
pi05 booting..

Code: Select all

[email protected]:~ $ dmesg | grep -A 5 IP-C
[    7.698964] IP-Config: Got DHCP answer from 192.168.1.8, my address is 192.168.1.35
[    7.710526] IP-Config: Complete:
[    7.717590]      device=eth0, hwaddr=b8:27:eb:d8:01:82, ipaddr=192.168.1.35, mask=255.255.255.192, gw=192.168.1.1
[    7.732176]      host=pi05, domain=swampdog, nis-domain=swampdog
[    7.742456]      bootserver=192.168.1.4, rootserver=192.168.1.20, rootpath=     nameserver0=192.168.1.8
[    7.768060] VFS: Mounted root (nfs filesystem) on device 0:15.
[    7.779252] devtmpfs: mounted
DNS/DHCP server..

Code: Select all

[email protected]:~/etc $ cat dhcpd.conf | egrep -A 3 "^host[[:space:]]{1,}pi05"
host pi05 {
	hardware ethernet b8:27:eb:d8:01:82;
	fixed-address pi05.swampdog;
	}

[email protected]:/var/cache/bind $ cat zone.swampdog-f | grep pi05
pi05			A	192.168.1.35

[email protected]:/var/cache/bind $ cat zone.swampdog-r | grep pi05
35			PTR	pi05.swampdog.
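
For readers unfamiliar with the NFS-root parameters in the cmdline.txt above, here is a small Python sketch that splits them into named fields. It assumes the kernel's documented layouts, `nfsroot=[<server-ip>:]<root-dir>[,<nfs-options>]` and `ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>`. The function name `parse_nfsroot` is made up for illustration, and short forms such as `ip=dhcp` and IPv6 addresses are not handled.

```python
def parse_nfsroot(cmdline):
    """Split the nfsroot= and ip= kernel parameters into named fields."""
    # Keep only key=value tokens; bare flags such as "rw" are skipped.
    params = dict(tok.split("=", 1) for tok in cmdline.split() if "=" in tok)
    result = {}
    if "nfsroot" in params:
        # nfsroot=[<server-ip>:]<root-dir>[,<nfs-options>]
        target, _, options = params["nfsroot"].partition(",")
        server, sep, path = target.partition(":")
        if not sep:  # no server given, only a path
            server, path = "", target
        result["nfsroot"] = {
            "server": server,
            "path": path,
            "options": options.split(",") if options else [],
        }
    if "ip" in params:
        # Colon-separated positional fields; empty ones are left to autoconfiguration.
        names = ["client_ip", "server_ip", "gw_ip", "netmask",
                 "hostname", "device", "autoconf"]
        result["ip"] = dict(zip(names, params["ip"].split(":")))
    return result


if __name__ == "__main__":
    # Parameters copied from the cmdline.txt shown in the post.
    line = ("root=/dev/nfs nfsroot=192.168.1.20:/mnt/nfsd/pi05,tcp,vers=3 "
            "ip=::::pi05.swampdog::dhcp rootfstype=nfs rw rootwait")
    for key, value in parse_nfsroot(line).items():
        print(key, value)
```

Read this way, the setup mounts /mnt/nfsd/pi05 from 192.168.1.20 over TCP with NFS version 3, and only the hostname is set by hand while the address comes from DHCP, which matches the IP-Config lines in the boot log.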