Greetings. I'm making a late reply.
I would like to know how to measure the performance of ksoftirqd. Could the OP please tell me how they found out that the bottleneck is in ksoftirqd?
I'm asking because I'm using NIC scaling, but I don't know if it's working: https://github.com/raspberrypi/linux/bl ... caling.txt
The following info might be redundant for you guys, so please feel free to skip it if you already know it.
When the Ethernet chip receives a packet, it writes it directly to memory (e.g. via DMA), then signals the CPU (e.g. with an interrupt).
However, under Linux the CPU does not process the packet right away at this point. It schedules the packet work for later (a softirq), then goes back to what it was doing before.
Later, when the CPU gets a chance, it picks up that scheduled work and processes the packet. Under load this work runs in a per-CPU kernel thread. This kernel thread is ksoftirqd.
In the official 32-bit kernel, by default, the CPU core that received the packet's interrupt in the first place also does the ksoftirqd work. This means that even with 4 cores, all incoming Ethernet processing goes to core zero.
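Which cores actually handle receive softirqs can be seen in /proc/softirqs, which keeps a per-CPU counter for each softirq type (NET_RX is the receive one). The following is only a rough sketch: net_rx_counts is a made-up helper name, and it shows counts per core, not how much time ksoftirqd spends.

```shell
#!/bin/sh
# net_rx_counts is a hypothetical helper, not an existing tool.
# It prints "CPUn count" for the NET_RX row of a /proc/softirqs-style file.
net_rx_counts() {
    file="${1:-/proc/softirqs}"
    awk '
        # The first line is the header: CPU0 CPU1 ...
        NR == 1 { for (i = 1; i <= NF; i++) cpu[i] = $i; next }
        # The NET_RX row holds one counter per CPU after the label.
        $1 == "NET_RX:" { for (i = 2; i <= NF; i++) print cpu[i - 1], $i }
    ' "$file"
}
```

Calling net_rx_counts with no argument reads /proc/softirqs. Running it twice while traffic is flowing and comparing the numbers should show whether only CPU0 is counting up or all cores are.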
There is a patch in the official 64-bit kernel that distributes the interrupt across all 4 cores: https://github.com/raspberrypi/linux/co ... ccbe8461e2
But I guess we won't get an official release in the near future.
That brings us to another way of distributing the load, Receive Packet Steering (RPS).
The idea is that when core zero schedules the work, it assigns the ksoftirqd job to another core.
So even though core zero receives all the interrupts, the actual processing is distributed across all 4 cores.
Other details and other techniques can be found in the NIC scaling document linked at the beginning of this post.
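RPS is told which cores it may use through a hexadecimal CPU bitmask, where bit N stands for core N. As a small sketch of how such a mask can be computed in the shell (cpu_mask and cpu_mask_except0 are made-up helper names, and whether to leave core zero out is a choice, not a recommendation):

```shell
#!/bin/sh
# Hypothetical helpers for building an RPS CPU bitmask.

# Hex mask with the lowest N bits set, i.e. cores 0 .. N-1.
cpu_mask() {
    printf '%x\n' $(( (1 << $1) - 1 ))
}

# Same mask, but with bit 0 cleared so core zero is left out.
cpu_mask_except0() {
    printf '%x\n' $(( ((1 << $1) - 1) & ~1 ))
}
```

For a 4-core Pi, cpu_mask 4 gives f (all four cores) and cpu_mask_except0 4 gives e (cores 1 to 3).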
I have a Pi 2 V1.1 router, which runs a script like this:
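# CPUS (hex CPU bitmask) and SOCK_FLOW_ENTRIES are assumed to be set earlier in the script
echo "$SOCK_FLOW_ENTRIES" > /proc/sys/net/core/rps_sock_flow_entries
echo "$CPUS" > /proc/sys/net/core/flow_limit_cpu_bitmap
# Apply the same settings to the first receive queue of every network device
for DEVICE in /sys/class/net/*
do
    echo "$CPUS" > "$DEVICE/queues/rx-0/rps_cpus"
    echo "$SOCK_FLOW_ENTRIES" > "$DEVICE/queues/rx-0/rps_flow_cnt"
done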
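Whether the values were actually accepted can be checked by reading the same files back. This is just a sketch under the assumption that every device of interest has a queues/rx-0 directory, as in the script above; show_rps is a made-up helper name, and the base directory is a parameter only so that it can be pointed at a test tree.

```shell
#!/bin/sh
# show_rps is a hypothetical helper, not an existing tool.
# It prints the current RPS settings of rx-0 for each device under the base directory.
show_rps() {
    base="${1:-/sys/class/net}"
    for dev in "$base"/*; do
        q="$dev/queues/rx-0"
        # Skip entries that have no readable rx-0 queue settings
        [ -r "$q/rps_cpus" ] || continue
        printf '%s rps_cpus=%s rps_flow_cnt=%s\n' \
            "${dev##*/}" "$(cat "$q/rps_cpus")" "$(cat "$q/rps_flow_cnt")"
    done
}
```

Running show_rps with no argument reads /sys/class/net. This only confirms that the settings are in place, not that they improve throughput.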
I don't know if this script really makes a difference. So if you could tell me how to measure ksoftirqd performance, I would be happy.