rfreire
Posts: 6
Joined: Tue Apr 30, 2013 3:17 pm

Pi 3B+ NAT capabilities

Sat Mar 17, 2018 11:11 pm

Hi guys,

Has anyone here benchmarked Pi 3B+ NAT performance?

I have a Pi 3 as my NAT device. My connection was 10 Mbps until last week, when it got upgraded to 100 Mbps.

I can barely exceed 60-70 Mbps through it. My bottleneck seems to be ksoftirqd/0.

I blame NAT because I can reach close to 100 Mbps when testing within the same network, with no routing or NAT involved.

What are your findings on the Pi 3B+?

I'd be delighted to hear them.

- RF.

epoch1970
Posts: 1784
Joined: Thu May 05, 2016 9:33 am
Location: Paris, France

Re: Pi 3B+ NAT capabilities

Sun Mar 18, 2018 4:34 pm

You can look at reports about USB gigabit adapters on Pi 3 and that will give you a ballpark idea of what to expect.

Bandwidth is not everything; with TCP you also have to look at latency (the round-trip time that ping reports). Latency over public links is huge compared to local network links.
There is a good chance your current Pi is not much of a bottleneck, even for sustained uploads or downloads.
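
To put a rough number on the latency point: a single TCP connection can have at most one window of data in flight per round trip, so its throughput is bounded by window size divided by round-trip time. A minimal sketch of that arithmetic, where the helper name `tcp_max_mbps` and the 65535-byte window are only illustrative assumptions (modern stacks scale the window well beyond that):

```shell
#!/bin/sh
# tcp_max_mbps WINDOW_BYTES RTT_MS  (hypothetical helper, for illustration only)
# Upper bound on single-connection TCP throughput in Mbit/s:
# one full window delivered per round trip.
tcp_max_mbps() {
    awk -v w="$1" -v rtt="$2" \
        'BEGIN { printf "%.2f\n", (w * 8) / (rtt / 1000) / 1000000 }'
}

tcp_max_mbps 65535 1     # LAN-like round trip of 1 ms
tcp_max_mbps 65535 50    # Internet-like round trip of 50 ms
```

With the same window, the bound drops by a factor of 50 between the two calls, which is why a test across the Internet can look much slower than a local one regardless of the router.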
"If there is no solution, it is because there is no problem." Les Shadoks, J. Rouxel

manawyrm
Posts: 30
Joined: Mon Apr 15, 2013 6:18 am
Location: Alfeld (Leine), Germany

Re: Pi 3B+ NAT capabilities

Mon Mar 19, 2018 10:49 am

You want to get something with proper NICs and HW routing capability. HW from companies like Ubiquiti or MikroTik usually works really well, and is similarly priced to a Pi.

With 2 USB NICs (the internal one of the 3B+ and an external one with an AX88179 chip), I can just about reach 130 Mbit/s of NAT throughput. You can probably fiddle with the interrupt settings (the per-core counts are visible in /proc/interrupts) to get the load to distribute over the CPU cores. That could get you the last bit of performance.
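
A rough sketch of how one might check that distribution: /proc/interrupts only shows counters, one column per core, while the affinity itself is set through /proc/irq/<N>/smp_affinity. The helper name `irq_counts` and the `dwc_otg` search string in the usage comment are assumptions for illustration; check your own /proc/interrupts for the actual device name.

```shell
#!/bin/sh
# irq_counts PATTERN [FILE]  (hypothetical helper)
# Print the per-core interrupt counts of every line that contains PATTERN.
# FILE defaults to /proc/interrupts; any file in the same layout works.
irq_counts() {
    awk -v pat="$1" '
        NR == 1 { ncpu = NF; next }      # header row: one column per core
        index($0, pat) {
            irq = $1
            sub(/:$/, "", irq)           # "62:" -> "62"
            line = "IRQ " irq
            for (i = 2; i <= ncpu + 1; i++)
                line = line " CPU" (i - 2) "=" $i
            print line
        }' "${2:-/proc/interrupts}"
}

# Usage on the router (device name is an assumption):
#   irq_counts dwc_otg
# Moving an IRQ to core 1 would then be something like the line below,
# as root; some interrupt controllers may refuse the write:
#   echo 2 > /proc/irq/<N>/smp_affinity
```

If all the counts pile up in the CPU0 column while the other columns stay near zero, the interrupts are not being spread.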

But yes, as epoch1970 already said, latency (and also jitter) are important network characteristics. I wouldn't want to tunnel my network through crappy USB adapters, that's just painful.

Best wishes,
Tobias

mfa298
Posts: 1212
Joined: Tue Apr 22, 2014 11:18 am

Re: Pi 3B+ NAT capabilities

Mon Mar 19, 2018 11:25 am

rfreire wrote:
Sat Mar 17, 2018 11:11 pm
Anyone here benchmarked Pi 3B+ NAT performance?

I have a Pi 3 which is my NAT device, where I had 10 Mbps until last week. And then got upgraded to 100 Mbps.

I can barely exceed 60-70 Mbps. My bottleneck seem to be ksoftirqd/0.

I'm pointing NAT to blame because I can reach close to 100 Mbps when doing tests in the same network, without routing or NATing involved.

What's your findings in Pi 3B+?
Routing, NAT and any other features are all handled within the kernel, and so on the CPU, so adding more complexity (i.e. adding NAT) will reduce the total throughput achievable. You'll see this on most devices (even the Cisco ISR routers I've used slow down as you add NAT etc.). People designing such hardware are expected to take that into account when choosing the CPU and other dedicated hardware (some higher-end routers can offload some work into ASICs, leaving more CPU cycles free for the more complex stuff).

Lots of time spent in ksoftirqd seems plausible. I suspect it's related to handling the incoming packets from the interface(s). Having dealt with much larger firewalls (2U servers with lots of 1G or 10G interfaces), I'd expect to see whole CPU cores busy with the interrupts from the various interfaces.

allfox
Posts: 425
Joined: Sat Jun 22, 2013 1:36 pm
Location: Guang Dong, China

Re: Pi 3B+ NAT capabilities

Wed Mar 21, 2018 6:06 pm

Greetings. I'm making a late reply.

I would like to know how to measure the performance of ksoftirqd. Could the OP please tell me how he found out that the bottleneck is ksoftirqd?

I'm asking because I'm using NIC scaling, but I don't know if it's working: https://github.com/raspberrypi/linux/bl ... caling.txt

The following info might be redundant for you guys, so please skip it if you already know it :) .

When the Ethernet chip receives a packet, it writes it directly to memory (e.g. DMA), then sends a signal to the CPU (e.g. an interrupt).

However, under Linux, the CPU does not fully process the packet at this point. It schedules a plan for the packet, then goes back to its previous job.

Some time later, the CPU reads the plan and processes the packet as a soft interrupt (softirq). When there is too much of this work to finish right away, it is handed to a per-core kernel thread. This kernel thread is ksoftirqd.

In the official 32-bit kernel, by default, the CPU core that received the packet's signal in the first place also does the softirq work. This means that even with 4 cores, all incoming Ethernet processing goes to core zero.

There is a patch in the official 64-bit kernel that distributes the signal across all 4 cores: https://github.com/raspberrypi/linux/co ... ccbe8461e2
But I guess we won't get an official release in the near future.

Then there is another way to distribute the load, Receive Packet Steering (RPS).
The idea is that when core zero makes the plan, it assigns the softirq job to another core.
So even though core zero receives all the signals, the actual processing is distributed across all 4 cores.
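
The set of cores that RPS may use is given as a hexadecimal bitmask, one bit per core (bit 0 is core zero), written to the rps_cpus file of each receive queue. A small sketch of how such a mask is built; `cpu_mask` is a hypothetical helper name, not a kernel tool:

```shell
#!/bin/sh
# cpu_mask CORE...  (hypothetical helper)
# Print the hexadecimal bitmask that selects the given core numbers.
cpu_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$(( $mask | (1 << $cpu) ))   # set the bit for this core
    done
    printf '%x\n' "$mask"
}

cpu_mask 0 1 2 3    # all four cores: prints f
cpu_mask 1 2 3      # every core except core zero: prints e
```

Whether it is better to include core zero in the mask or leave it out probably depends on how busy it already is with the interrupts themselves, so treat both values as things to try.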

More details and other techniques can be found in the NIC scaling document linked at the beginning of this post.

I have a Pi 2 V1.1 router, which runs a script like this:

Code:

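#!/bin/sh
# Enable Receive Packet Steering (RPS) and Receive Flow Steering (RFS)
# on every network interface. Must be run as root.

CPUS="f"                   # hex CPU bitmask: f = cores 0-3
SOCK_FLOW_ENTRIES=32768    # size of the global RFS flow table

echo "$SOCK_FLOW_ENTRIES" > /proc/sys/net/core/rps_sock_flow_entries

# Enable flow limit on the same cores
echo "$CPUS" > /proc/sys/net/core/flow_limit_cpu_bitmap

for DEVICE in /sys/class/net/*
do
    # Skip interfaces that do not expose an rx-0 queue
    [ -w "$DEVICE/queues/rx-0/rps_cpus" ] || continue
    echo "$CPUS" > "$DEVICE/queues/rx-0/rps_cpus"
    echo "$SOCK_FLOW_ENTRIES" > "$DEVICE/queues/rx-0/rps_flow_cnt"
done

exit 0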

I don't know if it really makes a difference. So if you could tell me how to measure ksoftirqd performance, I would be happy. ;)
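
One possible way to check, offered as a sketch rather than a definitive method: /proc/softirqs keeps a per-core counter of receive softirqs in its NET_RX row, so comparing two snapshots taken during a transfer shows which cores did the work. The helper name `net_rx_diff` and the /tmp paths are assumptions for illustration.

```shell
#!/bin/sh
# net_rx_diff BEFORE AFTER  (hypothetical helper)
# Given two snapshots of /proc/softirqs, print how many NET_RX softirqs
# each core handled between them.
net_rx_diff() {
    awk '$1 == "NET_RX:" {
            if (!seen) {
                # first snapshot: remember the counters
                for (i = 2; i <= NF; i++) first[i] = $i
                seen = 1
            } else {
                # second snapshot: print the increase per core
                for (i = 2; i <= NF; i++)
                    printf "CPU%d %d\n", i - 2, $i - first[i]
            }
         }' "$1" "$2"
}

# Usage on the router while a download is running:
#   cat /proc/softirqs > /tmp/before; sleep 5; cat /proc/softirqs > /tmp/after
#   net_rx_diff /tmp/before /tmp/after
```

If nearly all of the increase lands on CPU0, the steering is not taking effect. For a live view, top shows the ksoftirqd/N threads, and pressing 1 in top gives a per-core breakdown where the si column is the time spent in softirqs.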
