Finding local network bottlenecks with netcat

Let’s do some low-overhead testing of network throughput. We can use Netcat (nc, available on many Linux distros and macOS) to establish a direct TCP connection between two machines and send data between them.

Shovel useless data to another machine as quickly as possible

First, pick a machine as a server — a NAS in my case — and listen for TCP connections on port 5555 (the -p may not be necessary depending on your version of nc). The two redirects discard data received from the client and keep any accidental keypresses in the terminal from being sent [1].

nc -l -p 5555 </dev/null  >/dev/null

Now from the client, pipe a known quantity of zero-value bytes to the server (assuming the server’s IP is 10.0.0.100).

dd if=/dev/zero bs=1500 count=1000000 | nc 10.0.0.100 5555

This sends 1.5GB of zeroes and summarizes the speed when done:

❯ dd if=/dev/zero bs=1500 count=1000000 | nc 10.0.0.100 5555
[..]
1500000000 bytes (1.5 GB, 1.4 GiB) copied, 13.2083 s, 114 MB/s

The /dev/zero pseudo-file is a mechanism provided by the kernel that creates a limitless supply of bytes with a value of 0. The kernel can generate these very quickly, so it’s a good way to ensure we push as much data as possible through the TCP connection established with netcat [2].
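For example, a quick peek confirms it yields nothing but zero bytes (a sketch; assumes xxd is available for the hex dump):

head -c 8 /dev/zero | xxd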

Sending data the other direction

You can reverse the roles of the client and server if you want to do a bi-directional test. In my case, though, I wanted to continue using the NAS as the server because I’m lazy and didn’t want to change firewall settings.
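If you did want to swap roles, the same commands simply trade places (a sketch, assuming the laptop’s IP is 10.0.0.50 and its firewall allows inbound connections on port 5555). Listen on the laptop:

nc -l -p 5555 </dev/null >/dev/null

and send from the NAS:

dd if=/dev/zero bs=1500 count=1000000 | nc 10.0.0.50 5555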

With some versions of netcat [3], we can tell the server to send useless data to a client once one has connected. dd’s summary output is redirected to a file:

nc -l -p 5555 -e 'dd if=/dev/zero bs=1500 count=1000000 2>summary'

Depending on your version of nc, you may need -c instead of -e.
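For example, the -c variant of the server command above would look like this (a sketch; consult your nc’s man page for which flag it accepts):

nc -l -p 5555 -c 'dd if=/dev/zero bs=1500 count=1000000 2>summary'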

If you want to send useful data, like the contents of a file [4], just specify that as the dd input instead (this transfer is unencrypted, of course):

nc -l -p 5555 -e 'dd if=/path/to/file bs=1500 count=1000000 2>summary'

Have a client connect to the server and discard the data it receives:

nc 10.0.0.100 5555 </dev/null >/dev/null

Once this is done, look at the summary on the server:

$ cat summary
[..]
1500000000 bytes (1.5 GB) copied, 13.916 s, 108 MB/s

Cool, so what?

I installed some speedy new disks in my NAS, so it was time to re-evaluate speeds and find bottlenecks. At first I tried copying data over SMB and the performance was a paltry 30MB/s. I wasn’t sure whether there was a network misconfiguration, protocol-related overhead (likely [5]), or whether the NAS was simply too underpowered to read from or write to the volume fast enough.

I sent data to and from the NAS to evaluate the raw network speed and the impact of I/O. Baseline measurements discarded all transmitted data, and I/O measurements wrote data sent from the client to a file or transferred a file read from disk. All speeds are megabytes per second (MB/s).
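The disk-backed runs can be reproduced with server-side commands roughly like the following (a sketch; the path is hypothetical). Writing whatever the client sends to a file:

nc -l -p 5555 </dev/null >/path/to/testfile

Reading from disk reuses the -e form shown earlier, with if=/path/to/testfile as the dd source.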

Measurement                 Run 1   Run 2   Run 3   Average
NAS write to /dev/null      114     114     112     113.3
NAS write to file           43.9    44.8    43.2    44.0
NAS read from /dev/zero     116     115     115     115.3
NAS read from file          82.9    77.4    75.5    78.6

[Bar chart: NAS Performance Test Results (Average, MB/s)]

Doesn’t look good for the NAS; the network is fine, but reading from or writing to the volume significantly drops performance. But the NAS has no reason to worry. These tests were over wired gigabit Ethernet, but I usually connect my laptop to the network via wifi. My max wifi data rate is around 30MB/s (35MB/s from the couch), so the NAS is not the limiting factor. I see you, wifi attenuation and frequency contention. I guess if I’m gonna replace anything first, it should be my neighbors’ wifi.

Extras

Optional: use Pipe Viewer for a progress bar

Install Pipe Viewer (pv) and insert it into the pipeline for a progress bar:

❯ dd if=/dev/zero bs=1500 count=1000000 | pv | nc 10.0.0.100 5555
362MiB 0:00:04 [89.8MiB/s] [    <=>                    ]

Pay attention to units. In this case, pv reports data rates in mebibytes per second (MiB/s) but dd reports rates in megabytes per second (MB/s). Since a gigabit is 1,000 megabits, megabytes line up more naturally with network speeds.
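For example, pv’s 89.8 MiB/s above works out to 89.8 × 1,048,576 bytes/s ≈ 94 MB/s.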

Pitfall: Random data instead of zeroes

Using /dev/urandom as the dd source is a nice way to send non-zero data, but it can be very slow. On my NAS, using /dev/urandom drops speeds to low single-digit MB/s!

$ dd if=/dev/urandom of=/dev/null bs=1500 count=10000
[..]
15000000 bytes (15 MB) copied, 5.39901 s, 2.8 MB/s

$ dd if=/dev/zero of=/dev/null bs=1500 count=10000
[..]
15000000 bytes (15 MB) copied, 0.0177565 s, 845 MB/s

The drop in performance persists even on faster machines, like a 2016 Intel MacBook Pro:

❯ dd if=/dev/urandom of=/dev/null bs=1500 count=1000000
[..]
1500000000 bytes (1.5 GB, 1.4 GiB) copied, 6.15398 s, 244 MB/s

❯ dd if=/dev/zero of=/dev/null bs=1500 count=1000000
[..]
1500000000 bytes (1.5 GB, 1.4 GiB) copied, 2.0593 s, 728 MB/s

Footnotes

  1. It’s pretty useful to use nc without any redirection, too. After establishing a basic connection with nc, you can type in one terminal and see the output on the other (perhaps after hitting enter). This works bi-directionally. It is, incidentally, a nice alternative to the now-defunct telnet for issuing direct HTTP requests to servers — or any other human-readable protocol, for that matter. For example, you can manually send an HTTP request by connecting with nc danallan.net 80 and then entering the following request. The extra blank line is necessary so the server knows you’ve completed all the headers:

    GET / HTTP/1.1
    Host: danallan.net
    Connection: close
    

    The server will send you an HTTP response.

  2. You may want to make sure your machine can generate data fast enough for the network speed you are testing. We can run dd by itself, writing to the null device, to see how quickly a machine can generate the data.

    ❯ dd if=/dev/zero of=/dev/null bs=1500 count=1000000
    [..]
    1500000000 bytes (1.5 GB, 1.4 GiB) copied, 2.0593 s, 728 MB/s

    Plenty fast for a gigabit connection (125 MB/s theoretical max). But the throughput changes with different values for the block size (bs) and number of blocks (count). Here we generate the same quantity of data, 1.5GB, but with a 10x larger block size and a correspondingly smaller block count, and see nearly an eightfold increase!

    ❯ dd if=/dev/zero of=/dev/null bs=15000 count=100000
    [..]
    1500000000 bytes (1.5 GB, 1.4 GiB) copied, 0.267429 s, 5.6 GB/s

    What’s unclear to me is the interaction between the block size and any buffering that occurs in the network layer. Is there any difference if, say, the block size is matched to the packet size?
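    One way to probe this (a sketch I haven’t benchmarked) is to rerun the transfer with the block size set near the TCP payload of a 1500-byte MTU, roughly 1448 bytes after subtracting the IP and TCP headers plus typical options, and compare the result:

    dd if=/dev/zero bs=1448 count=1000000 | nc 10.0.0.100 5555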

  3. Notably, the version of netcat pre-installed on macOS does not support this capability. But you can install GNU netcat with Homebrew: brew install netcat.

  4. Make sure the file is big enough! Following the values here, it should be at least 1.5GB.
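    If you need a throwaway file of that size, dd can create one (a sketch; the path is hypothetical, and a zero-filled file is fine for measuring throughput):

    dd if=/dev/zero of=/path/to/testfile bs=1500 count=1000000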

  5. SSH File Transfer Protocol (sftp) is quite slow from my NAS, about 25MB/s, whereas sending the data directly via nc as described in this article averages 78.6MB/s. Other protocols like SMB are also slow, though not quite as bad. It’s not quite clear why sftp is so comparatively glacial. Encryption overhead may play a part since the NAS has a low-powered embedded CPU, but CPU usage is not overly high during transfer. There also appear to be additional buffers that create a complex interaction with TCP flow control, but this doesn’t seem relevant on a local network with a very low round trip time.