Performance Tuning

klzgrad edited this page Aug 18, 2024 · 16 revisions

Linux kernel

Window sizes for large bandwidth-delay links

The TCP window limits throughput over high-latency links (throughput <= window size / RTT). In the Linux kernel, the window sizes are auto-tuned by the congestion control algorithm and capped by the maximum window sizes, which are calculated from tcp_rmem, tcp_wmem, and tcp_adv_win_scale. With the default tcp_adv_win_scale, the maximum receive window is tcp_rmem (max) / 2.
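As a rough illustration of the bound above, the following sketch computes the throughput ceiling for a given window and RTT. The function name max_throughput_mbps is made up for this example, and the integer arithmetic truncates, so treat the result as an estimate.

```shell
# Hypothetical helper: upper bound on throughput in Mbps,
# from throughput <= window size / RTT.
# $1: window size in bytes, $2: RTT in milliseconds
max_throughput_mbps() {
    # bytes * 8 = bits; bits per ms / 1000 = Mbps (integer division)
    echo $(( $1 * 8 / $2 / 1000 ))
}

# A 2 MiB window over a 16 ms RTT path tops out at roughly 1 Gbps.
max_throughput_mbps 2097152 16
```

With a larger RTT the same window yields proportionally less throughput, which is why the window has to grow with the latency of the link.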

The default values of tcp_rmem (max) and tcp_wmem (max) are calculated from RAM size and are at most 6MB and 4MB respectively. Thus the default maximum receive and send windows are at most 3MB and 2MB, which is enough for most cases but too small for fat links, e.g. 1Gbps bandwidth with >16ms RTT.
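To see what a particular machine actually uses, a sketch like the following reads the current limits and applies the halving rule described above. The helper name max_window_from_mem is hypothetical, and the division by two is taken from the text above (default tcp_adv_win_scale), so it may not match every kernel version.

```shell
# Hypothetical helper: estimate the maximum window from a
# "min default max" triple as found in tcp_rmem or tcp_wmem.
# Assumes window = max buffer / 2, as stated in the text above.
max_window_from_mem() {
    # Deliberate word splitting: $1 becomes min, $2 default, $3 max.
    set -- $1
    echo $(( $3 / 2 ))
}

# Only query the live value where the Linux /proc file exists.
if [ -r /proc/sys/net/ipv4/tcp_rmem ]; then
    echo "max receive window: $(max_window_from_mem "$(cat /proc/sys/net/ipv4/tcp_rmem)") bytes"
fi
```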

The window sizes should be tuned to the actual BDP = link speed * RTT. Example: for a 1Gbps link with 256ms RTT, the BDP is about 32MB (1Gbps * 0.256s / 8), so the maximum window should be 32MiB, which requires a 64MiB maximum buffer. Add to sysctl.conf:

  • (Client only) net.ipv4.tcp_rmem = 4096 131072 67108864
  • (Server only) net.ipv4.tcp_wmem = 4096 131072 67108864
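The arithmetic behind these values can be sketched as follows. The function name bdp_bytes is invented for this example; the doubling reflects the window = buffer / 2 rule stated earlier, and the result is rounded up by hand to the 64MiB (67108864) used in the settings above.

```shell
# Hypothetical helper: bandwidth-delay product in bytes.
# $1: link speed in Mbps, $2: RTT in milliseconds
bdp_bytes() {
    # Mbps -> bits/s -> bytes/s, then scale by RTT in seconds
    echo $(( $1 * 1000000 / 8 * $2 / 1000 ))
}

bdp=$(bdp_bytes 1000 256)
# The buffer must be twice the window, assuming window = buffer / 2.
buffer=$(( bdp * 2 ))
echo "BDP: $bdp bytes, buffer needed: $buffer bytes"
```

Substitute the measured link speed and RTT of your own path; a value somewhat above the computed buffer is harmless because the limit is only a maximum.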

BBR should be able to auto-tune the window size to reduce bufferbloat. Assuming large downloads and small uploads, client-side net.ipv4.tcp_wmem and server-side net.ipv4.tcp_rmem can be left as default. net.core.rmem_max and net.core.wmem_max only limit buffer sizes set manually by applications (SO_RCVBUF and SO_SNDBUF); they are not used in window size auto-tuning and can also be left as default.

Use BBR congestion control

sudo sysctl -w net.ipv4.tcp_congestion_control=bbr

HTTP/2 uses a single connection per host. TCP tuning is no longer needed to reduce bufferbloat.
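Before switching, it may help to confirm that the running kernel offers BBR at all. The sketch below checks the kernel's list of available algorithms; the helper name has_bbr is made up here, and the modprobe hint assumes BBR was built as the tcp_bbr module, which may not be the case on every distribution.

```shell
# Hypothetical helper: succeed if "bbr" appears as a whole word
# in a space-separated list of congestion control algorithms.
has_bbr() {
    case " $1 " in
        *" bbr "*) return 0 ;;
        *) return 1 ;;
    esac
}

avail_file=/proc/sys/net/ipv4/tcp_available_congestion_control
if [ -r "$avail_file" ]; then
    if has_bbr "$(cat "$avail_file")"; then
        echo "bbr is available"
    else
        # Assumption: BBR is provided by the tcp_bbr kernel module.
        echo "bbr not listed; try: sudo modprobe tcp_bbr"
    fi
fi
```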

Turn off tcp_slow_start_after_idle

sudo sysctl -w net.ipv4.tcp_slow_start_after_idle=0

This setting can slightly improve the performance of a single persistent connection.
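Settings applied with sysctl -w take effect immediately but are lost on reboot. One way to keep them is a drop-in file, sketched below. The function name write_tuning_conf and the file name 99-naive-tuning.conf are hypothetical, and the buffer sizes are the ones from the 1Gbps / 256ms example above, so adjust them to your own BDP.

```shell
# Hypothetical helper: write the settings discussed so far to a file.
# $1: role ("client" or "server"), $2: output path
write_tuning_conf() {
    role=$1
    path=$2
    {
        echo "net.ipv4.tcp_congestion_control = bbr"
        echo "net.ipv4.tcp_slow_start_after_idle = 0"
        case "$role" in
            # Example sizes for 1Gbps with 256ms RTT; tune to your link.
            client) echo "net.ipv4.tcp_rmem = 4096 131072 67108864" ;;
            server) echo "net.ipv4.tcp_wmem = 4096 131072 67108864" ;;
        esac
    } > "$path"
}

# Usage sketch (run manually, requires root):
#   write_tuning_conf server /tmp/naive-tuning.conf
#   sudo cp /tmp/naive-tuning.conf /etc/sysctl.d/99-naive-tuning.conf
#   sudo sysctl --system
```

Writing to a temporary path first makes it easy to review the file before it is installed.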

(Server only) Consider setting tcp_notsent_lowat

sudo sysctl -w net.ipv4.tcp_notsent_lowat=131072

This setting can improve interactive latency by optimizing send buffer handling. Note that this kernel setting applies to all applications on the server: it is useful for HTTP/2 but may be detrimental to other applications.

Be cautious with this setting. Small values tend to increase server CPU usage and reduce throughput, so benchmark to find a good value. For example, compare the HTTP/2 load time at https://http2.akamai.com/demo while also downloading a large file, with and without this setting.

Common values range from 16KB to 128KB.
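A benchmark sweep over that range could be organized along these lines. The helper lowat_candidates is hypothetical and simply doubles from 16KB up to 128KB; the benchmark step itself is left as a placeholder because it depends on your setup.

```shell
# Hypothetical helper: list candidate tcp_notsent_lowat values,
# doubling from 16KB to 128KB. The body runs in a subshell so the
# loop variable does not leak into the caller.
lowat_candidates() (
    v=16384
    while [ "$v" -le 131072 ]; do
        echo "$v"
        v=$(( v * 2 ))
    done
)

# Usage sketch (run manually on the server, requires root):
#   for v in $(lowat_candidates); do
#       sudo sysctl -w net.ipv4.tcp_notsent_lowat="$v"
#       # run your own latency and throughput benchmark here
#   done
lowat_candidates
```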

Do not turn on TCP Fast Open

Its Linux implementation is too conservative to be useful, and it is rarely used in practice, so enabling it creates a distinctive traffic feature.

Chromium

Chromium limits the number of connections per proxy to 32. New connections that exceed this limit will be stalled, but multi-tab browsing often needs more than 32 connections. Create a policy to override this limit.

Note: This is a browser setting, not applicable to any proxies.

  • For Chromium on Linux:
sudo mkdir -p /etc/chromium/policies/managed
echo '{ "MaxConnectionsPerProxy": 99 }' | sudo tee /etc/chromium/policies/managed/proxy.json
  • For Chrome on macOS:
defaults write com.google.Chrome MaxConnectionsPerProxy -int 99

Then restart the browser to apply the policy.

You should be able to see it in chrome://policy once set up.
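The policy can also be checked from a terminal. The sketch below extracts the value from the Linux policy file created above; the helper name policy_value is made up, and the sed pattern assumes the simple one-line JSON shown earlier rather than arbitrary JSON.

```shell
# Hypothetical helper: print the MaxConnectionsPerProxy value found
# in a policy file. Assumes the key and number are on the same line.
policy_value() {
    sed -n 's/.*"MaxConnectionsPerProxy"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1"
}

policy_file=/etc/chromium/policies/managed/proxy.json
if [ -r "$policy_file" ]; then
    echo "MaxConnectionsPerProxy = $(policy_value "$policy_file")"
fi

# On macOS, the value written with "defaults write" can be read back with:
#   defaults read com.google.Chrome MaxConnectionsPerProxy
```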

99 is the maximum value for MaxConnectionsPerProxy allowed by Chromium, which is still too low. Using an ad blocker is recommended to save on connections.

For Chrome or other OSes, see:
