CONCEPT

Congestion window

The congestion window (often cwnd) is the transport-protocol-level cap on how many bytes the sender may have in flight but unacknowledged at any moment. It's a per-connection variable owned by the sender's congestion controller (TCP's kernel stack or QUIC's user-space stack) and it enforces: "don't put more data on the network than the network has proven it can deliver."

For CDN performance engineering, cwnd is one of the dominant handles on per-connection throughput during ramp-up, and therefore on the perceived latency of a page-load transfer.

Why it matters for latency

Even at high link-layer bandwidth and zero loss, a connection can only send cwnd bytes per RTT. A cold TCP connection starts at an initial cwnd (iw) — historically 3 MSS, raised by RFC 6928 (2013) to 10 MSS — then grows. During the slow-start phase, cwnd doubles every RTT until it reaches the slow-start threshold (ssthresh) or loss is detected.

Practical consequence: for small transfers that finish before slow start completes (i.e. most web sub-resources: 10-100 KB HTML / CSS / JS), the effective throughput is iw-dominated, not link-bandwidth-dominated. A transfer of N MSS finishes in roughly log₂(N / iw) + 1 RTTs.
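The round count above can be sketched directly. A minimal simulation of idealized slow start (no loss, perfect per-RTT doubling, every round fully utilized); the helper name `slow_start_rounds` is illustrative:

```python
def slow_start_rounds(n_segments: int, iw: int) -> int:
    """RTTs (sending rounds) to deliver n_segments under idealized
    slow start: the sender transmits iw, 2*iw, 4*iw, ... per round."""
    sent, cwnd, rounds = 0, iw, 0
    while sent < n_segments:
        sent += cwnd
        cwnd *= 2
        rounds += 1
    return rounds

# A ~100-KB resource is about 70 segments at a 1460-byte MSS.
print(slow_start_rounds(70, iw=3))   # 5 rounds
print(slow_start_rounds(70, iw=10))  # 3 rounds
```

The loop agrees with the closed form: roughly log₂(N / iw) + 1 rounds, since round k can carry iw · 2^(k−1) segments.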

Raising iw (sanely) shortens this. Cloudflare's 2026-04-17 post (Source: sources/2026-04-17-cloudflare-agents-week-network-performance-update) names congestion-window management as a primary software-axis lever on connection time:

"By leveraging protocols like HTTP/3 and changing how we manage congestion windows, we can reduce processing time by milliseconds in code, in addition to the improvements on the wire."

The RFC-6928 shift in one line

  • Pre-2013 TCP: initial cwnd = 3 MSS → a 14.4-KB (10-MSS) response needs 3 RTTs to finish even over infinite bandwidth (3 + 6 segments in the first two rounds, the last segment in a third).
  • Post-2013 TCP: initial cwnd = 10 MSS → the same response finishes in 1 RTT. For a user 50 ms from the edge, that's 100 ms saved on every cold connection fetching such a response.

CDNs that raise iw further (Google pushed for the move to 10, then tuned higher; Cloudflare tunes per workload) trade a small risk of short-term overshoot for measurably shorter cold transfers.

cwnd in QUIC / HTTP/3

QUIC's congestion controller lives in user space (inside the HTTP/3 server implementation), so the CDN owns the congestion-control logic end-to-end. This lets a vendor:

  • Deploy modern algorithms (BBR, BBRv2/v3) without waiting on kernel upgrades.
  • Tune per-workload: agentic-tool-call small requests vs. large video segments want different ramp-up strategies.
  • A/B-test controller changes globally.

The trade is that every packet's congestion-control accounting is now CPU in the proxy process, making the core proxy's CPU efficiency load-bearing (see concepts/hot-path). Cloudflare's connection-handling hot-path posts (Pingora, FL2 proxy) are partly about making per-packet work cheap enough that aggressive cwnd policies are economic.
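The per-workload tuning lever can be made concrete with a toy policy table — purely illustrative, assuming a user-space stack that exposes the initial window as a knob; none of these names correspond to any real server's API:

```python
# Hypothetical per-workload initial-window policy for a user-space
# QUIC stack. Segment counts are invented for illustration.
INITIAL_WINDOW_SEGMENTS = {
    "api_small":     16,  # small tool-call responses: ramp fast
    "web_asset":     10,  # RFC 6928 default
    "video_segment": 10,  # large transfers: let the controller probe
}

def initial_cwnd_bytes(workload: str, mss: int = 1460) -> int:
    """Initial congestion window in bytes; unknown workloads fall
    back to the RFC 6928 default of 10 segments."""
    return INITIAL_WINDOW_SEGMENTS.get(workload, 10) * mss

print(initial_cwnd_bytes("api_small"))  # 23360
```

A kernel TCP stack makes this kind of per-request differentiation much harder, which is the point of the user-space argument above.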

Related terms

  • Slow start — exponential ramp-up phase; cwnd doubles each RTT until ssthresh.
  • Congestion avoidance — linear ramp-up after ssthresh; AIMD (additive-increase, multiplicative-decrease) in classic TCP Reno / CUBIC.
  • BBR — Google's model-based congestion controller; aims for the bandwidth-delay product rather than responding to loss. Popular at CDN scale.
  • Pacing — spreading the cwnd send allowance over an RTT instead of bursting — critical on high-speed paths.
  • Receive window (rwnd) — the receiver's cap; effective send rate is min(cwnd, rwnd). Modern kernels auto-tune rwnd; cwnd is the sender's concern.
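Two of the definitions above reduce to one-line arithmetic. A minimal sketch of the min(cwnd, rwnd) cap and of pacing as spreading the window over one RTT (function names are illustrative):

```python
def effective_window(cwnd: int, rwnd: int) -> int:
    """The sender may have at most min(cwnd, rwnd) bytes in flight."""
    return min(cwnd, rwnd)

def pacing_interval_us(window_bytes: int, rtt_ms: float, mss: int = 1460) -> float:
    """Gap between packet transmissions so the window is spread
    evenly across one RTT instead of sent as a single burst."""
    packets_per_rtt = window_bytes / mss
    return (rtt_ms * 1000.0) / packets_per_rtt

w = effective_window(cwnd=14600, rwnd=65535)  # cwnd is the binding cap here
print(pacing_interval_us(w, rtt_ms=50.0))     # 5000.0 µs between packets
```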

Failure modes

  • Bufferbloat. Too-large cwnd on a loss-based controller (TCP Reno, CUBIC) fills intermediate buffers and adds queueing latency far exceeding the original round-trip — RTT balloons, actual goodput collapses. Modern model-based controllers (BBR) mitigate but don't eliminate.
  • Initial-window overshoot on capped last-mile. iw = 10 on a 1-Mbps mobile link is already too much; tail retransmits cost the user latency.
  • Per-stream vs per-connection accounting (HTTP/2). HTTP/2 multiplexes many streams on one TCP connection, sharing a single cwnd; a slow large stream can starve small time-sensitive ones. HTTP/3 fixes this at the transport layer — see concepts/http-3.
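The HTTP/2 shared-cwnd failure mode above can be illustrated with a toy model — not how any real HTTP/2 scheduler works, just two streams splitting one window evenly per round (`share` and the function name are assumptions of the sketch):

```python
def rounds_to_finish(small: int, large: int, iw: int,
                     mss: int = 1460, share: float = 0.5) -> int:
    """Toy model: each RTT the connection sends cwnd bytes; while the
    large stream is active, the small stream gets only `share` of them."""
    cwnd, rounds = iw * mss, 0
    while small > 0:
        if large > 0:
            small -= cwnd * share
            large -= cwnd * (1 - share)
        else:
            small -= cwnd
        cwnd *= 2
        rounds += 1
    return rounds

print(rounds_to_finish(10_000, 0, iw=10))          # alone: 1 RTT
print(rounds_to_finish(10_000, 1_000_000, iw=10))  # sharing: 2 RTTs
```

Real HTTP/2 priority trees shift `share` around, but cannot escape the single shared cwnd; QUIC's per-stream transport state is what removes the coupling.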

Wiki framing

Congestion-window management is one of the named software-axis levers in the CDN-performance playbook (the other software-axis lever is protocol upgrades like HTTP/3; the orthogonal infrastructure-axis lever is PoP densification). Tuning cwnd shortens cold-connection transfer time per RTT; densifying PoPs shortens the RTT itself. The two compose multiplicatively on user-perceived page-load time.
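The multiplicative composition can be checked with the same idealized slow-start arithmetic as earlier (no loss, perfect doubling; segment count and RTTs are example values):

```python
def cold_transfer_ms(n_segments: int, iw: int, rtt_ms: float) -> float:
    """Idealized cold-connection transfer time: slow-start rounds x RTT."""
    sent, cwnd, rounds = 0, iw, 0
    while sent < n_segments:
        sent += cwnd
        cwnd *= 2
        rounds += 1
    return rounds * rtt_ms

# A 70-segment (~100 KB) resource:
print(cold_transfer_ms(70, iw=3,  rtt_ms=50))  # 250.0  baseline
print(cold_transfer_ms(70, iw=10, rtt_ms=50))  # 150.0  software axis: bigger iw
print(cold_transfer_ms(70, iw=10, rtt_ms=25))  # 75.0   + infra axis: closer PoP
```

Raising iw cuts the round count; moving the PoP closer cuts the cost of each round — the gains multiply rather than merely add.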

Seen in

  • sources/2026-04-17-cloudflare-agents-week-network-performance-update — canonical wiki instance. Cloudflare cites "changing how we manage congestion windows" as one of the software-side levers (alongside HTTP/3) that contributed to the Sept → Dec 2025 shift from 40% to 60% of top networks where Cloudflare is fastest — distinct from the PoP-deployment wins at Wroclaw / Malang / Constantine.