ZALANDO 2024-06-17

Source: Zalando — Next level customer experience with HTTP/3 traffic engineering

Summary

Zalando engineer Dmitry Kolesnikov's 2024-06-17 post on HTTP/3 adoption for media-content delivery at Zalando. Two halves: (1) an architectural deep-dive into why HTTP/3 + QUIC were designed the way they were — framed around two root-cause buckets (heterogeneous IP-network utilisation; TCP-era protocol-design inefficiencies) — and (2) a production rollout datum on Zalando's own HTTP/3 deployment (36.6 % of users migrated, ~94 % average-latency improvement, ~96 % p99 latency improvement vs HTTP/2, 0 incidents). The post also names open engineering problems that remain even after QUIC: multiplexing head-of-line artefacts above the transport, memory-management costs of user-space implementations, middlebox hostility to UDP, congestion-control ambiguity (NewReno as standard default, CUBIC as dominant reality, BBR as the 5G-ready algorithm), and IPv4 MTU mismatches below QUIC's 1280-byte floor. Closes with two forward-looking axes — Deep Reinforcement Learning for congestion control (Aurora, Eagle, Orca, PQB) and selectively reliable transport for 4K video streaming — as the research agenda Zalando anticipates driving the next HTTP/3 wave.

Key takeaways

  1. HTTP/3 fuses TCP + TLS + stream-multiplexing into one user-space transport over UDP. The fusion collapses cold-path round trips: "Each 'cold' HTTP/2 request demands about 5 to 6 round-trips (1xDNS, 1xTCP, 3xTLS, 1xHTTP). HTTP/3 requires 3 round-trips (1xDNS, 1xQUIC, 1xHTTP)." QUIC combines cryptographic + transport parameter negotiation, permits application data exchange as early as one RTT, and sets up additional streams with no extra handshake.
  2. Root-cause decomposition: infrastructure vs protocol. The post explicitly splits Web-stack performance problems into (1) IP-network infrastructure utilisation — ~15 hops for EU → eu-central-1 traffic, heterogeneous ISP / Tier-1 / Tier-2 / Tier-3 carriers, radio-spectrum saturation at the access network — and (2) protocol design — excessive signalling, TCP head-of-line blocking, kernel-side congestion control, TLS / transport layer separation, MTU fragmentation. Assumption: infrastructure factors are unchanged for 3–5 years (economic lag), protocol-design improvements are where customer experience gains are actually obtainable.
  3. QUIC's congestion-control story is deliberately unfinished. QUIC "does not aim to standardise the congestion control algorithms" — the spec proposes NewReno (1999) as default but the post names this "confusing," because CUBIC (2008) is the dominant production algorithm for broad internet traffic and BBR (2016) is the one positioned to dominate in future / on 5G. Ownership of congestion control is shifting from hardware vendors (NIC + OS kernel) to software vendors (browser + library authors) — opening innovation surface, including Deep-RL approaches (Aurora, Eagle, Orca, PQB).
  4. QUIC connection migration is a mobile-network native primitive. Quote: "QUIC connections are not strictly bound to a single network path. The protocol supports the connection transfer to a new network path, ensuring a low-latency experience when consumers switch from mobile to WiFi. In the case of HTTP, it always requires a 'cold' start." This is the distinguishing handover capability — not achievable over TCP because TCP connection identity is the 4-tuple (src-ip, src-port, dst-ip, dst-port).
  5. Zalando production datum: 36.6 % HTTP/3, 61.6 % HTTP/2, 1.8 % HTTP/1 fallback. Verbatim: "36.6 % of our users seamlessly migrated to content consumption using HTTP/3 protocol. The average latency for these customers has improved from double digit to single digit value giving about 94 % improvements. The p99 latency has improved from 4th digit value to double digit giving 96 % gain in comparison with HTTP/2. [...] 1.8 % of users fall back to HTTP/1. No incidents or severe anomalies caused by HTTP/3 have been observed by us." The distribution is a usable baseline for any engineering team sizing their own HTTP/3 rollout expectations.
  6. 5G radio-access characteristics invalidate NewReno/CUBIC. "High-reliability plays against the congestion control algorithms used by QUIC. Conventional algorithms are not able to differentiate between the potential causes of packet loss or congestion on the radio channel due to noise, interference, blockage or handover. NewReno and CUBIC have resulted in very poor throughput and latency performance. Only BBR exhibited the lowest round trip time values among all possible physical failure scenarios and can satisfy the typical 5G requirements." The RAN-layer characteristics determine which CC algorithm is workable — this becomes a deployment-time choice, not a protocol-level default.
  7. QUIC inherits new engineering costs as it sheds TCP's. Four named drawbacks still requiring engineering: (a) stream multiplexing still vulnerable at the packet layer — datagram loss still blocks the streams coalesced into it, requiring application-layer traffic prioritisation; (b) memory pressure — out-of-order buffering + user-space kernel-memory-copy overhead; (c) middlebox hostility — ISPs apply different routing / QoS / AQM policies to UDP; operators need reconfiguration (Facebook's Linux-kernel UDP-packet-processing bottleneck, new LB + FW policies); (d) MTU floor of 1280 bytes — matches IPv6 but causes IPv4 fragmentation on non-standard-MTU networks, especially radio channels. PMTUD required for larger datagrams.
  8. Selectively reliable transport is the next HTTP/3 frontier for real-time video. 4K UHD at 60 fps (H.265, 30–50 Mbps, 6–11 ms latency, 99.999 % reliability) is tough even on 5G mid-bands. QUIC's reliable-by-default stream semantics cause stalls when retransmissions block video frame delivery. Zalando names the frontier: "selectively reliable transport wherein not all video frames are delivered reliably" — allow drop over retransmit for certain frames to preserve smoothness — as the research axis that would unlock True-4K real-time video streaming over QUIC.
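The handshake accounting in takeaway 1 can be turned into a back-of-envelope latency model. A minimal sketch — the 50 ms RTT is an illustrative assumption for a mobile path, not a figure from the post:

```python
def cold_start_latency_ms(round_trips: int, rtt_ms: float) -> float:
    """Approximate time-to-first-byte for a cold connection:
    each sequential round trip costs one full RTT."""
    return round_trips * rtt_ms

RTT = 50.0  # illustrative mobile RTT, assumed for this sketch

# HTTP/2 cold path: 1 DNS + 1 TCP + 3 TLS + 1 HTTP ≈ 6 round trips
h2 = cold_start_latency_ms(6, RTT)
# HTTP/3 cold path: 1 DNS + 1 QUIC + 1 HTTP = 3 round trips
h3 = cold_start_latency_ms(3, RTT)

print(h2, h3)  # → 300.0 150.0
```

At this RTT, fusing the TCP and TLS handshakes halves the cold-start wait — the gap widens linearly as RTT grows, which is why the gain concentrates on mobile and long-haul paths.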

Systems extracted

  • Amazon CloudFront — Zalando's HTTP/3 termination point for European customers at AWS Edge Locations. Named as the key enabler.
  • AWS Network Load Balancer — supports QUIC (UDP) load-balancing; named as infrastructure enabler.
  • Akamai Technologies — cited as QUIC-supporting since 2016 (the deepest industry track record).
  • Cronet — Android open-source QUIC/HTTP-3 library; Google Chrome QUIC support since 2012; Apple Safari 14; Firefox May 2021; iOS 15 proprietary QUIC.

Concepts extracted

  • HTTP/3 — extended with Zalando's production rollout datum + the expanded RAN / congestion-control / MTU problem catalogue.
  • QUIC transport — user-space flow + congestion control over UDP. Decoupled from OSI layering; cooperation with application layer (TLS-fused, stream-aware).
  • QUIC connection migration — connection-ID-based, not 4-tuple-based; survives network-path changes.
  • QUIC-TLS fused handshake — collapse transport + cryptographic parameter negotiation into one RTT.
  • Stream multiplexing with independent flow control — QUIC streams are transport-layer independent; loss on one stream does not HoL-block the others.
  • User-space congestion control — CC algorithm lives in the QUIC implementation (library / browser), not the kernel.
  • CUBIC congestion control (RFC 8312, 2008) — dominant internet CC algorithm today.
  • BBR congestion control (2016) — measures bottleneck bandwidth + RTT rather than inferring loss; named as 5G-capable CC algorithm and future internet dominant.
  • NewReno congestion control (RFC 6582, 1999) — QUIC's formal default; named as "confusing" choice given CUBIC's broad deployment.
  • Radio-access-network (RAN) bottleneck — on 5G mid-bands, the physical-link radio-spectrum capacity is the bottleneck; blockages from buildings, vehicles, trees, and human body at mmWave.
  • 5G mid-bands (3.3–4.2 GHz) — Zalando's stated assumption for European deployment, balancing capacity vs coverage vs CAPEX.
  • MTU 1280-byte floor — QUIC's IETF-standardised minimum, matches IPv6, causes fragmentation on non-standard IPv4.
  • UDP middlebox hostility — ISPs apply different routing / QoS / AQM policies to UDP than to TCP.
  • Deep-Reinforcement-Learning (DRL) congestion control — CC as a learnable policy; Aurora, Eagle, Orca, PQB named.
  • Selectively-reliable video transport — drop-frame tolerance instead of retransmit; stall avoidance for real-time streams.
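The connection-migration concept above hinges on where connection identity lives. A minimal sketch of the contrast (types and values are illustrative, not from any QUIC implementation): TCP identity IS the 4-tuple, so any path change destroys the connection; QUIC identity is an opaque connection ID carried in each packet, so the 4-tuple can change underneath a live session.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TcpConn:
    """TCP identity IS the 4-tuple: change any element and the
    peer no longer recognises the connection (cold restart)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int

@dataclass
class QuicConn:
    """QUIC identity is an opaque connection ID; the network path
    is mutable state that can change on WiFi <-> mobile handover."""
    connection_id: bytes
    current_path: tuple  # (src_ip, src_port, dst_ip, dst_port)

    def migrate(self, new_path: tuple) -> None:
        # Path validation elided; the ID — and thus the session,
        # keys, and open streams — survives the handover.
        self.current_path = new_path

tcp = TcpConn("10.0.0.5", 41234, "203.0.113.7", 443)
quic = QuicConn(b"\x8f\x12", ("10.0.0.5", 41234, "203.0.113.7", 443))
quic.migrate(("192.168.1.9", 50120, "203.0.113.7", 443))  # same session
```

The `frozen=True` on the TCP side is the point: its identity is immutable by construction, which is why handover over TCP always means a new connection.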

Patterns extracted

  • User-space transport over kernel transport — shift from kernel TCP to user-space QUIC as the architectural pattern; opens innovation surface, incurs per-packet copy + kernel-cross overhead.
  • CDN as HTTP/3 termination point — Zalando relies on CloudFront to terminate HTTP/3 at the edge; QUIC is a CDN-platform problem, not a per-service problem. Similar adoption shape at Akamai / Cloudflare / Google.
  • Fallback to HTTP/2 over TCP on UDP hostility — a production HTTP/3 deployment must handle both transports; 1.8 % of Zalando users still fall through all the way to HTTP/1.
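The fallback pattern implies a client-side protocol ladder. A hedged sketch of the decision logic (the helper is hypothetical — real clients race transports, typically driven by Alt-Svc advertisements and cached reachability, rather than probing sequentially):

```python
def negotiate_protocol(udp_reachable: bool, h2_supported: bool) -> str:
    """Fallback ladder: try QUIC over UDP first; if middleboxes
    block or degrade UDP, fall back to HTTP/2 over TCP; the last
    resort is HTTP/1.1. (Hypothetical helper for illustration.)"""
    if udp_reachable:
        return "h3"
    if h2_supported:
        return "h2"
    return "http/1.1"

# Zalando's observed distribution maps onto the three rungs:
# 36.6 % land on "h3", 61.6 % on "h2", 1.8 % on "http/1.1".
assert negotiate_protocol(True, True) == "h3"
assert negotiate_protocol(False, True) == "h2"
assert negotiate_protocol(False, False) == "http/1.1"
```

The operational consequence is that the TCP path can never be decommissioned: HTTP/3 is an optimisation layered on top of a mandatory HTTP/2-over-TCP baseline.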

Operational numbers

  • Internet traffic mix: 85 % TCP; HTTP is 54.6 % of internet traffic per Cloudflare Radar; 54.4 % of HTTP traffic goes to mobile devices.
  • HTTP/3 adoption share: 29.8 % of websites worldwide serve HTTP/3 (w3techs at publication time).
  • QUIC adoption share: 8.0 % of websites (w3techs).
  • Cold HTTP/2 handshake cost: ~5–6 RTTs (1 DNS + 1 TCP + 3 TLS + 1 HTTP).
  • Cold HTTP/3 handshake cost: ~3 RTTs (1 DNS + 1 QUIC + 1 HTTP) with 1-RTT effective wait for application data.
  • Hop count: ~15 hops EU → eu-central-1.
  • Mobile-app user abandonment: ~70 % stop using an app if it loads too slowly.
  • 5G peak data rate: up to 20 Gbps; average > 100 Mbps; up to 1M devices/km² device density.
  • 5G mid-band latency: 13 ms at p99.9 downlink, 28 ms at p99.9 uplink in real deployment (even with bad signal -100 dBm to -113 dBm).
  • 4K UHD 60fps requirement: 30–50 Mbps, 6–11 ms packet latency, 99.999 % reliability.
  • HEVC-compressed 4K video: ~5.6 MB / second.
  • Zalando HTTP/3 rollout:
      • 36.6 % of users on HTTP/3.
      • 61.6 % on HTTP/2.
      • 1.8 % fall back to HTTP/1.
      • ~94 % average-latency improvement (double-digit → single-digit ms).
      • ~96 % p99-latency improvement (4-digit → double-digit ms).
      • 0 incidents or severe anomalies from the rollout.
  • MTU floors: IPv6 standard 1280 bytes (= QUIC minimum); Ethernet common 1500 bytes; larger requires PMTUD.
  • Facebook QUIC deployment gotchas (cited): client-side TCP heuristic; download-bandwidth estimation heuristic; Linux-kernel UDP packet-processing bottleneck; new LB + FW policies required.

Caveats

  • Post is 2024-06; Zalando production numbers are from Zalando only — no cross-site breakdown of the 36.6/61.6/1.8 shares. The 94 %/96 % improvement figures are relative (double-digit → single-digit, 4-digit → double-digit) rather than absolute ms values, so comparison to another provider's numbers requires caution.
  • No disclosure of Zalando's congestion-control choice — post cites NewReno, CUBIC, BBR in the landscape but doesn't state which CC algorithm Zalando runs for their HTTP/3 media traffic. Given CloudFront termination, it's implicitly AWS's defaults (likely BBR-family, but unconfirmed).
  • No description of Zalando-specific engineering — the post is industry / protocol-landscape analysis + one rollout datum; no custom HTTP/3 proxy, no in-house QUIC library, no CC-algorithm tuning described. Zalando's HTTP/3 story is a customer of the CDN + AWS Edge + Linux stack, not a systems-engineering effort at the transport level.
  • DRL-for-CC section is research-forward — Aurora / Eagle / Orca / PQB are named as literature references, not as production systems. No Zalando in-house experiment is claimed.
  • Selectively-reliable-transport for video is aspirational — the post positions it as a future direction, not something Zalando has implemented.
  • No observability story — unlike Slack's 2026-03-31 post on HTTP/3 probing gap (concepts/http-3-probing-gap), Zalando does not describe how they monitor HTTP/3 health. Whether they hit the same probing gap and how they closed it is unstated.
  • MTU / middlebox claims are industry-common rather than Zalando-specific — cited via the Facebook engineering post rather than Zalando's own rollout.
