kube-proxy¶
kube-proxy is the Kubernetes node-level component that implements the default Service load balancing. It watches the Kubernetes API for Service and EndpointSlice changes and programs kernel-level rules (iptables, IPVS, or nftables, depending on the proxy mode) so that packets sent to a Service's ClusterIP virtual IP are rewritten (DNAT) to one of the backend pod IPs.
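As a rough illustration of that watch-and-program loop, here is a minimal client-go sketch (assumed names and printing only; real kube-proxy also watches EndpointSlices and writes actual iptables/IPVS rules rather than logging):

```go
package main

import (
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	factory := informers.NewSharedInformerFactory(client, 30*time.Second)
	svcInformer := factory.Core().V1().Services().Informer()

	// On every Service change, a real proxy would resync kernel rules so the
	// ClusterIP DNATs to the current set of ready backend pod IPs.
	svcInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			svc := obj.(*corev1.Service)
			fmt.Printf("sync rules for %s/%s (ClusterIP %s)\n",
				svc.Namespace, svc.Name, svc.Spec.ClusterIP)
		},
		UpdateFunc: func(_, obj interface{}) { /* resync rules */ },
		DeleteFunc: func(obj interface{}) { /* remove rules */ },
	})

	stop := make(chan struct{})
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
	select {} // keep watching
}
```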
How it works (ClusterIP Services)¶
- Layer 4 only. Decisions happen on the kernel's packet path; kube-proxy doesn't parse HTTP / gRPC.
- Per-connection decision. The backend pod is picked once when the TCP connection is established; every subsequent packet on that connection goes to the same pod (sketched in the snippet after this list).
- Basic algorithms. Random selection (iptables mode) or round-robin by default (IPVS mode); no weights, no topology awareness, no per-request decisions.
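A minimal sketch of that per-connection behavior (not kube-proxy's code; in practice the kernel's conntrack table plays the role of the map below):

```go
package main

import (
	"fmt"
	"math/rand"
)

// backends stands in for the ready endpoints behind one ClusterIP.
var backends = []string{"pod-a", "pod-b", "pod-c"}

// connTable mimics conntrack: once a new flow is DNAT'ed to a backend,
// every later packet on that connection keeps hitting the same backend.
var connTable = map[string]string{}

func pickBackend(connID string) string {
	if b, ok := connTable[connID]; ok {
		return b // existing connection: sticky
	}
	b := backends[rand.Intn(len(backends))] // new connection: random pick
	connTable[connID] = b
	return b
}

func main() {
	// One long-lived HTTP/2 connection: every request rides the same flow,
	// so every request lands on the same pod.
	for i := 0; i < 5; i++ {
		fmt.Println("conn-1 request", i, "->", pickBackend("conn-1"))
	}
}
```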
Why this fails for gRPC / HTTP/2¶
gRPC runs on long-lived HTTP/2 connections. Because kube-proxy picks a pod per connection, not per request:
- A client that has one HTTP/2 connection to service X keeps sending every request to the same backend pod, regardless of that pod's load.
- Across many clients, the aggregate distribution is skewed: some pods get orders of magnitude more traffic than others (see the toy simulation after this list).
- Tail latency rises as hot pods saturate, and capacity planning becomes guesswork: average utilization looks fine while p99 blows up.
- kube-proxy can't fix this: it is architecturally L4 and makes its decision per connection, not per request.
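A toy simulation of the aggregate effect (illustrative numbers only, not from the source): each client pins one long-lived connection to a random pod, clients have uneven request rates, and per-pod totals drift far apart compared with per-request balancing.

```go
package main

import (
	"fmt"
	"math/rand"
)

func main() {
	const pods = 3
	const clients = 12

	perConn := make([]int, pods) // requests per pod with connection pinning
	perReq := make([]int, pods)  // requests per pod with per-request balancing

	for c := 0; c < clients; c++ {
		pinned := rand.Intn(pods)        // kube-proxy-style pick at connect time
		requests := 10 + rand.Intn(1000) // some clients are far chattier than others
		perConn[pinned] += requests
		for r := 0; r < requests; r++ {
			perReq[rand.Intn(pods)]++ // what an L7 / per-request balancer would do
		}
	}

	fmt.Println("per-connection pinning:", perConn)
	fmt.Println("per-request balancing: ", perReq)
}
```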
The standard workarounds push LB up to L7: concepts/client-side-load-balancing, concepts/layer-7-load-balancing, or a sidecar/edge proxy.
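For the client-side route, one common pattern (a sketch; the service name and port are hypothetical, and it assumes a headless Service so DNS returns per-pod addresses rather than the ClusterIP) is to let grpc-go resolve all backends and spread RPCs across them:

```go
package main

import (
	"log"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func main() {
	// Hypothetical headless Service: DNS resolves to every ready pod IP,
	// so the gRPC client can hold one subchannel per backend.
	target := "dns:///my-service-headless.default.svc.cluster.local:50051"

	conn, err := grpc.Dial(target,
		// Spread RPCs round-robin across the resolved backends instead of
		// pinning everything to a single HTTP/2 connection.
		grpc.WithDefaultServiceConfig(`{"loadBalancingConfig": [{"round_robin":{}}]}`),
		grpc.WithTransportCredentials(insecure.NewCredentials()),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	// conn can now be shared; create stubs from it as usual.
}
```

The Databricks setup cited below swaps the DNS resolver for a custom xDS control plane, but the in-client, per-backend-subchannel model is the same idea.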
Seen in¶
- sources/2025-10-01-databricks-intelligent-kubernetes-load-balancing — Databricks cites kube-proxy's per-connection L4 model as the root cause of traffic skew on their Scala/gRPC fleet; they bypass it entirely in favor of client-side LB via a custom xDS control plane (systems/databricks-endpoint-discovery-service).