
kube-proxy

kube-proxy is the node-level Kubernetes component that implements the default Service load balancing. It watches the Kubernetes API for Service and EndpointSlice changes and programs kernel-level rules (iptables, IPVS, or eBPF, depending on mode) so that packets sent to a ClusterIP virtual IP are rewritten to one of the backend pod IPs.

How it works (default ClusterIP mode)

  • Layer 4 only. Decisions happen on the kernel's packet path; kube-proxy doesn't parse HTTP / gRPC.
  • Per-connection decision. The backend pod is picked once when the TCP connection is established; every subsequent packet on that connection goes to the same pod.
  • Basic algorithms. Round-robin or random selection; no weights, no topology awareness, no per-request decisions.
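The per-connection behavior above can be modeled in a few lines. This is an illustrative sketch only (kube-proxy programs kernel rules, not userspace code like this): the backend is chosen once, when the connection's first packet arrives, and every later packet on that connection reuses the pinned choice.

```python
import random

class ConnectionLevelBalancer:
    """Toy model of kube-proxy-style balancing: a backend is picked
    when a connection is first seen, then pinned for its lifetime."""

    def __init__(self, backends):
        self.backends = backends
        self.conntrack = {}  # connection id -> pinned backend

    def route(self, conn_id):
        # First packet of a new connection: pick a backend at random
        # (roughly what iptables mode does via the statistic module).
        if conn_id not in self.conntrack:
            self.conntrack[conn_id] = random.choice(self.backends)
        # Every subsequent packet on this connection hits the same pod.
        return self.conntrack[conn_id]

lb = ConnectionLevelBalancer(["pod-a", "pod-b", "pod-c"])
first = lb.route("client1:54321")
# 1,000 more packets on the same connection all land on the same pod:
assert all(lb.route("client1:54321") == first for _ in range(1000))
```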

Why this fails for gRPC / HTTP/2

gRPC runs on long-lived HTTP/2 connections. Because kube-proxy picks a pod per connection, not per request:

  • A client that has one HTTP/2 connection to service X keeps sending every request to the same backend pod, regardless of that pod's load.
  • Across many clients, the aggregate distribution is skewed: some pods get orders of magnitude more traffic than others.
  • Tail latency rises as hot pods saturate, and capacity planning becomes guesswork: average utilization looks fine while p99 blows up.
  • kube-proxy can't fix this — it's architecturally L4 and stateless per-connection.
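The skew is easy to reproduce in a toy simulation (illustrative numbers, hypothetical pod names): five chatty clients each hold one long-lived HTTP/2 connection, so per-connection balancing concentrates all of their requests on whichever pods those five connections happened to land on, while per-request balancing spreads the same traffic across every pod.

```python
import collections
import random

random.seed(7)
backends = [f"pod-{i}" for i in range(10)]

# Five chatty clients, one long-lived HTTP/2 connection each.
# kube-proxy pins each connection to one pod at connect time.
pinned = {client: random.choice(backends) for client in range(5)}

per_conn = collections.Counter()
for client, pod in pinned.items():
    per_conn[pod] += 10_000  # every request rides the pinned connection

# With only 5 connections, at most 5 of the 10 pods ever see traffic.
assert len(per_conn) <= 5

# Per-request (L7) balancing of the same 50,000 requests touches all pods.
per_req = collections.Counter(random.choice(backends) for _ in range(50_000))
assert len(per_req) == 10
```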

The standard workarounds push load balancing up to L7: client-side load balancing (concepts/client-side-load-balancing), a dedicated L7 load balancer (concepts/layer-7-load-balancing), or a sidecar/edge proxy.
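A hedged sketch of the client-side approach (hypothetical helper class, not a real gRPC API): the client resolves the individual pod addresses itself, for example via DNS on a headless Service, keeps a sub-connection per pod, and round-robins each request across them instead of sending everything through the ClusterIP.

```python
import itertools

class PerRequestRoundRobin:
    """Toy client-side balancer: one sub-connection per pod,
    each request routed round-robin across them."""

    def __init__(self, pod_addrs):
        # In a real client these addresses come from DNS on a headless
        # Service or an API-server watch, each backed by its own
        # HTTP/2 sub-connection; here they are plain strings.
        self._cycle = itertools.cycle(pod_addrs)

    def pick(self):
        # Decision happens per request, not per connection.
        return next(self._cycle)

lb = PerRequestRoundRobin(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
picks = [lb.pick() for _ in range(6)]
assert picks == ["10.0.0.1", "10.0.0.2", "10.0.0.3"] * 2
```

Because the decision is made per request, a single long-lived client connection no longer pins all of its traffic to one pod.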
