PATTERN Cited by 1 source
Dynamic subsetting load balancer
Intent
Auto-tune the subset size (the "aperture") that an on-host proxy balances across, so each proxy maintains connections to only a fraction of the total backend pool — and that fraction scales with how much load this host contributes to the destination service. Reduces connection counts, health-check overhead, and hot-spotting in a large service mesh without requiring human tuning.
Motivation
In a service mesh with thousands of callers and thousands of backends, a naive "every proxy knows every backend" design produces:
- Millions of idle connections (each proxy × each backend).
- Health-check amplification (each proxy health-checks every backend).
Subsetting with a fixed aperture is an improvement, but the right aperture depends on the caller's QPS to that service, and that QPS changes over time.
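For a sense of scale, the connection counts behind the full-mesh design and a fixed aperture can be estimated with a back-of-envelope calculation. The fleet sizes and the aperture of 20 below are illustrative assumptions, not figures from the source, and the function names are hypothetical.

```python
def full_mesh_connections(proxies: int, backends: int) -> int:
    """Connections when every proxy connects to every backend."""
    return proxies * backends


def subset_connections(proxies: int, aperture: int) -> int:
    """Connections when every proxy connects to only `aperture` backends."""
    return proxies * aperture


# Illustrative numbers only (assumed, not taken from the source).
PROXIES = 5_000
BACKENDS = 2_000
APERTURE = 20

full = full_mesh_connections(PROXIES, BACKENDS)
subset = subset_connections(PROXIES, APERTURE)
print(f"full mesh: {full:,} connections")
print(f"aperture {APERTURE}: {subset:,} connections")
```

If each proxy health-checks every backend it connects to, health-check traffic shrinks by the same factor as the connection count.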
Uber's dynamic-subsetting design
From Uber's 2022 post (via sources/2022-07-11-highscalability-stuff-the-internet-says-on-scalability-for-july-11th-2022):
"Subsetting in the context of load balancing means dividing the set of potential backend tasks into overlapping 'subsets,' so that while all backends receive traffic in aggregate, each proxy performs the load balancing on a limited set of tasks."
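One way to picture overlapping subsets is a deterministic per-proxy selection. The sketch below uses rendezvous (highest-random-weight) hashing, which is an assumption made here for illustration: the source does not say how Uber assigns backends to subsets. The function `pick_subset` and the proxy and backend IDs are hypothetical.

```python
import hashlib


def pick_subset(proxy_id: str, backends: list[str], size: int) -> list[str]:
    """Pick `size` backends for one proxy using rendezvous hashing.

    Each (proxy, backend) pair gets a stable pseudo-random score, and the
    proxy keeps the highest-scoring backends. Different proxies rank the
    backends differently, so their subsets differ but can overlap.
    """
    def score(backend: str) -> int:
        digest = hashlib.sha256(f"{proxy_id}|{backend}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    k = max(0, min(size, len(backends)))
    return sorted(backends, key=score, reverse=True)[:k]


backends = [f"backend-{i}" for i in range(20)]
for proxy in ("proxy-a", "proxy-b", "proxy-c"):
    print(proxy, pick_subset(proxy, backends, size=5))
```

A property of this choice is that growing or shrinking `size` only adds or removes backends at the tail of the ranking, so a proxy keeps its existing connections when its aperture changes.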
The dynamic insight:
"if an on-host proxy knows how much QPS a callee service is receiving, it could derive the ratio of load it is contributing to the overall traffic. With this information, an on-host proxy could decide its subsetting size dynamically based on ratio (i.e., it should expand its subsetting size if it starts to make more requests to a destination service)."
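The ratio-based sizing described in the quote can be sketched as follows. This is a minimal illustration, assuming the aperture scales linearly with the proxy's share of the callee's traffic and is clamped between a floor and the pool size. The source does not give Uber's actual formula, so the linear mapping, the `min_size` floor, and the name `aperture_size` are all assumptions.

```python
import math


def aperture_size(host_qps: float, service_qps: float,
                  backend_count: int, min_size: int = 3) -> int:
    """Derive a subset size from this proxy's share of the callee's traffic.

    Assumed policy: aperture = ceil(share * backend_count), kept within
    [min_size, backend_count]. The floor preserves some redundancy for
    low-traffic callers.
    """
    if backend_count <= 0:
        return 0
    floor = min(min_size, backend_count)
    if service_qps <= 0 or host_qps <= 0:
        return floor
    # Multiply before dividing to limit floating-point rounding error.
    size = math.ceil(host_qps * backend_count / service_qps)
    return max(floor, min(size, backend_count))


# A proxy contributing 5% of traffic to a 200-backend service.
print(aperture_size(host_qps=50, service_qps=1000, backend_count=200))
# The same proxy after its traffic doubles: the aperture expands.
print(aperture_size(host_qps=100, service_qps=1000, backend_count=200))
```

Re-evaluating this function whenever the observed QPS figures change is what makes the aperture dynamic: a proxy that starts sending more requests to a destination widens its subset without anyone retuning it.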
Production results
Uber ran the system for ~12–18 months across millions of containers:
- A 15–30% reduction in P99 CPU utilization on 8 larger, manually tuned services after the dynamic-subsetting rollout.
- Zero complaints from service owners about subsetting in the 12–18 months since the rollout, down from frequent manual-tuning conversations before.
The second metric — operator-pain elimination — is the harder win; dynamic-subsetting removed an entire class of on-call work.
Related
- systems/envoy — the canonical on-host L7 proxy in service meshes where dynamic subsetting applies.
- concepts/tail-latency-spike-during-queueing — fixed-aperture subsetting contributes to this failure mode when apertures are mis-sized.
- sources/2022-07-11-highscalability-stuff-the-internet-says-on-scalability-for-july-11th-2022.
- companies/highscalability.