CONCEPT Cited by 1 source
Availability Zone balance¶
Availability Zone (AZ) balance is the property of a workload
whose replicas are evenly distributed across the cloud provider's
AZs, so that the failure of any single AZ removes only 1/N of the
replicas (not all of them, and not a majority).
For a 3-AZ region, ideal balance is roughly 33/33/33 across the AZs. "Poor AZ balance" means replicas are clumped, e.g. 60/20/20 or 80/10/10. With an 80/10/10 distribution, losing the most heavily loaded zone takes out 80% of the workload.
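The arithmetic above can be checked with a minimal sketch. The function names (`zone_shares`, `worst_case_loss`) and the replica counts are illustrative assumptions, not part of any cloud provider API.

```python
from typing import Mapping


def zone_shares(replicas_per_az: Mapping[str, int]) -> dict[str, float]:
    """Return each AZ's fraction of the total replica count."""
    total = sum(replicas_per_az.values())
    if total == 0:
        raise ValueError("no replicas to distribute")
    return {az: count / total for az, count in replicas_per_az.items()}


def worst_case_loss(replicas_per_az: Mapping[str, int]) -> float:
    """Fraction of replicas lost if the most heavily loaded AZ fails."""
    return max(zone_shares(replicas_per_az).values())


# Hypothetical replica counts for a 3-AZ region.
balanced = {"az-a": 10, "az-b": 10, "az-c": 10}
clumped = {"az-a": 80, "az-b": 10, "az-c": 10}

print(f"balanced: {worst_case_loss(balanced):.2f}")  # balanced: 0.33
print(f"clumped:  {worst_case_loss(clumped):.2f}")   # clumped:  0.80
```

The worst case is the largest share, which is why a balanced layout bounds the loss at about 1/N while a clumped one does not.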
Why it's non-trivial to achieve¶
At the cluster-autoscaler layer, AZ balance depends on:
- Capacity distribution of the instance types the autoscaler is allowed to provision: if m5.8xlarge is scarce in us-east-1a but plentiful in us-east-1b, a single-instance-type autoscaler will pile into 1b.
- Node-group / ASG topology: ASGs distribute across their configured subnets but don't bin-pack across instance sizes; a node-group-per-shape setup ends up with each node group's distribution independently drifting.
- Pod-level anti-affinity: Kubernetes topologySpreadConstraints can enforce spread at the pod level, but only if the nodes are actually spread.
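The dependency of pod-level spread on node-level spread can be sketched as follows. This is a simplified model of the documented maxSkew rule under whenUnsatisfiable: DoNotSchedule (a zone is eligible when its pod count plus one, minus the global minimum, does not exceed maxSkew). The function names and the `free_slots` data are hypothetical, and details such as minDomains and node affinity filtering are ignored.

```python
from typing import Mapping


def zones_within_max_skew(pods_per_zone: Mapping[str, int], max_skew: int) -> list[str]:
    """Zones where adding one pod keeps (zone count + 1) - global minimum <= max_skew."""
    global_min = min(pods_per_zone.values())
    return sorted(
        zone for zone, count in pods_per_zone.items()
        if (count + 1) - global_min <= max_skew
    )


def schedulable_zones(
    pods_per_zone: Mapping[str, int],
    free_slots_per_zone: Mapping[str, int],
    max_skew: int,
) -> list[str]:
    """Zones that satisfy the skew rule AND have node capacity for one more pod."""
    return [
        zone for zone in zones_within_max_skew(pods_per_zone, max_skew)
        if free_slots_per_zone.get(zone, 0) > 0
    ]


# Hypothetical state: pods are clumped and the under-represented zone has no free nodes.
pods = {"us-east-1a": 3, "us-east-1b": 3, "us-east-1c": 1}
free_slots = {"us-east-1a": 5, "us-east-1b": 5, "us-east-1c": 0}

print(zones_within_max_skew(pods, max_skew=1))          # ['us-east-1c']
print(schedulable_zones(pods, free_slots, max_skew=1))  # []
```

In this model the constraint only permits us-east-1c, but no node there has room, so the pod stays pending until capacity appears in that zone.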
Why ASG-driven autoscaling fails at it (at scale)¶
Salesforce explicitly named poor Availability Zone balance as one of the ASG-era limitations that motivated their Karpenter migration:
"These challenges were further exacerbated by structural limitations in the Auto Scaling group–based architecture, including poor Availability Zone balance and performance bottlenecks in large clusters, particularly for memory-intensive workloads." (Source: sources/2026-01-12-aws-salesforce-karpenter-migration-1000-eks-clusters)
At scale, per-node-group ASG drift compounds across thousands of ASGs (Salesforce had 1,180+ node pools pre-migration), and the aggregate distribution becomes unpredictable. Each ASG can be balanced to within a node or two while the cluster as a whole is not.
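A small worked example shows how per-group drift can add up. The numbers are invented for illustration and are not Salesforce's data: 1,000 hypothetical node groups, each off by a single node, all leaning toward the same AZ (for instance because capacity is scarcer elsewhere).

```python
from collections import Counter
from typing import Iterable, Mapping

ZONES = ("az-a", "az-b", "az-c")


def is_balanced(nodes_per_az: Mapping[str, int], tolerance: int = 1) -> bool:
    """True if the largest and smallest AZ counts differ by at most `tolerance`."""
    counts = [nodes_per_az.get(zone, 0) for zone in ZONES]
    return max(counts) - min(counts) <= tolerance


def aggregate(groups: Iterable[Mapping[str, int]]) -> dict[str, int]:
    """Sum per-AZ node counts across all node groups."""
    total: Counter = Counter()
    for group in groups:
        total.update(group)
    return dict(total)


# Hypothetical fleet: every group is 2/1/1, i.e. within one node of even.
groups = [{"az-a": 2, "az-b": 1, "az-c": 1} for _ in range(1000)]
cluster = aggregate(groups)

print(all(is_balanced(g) for g in groups))       # True
print(cluster)                                   # {'az-a': 2000, 'az-b': 1000, 'az-c': 1000}
print(is_balanced(cluster))                      # False
print(cluster["az-a"] / sum(cluster.values()))   # 0.5
```

Every group passes a one-node tolerance check, yet the cluster ends up at 50/25/25 because nothing coordinates the direction of the drift across groups.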
How Karpenter addresses it¶
Karpenter's scheduler sees the whole cluster at once: pending pods, existing node distribution per AZ, allowed instance types. It can pick the AZ + instance type combination that most improves overall balance as each node is provisioned, instead of growing a specific ASG that happens to match the shape.
Heterogeneous instance types inside one NodePool (GPU / ARM / x86 all valid for certain workloads) widen the capacity pool per AZ, so some valid instance is usually available in the under-represented AZ.
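The effect of a wider instance-type pool can be sketched with a toy greedy chooser. This does not reproduce Karpenter's actual provisioning logic; `pick_zone_and_type`, the node counts, and the per-AZ availability sets are all assumptions made up for illustration.

```python
from typing import Mapping, Optional, Set, Tuple


def pick_zone_and_type(
    nodes_per_az: Mapping[str, int],
    available: Mapping[str, Set[str]],
    allowed_types: Set[str],
) -> Optional[Tuple[str, str]]:
    """Pick the least-populated AZ that has capacity for any allowed instance type."""
    # Visit AZs from fewest nodes to most; break ties by name for determinism.
    for az in sorted(nodes_per_az, key=lambda zone: (nodes_per_az[zone], zone)):
        candidates = sorted(available.get(az, set()) & allowed_types)
        if candidates:
            return az, candidates[0]
    return None  # no allowed type has capacity in any AZ


# Hypothetical cluster state: us-east-1a is under-represented and has no m5.8xlarge capacity.
nodes = {"us-east-1a": 2, "us-east-1b": 8, "us-east-1c": 8}
capacity = {
    "us-east-1a": {"m6g.8xlarge"},
    "us-east-1b": {"m5.8xlarge", "m6g.8xlarge"},
    "us-east-1c": {"m5.8xlarge"},
}

# Single allowed type: the new node is forced into an already crowded AZ.
print(pick_zone_and_type(nodes, capacity, {"m5.8xlarge"}))
# ('us-east-1b', 'm5.8xlarge')

# Wider pool: a valid type exists in the under-represented AZ.
print(pick_zone_and_type(nodes, capacity, {"m5.8xlarge", "m6g.8xlarge"}))
# ('us-east-1a', 'm6g.8xlarge')
```

With only one allowed type the chooser has to skip the emptiest zone. Adding a second acceptable type lets the same rule place the node where it improves balance.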
Related¶
- concepts/blast-radius — AZ balance is what bounds blast radius to ~1/N for zone failures.
- concepts/bin-packing — the scheduler primitive that has to consider AZ as a dimension.
- systems/karpenter — the scheduler that Salesforce moved to for better AZ balance.
- systems/aws-auto-scaling-groups — the predecessor primitive that didn't balance well at thousands-of-ASGs scale.
Seen in¶
- sources/2026-01-12-aws-salesforce-karpenter-migration-1000-eks-clusters — AZ balance named explicitly as a structural ASG limitation Karpenter fixes.