CONCEPT Cited by 1 source
Singleton workload¶
A singleton workload is a service that runs as a single replica — one pod, one instance, no horizontal redundancy. It may be stateful (a leader-elected controller, a singleton job, a DB primary) or stateless-but-not-scalable (a legacy service that assumes in-process state, a batch processor that can't partition).
Why singletons are structurally hazardous under aggressive autoscalers¶
Modern cluster autoscalers like Karpenter rely on consolidation (proactively re-packing workloads onto fewer, larger nodes and retiring the vacated ones) as a core efficiency mechanism. Consolidation is safe for replicated workloads, because evictions honor PDBs and the remaining replicas keep serving, but it has a known failure mode for singletons:
- The autoscaler decides to consolidate pods from node A onto node B.
- It evicts the singleton from node A.
- A replacement pod is scheduled onto node B.
- Between eviction and readiness of the new pod, the service is down. There's no second replica to cover the gap.
PDBs don't help here: a PDB with minAvailable: 1 blocks all evictions on a 1-replica workload, deadlocking drains entirely, while a budget loose enough to permit the eviction permits the outage too.
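The budget arithmetic behind that deadlock can be sketched in a few lines. This is a simplified model, not Kubernetes code: it assumes integer budget values only (no percentages), and the names `Budget` and `disruptions_allowed` are hypothetical, introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Budget:
    """Simplified stand-in for a PodDisruptionBudget (integer values only)."""
    min_available: Optional[int] = None
    max_unavailable: Optional[int] = None


def disruptions_allowed(healthy: int, desired: int, budget: Budget) -> int:
    """Return how many pods may be voluntarily evicted right now.

    healthy: pods currently ready; desired: the workload's replica count.
    """
    if budget.min_available is not None:
        required = budget.min_available
    elif budget.max_unavailable is not None:
        # The required healthy count can never drop below zero.
        required = max(0, desired - budget.max_unavailable)
    else:
        raise ValueError("budget needs min_available or max_unavailable")
    return max(0, healthy - required)


if __name__ == "__main__":
    cases = [
        ("singleton, minAvailable=1", 1, 1, Budget(min_available=1)),
        ("singleton, maxUnavailable=1", 1, 1, Budget(max_unavailable=1)),
        ("two replicas, minAvailable=1", 2, 2, Budget(min_available=1)),
    ]
    for label, healthy, desired, budget in cases:
        allowed = disruptions_allowed(healthy, desired, budget)
        # Pods still serving if the autoscaler uses every allowed eviction.
        remaining = healthy - allowed
        print(f"{label}: evictions allowed={allowed}, still serving={remaining}")
```

In this model the singleton has only two outcomes: zero evictions allowed (the drain blocks) or one eviction allowed with nothing left serving. Only the two-replica case permits an eviction while a pod keeps serving.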
Salesforce's canonical surfacing¶
Salesforce named this explicitly as a migration-uncovered hazard:
"The team discovered that Karpenter's efficient bin-packing and consolidation features could unexpectedly impact applications running single-replica pods, leading to service disruptions in critical scenarios. To address this, we began implementing guaranteed pod lifetime features and workload-aware disruption policies to safeguard these singleton workloads." (Source: sources/2026-01-12-aws-salesforce-karpenter-migration-1000-eks-clusters)
The framing they draw: "effective auto scaling solutions must balance infrastructure efficiency with application availability requirements, particularly for mission-critical services." — singletons are where that trade-off becomes visible.
Mitigation mechanisms¶
- Make them non-singletons. Scale to replicas: 2+ with an active/standby or active/active pattern, then PDB-protect. The cheapest fix when the workload can be replicated.
- Guaranteed pod lifetime: Karpenter (+ Kubernetes) primitives that annotate pods as "don't disrupt for consolidation during lifetime T." Lets the singleton finish its work without being re-packed.
- Workload-aware disruption policies: annotation / label schemes that convey "this pod is a singleton" to the autoscaler so consolidation skips it.
- karpenter.sh/do-not-disrupt annotation: Karpenter-specific escape hatch that opts a pod out of consolidation.
- Node selector pinning: pin singletons to a dedicated node pool with consolidation disabled.
Each mitigation trades efficiency for availability; the right mix depends on how critical the singleton is and how expensive its replacement boot is.
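Two of the mitigations above can be made concrete as manifests. The following is a minimal sketch, not Salesforce's configuration: it builds a single-replica Deployment whose pod template carries the karpenter.sh/do-not-disrupt annotation, and a PodDisruptionBudget for the replicated alternative. The function names, the `app` label, the workload names, and the image references are all hypothetical; the output is JSON, which `kubectl apply -f` accepts alongside YAML.

```python
import json


def singleton_deployment(name: str, image: str) -> dict:
    """Single-replica Deployment whose pods opt out of Karpenter disruption."""
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {
                    "labels": {"app": name},
                    # The annotation goes on the pod template, not the Deployment.
                    "annotations": {"karpenter.sh/do-not-disrupt": "true"},
                },
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


def pdb_for(name: str, min_available: int) -> dict:
    """PodDisruptionBudget for a workload that has been scaled past one replica."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": f"{name}-pdb"},
        "spec": {
            "minAvailable": min_available,
            "selector": {"matchLabels": {"app": name}},
        },
    }


if __name__ == "__main__":
    # Hypothetical workload names and image references, for illustration only.
    print(json.dumps(singleton_deployment("billing-leader", "example.com/billing:1.0"), indent=2))
    print(json.dumps(pdb_for("billing-api", 1), indent=2))
```

The annotation keeps efficiency-driven consolidation away from the pod at the cost of a node that cannot be reclaimed while the pod runs, which is the efficiency-for-availability trade described above.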
Related¶
- concepts/pod-disruption-budget — the primitive that can't help alone here.
- concepts/bin-packing — the mechanism that exposes singletons.
- systems/karpenter — the autoscaler whose consolidation behavior creates the hazard.
- patterns/disruption-budget-guarded-upgrades — the compound pattern that singletons break and require extension of.
Seen in¶
- sources/2026-01-12-aws-salesforce-karpenter-migration-1000-eks-clusters: Salesforce's Karpenter rollout surfaced singletons as a first-class hazard requiring guaranteed-pod-lifetime + workload-aware disruption policies, not just PDBs.