Karpenter¶
Karpenter is an open-source (CNCF) Kubernetes node auto-scaler originally built by AWS. It watches pending pods, solves a bin-packing problem over configured instance types / zones / architectures, and dynamically provisions and de-provisions cloud instances directly — replacing the older Cluster Autoscaler model (which scales pre-defined node groups via ASGs) with per-instance control driven by actual pending pods.
Core design primitives¶
- NodePool — declarative CRD that describes a pool of eligible nodes (limits, disruption policy, taints, labels, supported workloads).
- EC2NodeClass — AWS-specific companion CRD describing the EC2-level details: allowed instance types, root volume size / IOPS / type / throughput, subnets, security groups, AMI family.
- Bin-packing scheduler — picks the instance type + AZ + count that most efficiently absorbs the current set of pending pods.
- Consolidation — continuously re-packs workloads onto fewer, larger nodes and retires the vacated ones; the counterpart of CA's utilization-heuristic scale-down.
- Drift — detects when a running node's configuration has drifted from its NodePool/EC2NodeClass spec (e.g. AMI change) and replaces it.
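The two CRDs pair like this in practice — a minimal sketch (field names follow Karpenter's v1 API for AWS; the `default` names, IAM role, and discovery tags are placeholders, so verify against the version you actually run):

```yaml
# NodePool: which nodes Karpenter may create, and how it may disrupt them
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default                # placeholder name
spec:
  template:
    spec:
      requirements:            # instance diversity lives here, not in per-type node groups
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64", "arm64"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
  limits:
    cpu: "1000"                # hard cap on total CPU this pool may provision
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
---
# EC2NodeClass: the AWS-level details the NodePool delegates to
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiSelectorTerms:
    - alias: al2023@latest
  role: KarpenterNodeRole-my-cluster     # placeholder IAM role
  subnetSelectorTerms:
    - tags: {karpenter.sh/discovery: my-cluster}
  securityGroupSelectorTerms:
    - tags: {karpenter.sh/discovery: my-cluster}
  blockDeviceMappings:                   # the explicit ephemeral-storage mapping
    - deviceName: /dev/xvda              # the hazards section below warns about
      ebs:
        volumeSize: 100Gi
        volumeType: gp3
        iops: 3000
        throughput: 125
```

Note the shape of the split: everything cloud-agnostic (requirements, limits, disruption) sits on the NodePool; everything EC2-specific is referenced via nodeClassRef.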
Contrast with Cluster Autoscaler¶
The canonical wiki writeup is on the other page (systems/cluster-autoscaler). Summary:
| | Cluster Autoscaler | Karpenter |
|---|---|---|
| Capacity primitive | ASG (AWS) | Direct EC2 RunInstances |
| Scaling latency | Minutes | Seconds |
| Instance diversity | One template per ASG | Many types per NodePool |
| AZ balance | ASG-driven (poor) | Scheduler-driven |
| Consolidation | Utilization-heuristic | Continuous bin-packing |
What it solves¶
- Flat over-provisioning of node pools that had to handle both steady-state load and deploy surges — expensive on nights and weekends.
- Rigid node-group boundaries — Karpenter can span many instance types / zones / architectures to match pod resource shapes in one NodePool.
- Scale-down inefficiency of static fleets — continuous consolidation retires under-utilized nodes.
- Subnet-pinned node pools — decoupling provisioning from specific subnets improves IP efficiency.
- Poor AZ balance — the scheduler sees the whole cluster and picks under-represented AZs as it provisions.
- Multi-minute scaling latency — pending-pod-driven provisioning collapses to seconds.
Pairs naturally with systems/keda or HPA for pod-level scaling: pods scale first, Karpenter scales nodes under them.
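That layering can be sketched with a minimal KEDA ScaledObject (the deployment name and SQS trigger are illustrative): KEDA adds pod replicas as queue depth grows, the replicas that don't fit go pending, and Karpenter provisions nodes under them.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: worker-scaler            # illustrative name
spec:
  scaleTargetRef:
    name: worker                 # illustrative Deployment to scale
  minReplicaCount: 2
  maxReplicaCount: 200
  triggers:
    - type: aws-sqs-queue        # illustrative trigger; any KEDA scaler works
      metadata:
        queueURL: https://sqs.us-east-1.amazonaws.com/123456789012/jobs
        queueLength: "10"        # target messages per replica
        awsRegion: us-east-1
```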
Known hazards (from production)¶
From Salesforce's 1,000-cluster migration (2026-01-12):
- Bin-packing + consolidation can terminate singleton pods without warning. Mitigation: guaranteed-pod-lifetime features, workload-aware disruption policies, the karpenter.sh/do-not-disrupt annotation.
- PDB misconfigurations become migration-blocking. Overly restrictive or broken PDBs block node replacement. Fix upstream: audit + OPA-enforced PDB admission validation.
- [[concepts/kubernetes-label-length-limit|63-character label limit]] can be unexpectedly breaking at scale because Karpenter's NodePool/EC2NodeClass matching is label-dependent — legacy human-friendly naming conventions often exceed the limit.
- Ephemeral-storage defaults are not implicit. Moving from ASG to EC2NodeClass requires 1:1 volume-config translation; incomplete translation causes workloads to fail to schedule.
- Parallel cordoning destabilizes clusters. Use sequential cordoning with verification checkpoints instead.
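Two of these mitigations are plain manifest hygiene — a hedged sketch with placeholder names: the karpenter.sh/do-not-disrupt pod annotation blocks voluntary disruption of a singleton, and a correctly-sized PDB lets node replacement drain one pod at a time instead of deadlocking it.

```yaml
# Singleton protection: Karpenter will not voluntarily evict this pod
apiVersion: v1
kind: Pod
metadata:
  name: singleton-worker                 # placeholder
  annotations:
    karpenter.sh/do-not-disrupt: "true"
spec:
  containers:
    - name: worker
      image: example.com/worker:latest   # placeholder image
---
# PDB that permits rolling node replacement instead of blocking it.
# maxUnavailable: 1 always allows progress; by contrast, minAvailable
# equal to the replica count is the misconfiguration that blocks
# cordoning entirely.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: worker-pdb                       # placeholder
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: worker
```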
Scale references¶
- Figma — scoped Karpenter into the ECS→EKS migration for cost savings; used with systems/keda for the pod layer (fast-follow).
- Salesforce — 1,000+ EKS clusters / 1,180+ node pools / thousands of internal tenants. Canonical largest-known Karpenter production reference. Reports:
- Scaling latency minutes → seconds.
- 80% reduction in manual operational overhead.
- 5% FY2026 cost savings, projected +5-10% FY2027.
- Heterogeneous GPU / ARM / x86 in single node pools.
- Eliminated thousands of node groups.
- Datadog's State of Containers report (referenced in the 2026-01-12 post): +22% Karpenter-provisioned node share in the last 2 years across surveyed Kubernetes fleets.
Seen in¶
- sources/2024-08-08-figma-migrated-onto-k8s-in-less-than-12-months — Figma scoped node-level auto-scaling (via Karpenter) into the ECS→EKS migration because cost savings justified the added scope for little extra work. Pod-level auto-scaling (Keda) was deferred to a fast-follow.
- sources/2026-01-12-aws-salesforce-karpenter-migration-1000-eks-clusters — Salesforce's 1,000-cluster migration from Cluster Autoscaler + ASGs to Karpenter. Canonical wiki reference for Karpenter at extreme scale; documents the five operational lessons (PDB hygiene, sequential cordoning, 63-char labels, singleton protection, ephemeral-storage mapping), the three design principles of the in-house transition tool (zero-disruption + rollback + CI/CD-integrated), the automated ASG→Karpenter config mapping approach, and the rollout strategy (phased with soak times, risk-based sequencing).
Related¶
- systems/cluster-autoscaler — the autoscaler Karpenter is displacing.
- systems/aws-auto-scaling-groups — the legacy capacity primitive Karpenter bypasses.
- systems/aws-eks — the typical runtime.
- systems/kubernetes — the orchestrator it scales.
- systems/aws-ec2 — the cloud compute under it.
- systems/keda — the usual pod-layer companion.
- concepts/bin-packing — Karpenter's core algorithm.
- concepts/scaling-latency — the metric Karpenter wins on.
- concepts/pod-disruption-budget — Karpenter's safety contract during consolidation.
- concepts/singleton-workload — the workload class Karpenter's consolidation can harm.
- concepts/availability-zone-balance / concepts/ip-address-fragmentation — two platform-scale wins surfaced at Salesforce.
- concepts/kubernetes-label-length-limit — migration-blocker at Salesforce scale.
- patterns/disruption-budget-guarded-upgrades — the compound safety pattern Karpenter depends on customers to configure.
- patterns/sequential-node-cordoning — Salesforce's operational lesson for node-replacement campaigns.