
PATTERN Cited by 1 source

Warm pool, zero-work create path

Problem

A VM / compute primitive has a user-visible create operation, and the developer experience (DX) requires it to feel instantaneous: under 2 seconds, ideally under 1. The honest create path (provision networking, fetch image, stage filesystem, apply config, start VM) takes many seconds, and users don't want to pay that latency on every new workload.

Pattern

Pre-create VMs on every worker in a warm pool, and serve create requests by dequeuing from the pool. The platform pays the honest create cost before the user asks; the user-visible create is dequeue + bind + start.

Preconditions:

  1. Uniform VM image — every pooled VM looks identical at pool-creation time. Without uniformity, per-user pools balloon. See concepts/no-container-image-sprite.
  2. Cheap idle VMs — pooled-but-unassigned VMs must cost ~nothing (idle-metering off, scale-to-zero economics). See concepts/scale-to-zero.
  3. Cheap binding of user state — moving user data / identity onto a warm VM at dequeue must be fast. In Sprites this is: point the storage stack at the user's URL, start the inner container, propagate the Anycast URL via gossip.

When all three hold, create latency drops to the cost of dequeue + bind + start: about 1-2 seconds for Sprites, instead of the many seconds a cold create takes.
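A minimal sketch of the dequeue + bind + start path, assuming one pool per worker and a single uniform VM shape. The names here (WarmPool, VM, the bind step labels) are hypothetical and chosen for illustration; they are not Fly.io's implementation or API. The slow work is stubbed out and only counted.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VM:
    vm_id: int
    owner: Optional[str] = None  # None while pooled and unassigned
    running: bool = False
    bind_steps: List[str] = field(default_factory=list)


class WarmPool:
    """Hypothetical per-worker pool of identical, pre-created VMs."""

    def __init__(self, depth: int) -> None:
        self._next_id = 0
        self.cold_creates = 0
        # The platform pays the slow create cost up front, before any user asks.
        self._idle = deque(self._cold_create() for _ in range(depth))

    def _cold_create(self) -> VM:
        # Stand-in for the slow path: networking, image fetch,
        # filesystem staging, config. Here it only bumps a counter.
        self.cold_creates += 1
        self._next_id += 1
        return VM(vm_id=self._next_id)

    def create(self, user: str) -> VM:
        # Dequeue a warm VM; fall back to a cold create if the pool is empty.
        vm = self._idle.popleft() if self._idle else self._cold_create()
        # Bind: attach user state to the anonymous VM (illustrative labels only).
        vm.owner = user
        vm.bind_steps += ["attach_storage", "start_inner_container", "announce_route"]
        # Start.
        vm.running = True
        return vm

    def idle_count(self) -> int:
        return len(self._idle)
```

With a pool of depth 2, the first two create calls do no cold work; the third finds the pool empty and pays the cold-create cost, because this sketch never refills.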

Canonical wiki instance — Fly.io Sprites

"Now, today, under the hood, Sprites are still Fly Machines. But they all run from a standard container. Every physical worker knows exactly what container the next Sprite is going to start with, so it's easy for us to keep pools of 'empty' Sprites standing by. The result: a Sprite create doesn't have any heavy lifting to do; it's basically just doing the stuff we do when we start a Fly Machine."

(Source: [[sources/2026-01-14-flyio-the-design-implementation-of-sprites]])

The user-visible effect: sprite create dkjsdjk is a 1-2s operation, fast enough that "the experience of creating and shelling into one is identical to SSH'ing into a machine that already exists."

Relationship to Fly Machines' start-vs-create

The earlier Fly Machines design split the slow part (create) from the fast part (start) and asked the user to keep a pool of stopped Machines. Sprites internalise the pool-management contract: one verb (sprite create), platform handles the pool.

Operational knobs

Undocumented for Sprites, but inherent to the pattern:

  • Pool depth per worker. Too shallow → demand spikes hit cold boot; too deep → idle-VM memory/CPU overhead.
  • Refill cadence. Eagerly refill on every dequeue (fixed pool size) vs. lazily refill under low load (variable pool size).
  • Pool warming post-restart. When a worker restarts / upgrades, does it boot a fresh pool pre-emptively or on first demand?
  • Pool-empty fallback. Cold-boot a new VM? Borrow from a neighbouring worker? Return an error?
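One way to make these knobs concrete is to write them down as configuration and see how they interact. The sketch below is an assumption-laden toy: PoolKnobs, EmptyFallback and Worker are hypothetical names, refills are modelled as a counter that completes later, and none of this reflects documented Sprites behaviour.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EmptyFallback(Enum):
    COLD_BOOT = "cold_boot"  # pay the slow path inline
    BORROW = "borrow"        # try a neighbouring worker first
    ERROR = "error"          # refuse the create


@dataclass
class PoolKnobs:
    depth: int             # target number of idle VMs per worker
    eager_refill: bool     # True: queue a refill on every dequeue
    warm_on_restart: bool  # True: worker starts with a full pool
    on_empty: EmptyFallback


class Worker:
    def __init__(self, knobs: PoolKnobs, neighbour: Optional["Worker"] = None) -> None:
        self.knobs = knobs
        self.neighbour = neighbour
        self.idle_vms = knobs.depth if knobs.warm_on_restart else 0
        self.pending_refills = 0
        self.cold_boots = 0

    def take_warm(self) -> bool:
        if self.idle_vms == 0:
            return False
        self.idle_vms -= 1
        if self.knobs.eager_refill:
            self.pending_refills += 1  # replacement is built in the background
        return True

    def create(self) -> str:
        if self.take_warm():
            return "warm"
        fallback = self.knobs.on_empty
        if fallback is EmptyFallback.BORROW and self.neighbour and self.neighbour.take_warm():
            return "borrowed"
        if fallback is EmptyFallback.ERROR:
            raise RuntimeError("warm pool exhausted")
        self.cold_boots += 1  # COLD_BOOT, or BORROW with nothing to borrow
        return "cold"

    def on_low_load(self) -> None:
        # Lazy refill: top the pool back up only when the worker is quiet.
        if not self.knobs.eager_refill:
            self.pending_refills = max(0, self.knobs.depth - self.idle_vms)

    def finish_refill(self) -> None:
        # Background pre-creates complete and rejoin the pool.
        self.idle_vms += self.pending_refills
        self.pending_refills = 0
```

In this toy, a failed borrow degrades to a cold boot rather than an error; a real platform might reasonably choose differently.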

Other instances

  • AWS Lambda execution environments — Lambda holds warm per-tenant micro-VMs. The "provisioned concurrency" feature is explicitly a per-tenant warm pool exposed as a control knob. Different in that pools are per-tenant (because code varies); same in shape.
  • Kubernetes HPA with minReplicas > 0 at rest: an ops-level warm pool for pods.
  • Database connection pools: the same pattern at the process level rather than the VM level.
  • GC page pools / freelist caches — warm pool at the memory-allocator level.

Sprites' novelty is the single-pool-serves-all-users shape, enabled by dropping the user-container-image decision.

Trade-offs

  • Capacity reservation without demand certainty. Warm pools cost memory / scheduler slots on every worker whether or not users show up.
  • Version-skew headache. Pooled VMs run whatever version the platform shipped at pool-creation time. A platform update is a gradual pool churn, not an atomic flip.
  • Wasted work on pool churn. If the worker restarts / gets evicted before its pool is used, the pre-create work was wasted.
  • Pool starvation under regional spike. If demand spikes in one region, the warm-pool buffer is quickly exhausted — the user-visible latency regresses to cold-boot.
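The tension between reserved capacity and starvation can be illustrated with a toy discrete-time model. This is a sketch under simplifying assumptions: time advances in ticks, the pool regains a fixed number of VMs per tick, and the function and parameter names are hypothetical. It says nothing about real Sprites numbers.

```python
from typing import Iterable, Tuple


def simulate_pool(depth: int, refill_per_tick: int, arrivals: Iterable[int]) -> Tuple[int, int, int]:
    """Return (warm_creates, cold_creates, idle_vm_ticks) for a stream of arrivals.

    idle_vm_ticks counts pooled VMs left unused after each tick's requests,
    a rough proxy for capacity reserved without demand.
    """
    idle = depth
    warm = cold = idle_vm_ticks = 0
    for requests in arrivals:
        served = min(idle, requests)
        warm += served
        cold += requests - served  # pool exhausted: latency regresses to cold boot
        idle -= served
        idle_vm_ticks += idle
        idle = min(depth, idle + refill_per_tick)  # background refill, capped at depth
    return warm, cold, idle_vm_ticks


# Steady trickle, then a spike of 10 creates in one tick.
print(simulate_pool(depth=4, refill_per_tick=1, arrivals=[1, 1, 10, 1]))
# → (7, 6, 6)
```

Raising depth turns more of the spike's cold creates into warm ones, at the price of more idle VM-ticks during the quiet periods.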

Seen in

  • [[sources/2026-01-14-flyio-the-design-implementation-of-sprites]]: canonical wiki instance.