PATTERN Cited by 1 source

Durable micro-VM for agentic loop

Shape

Run an LLM-driven coding agent loop inside a persistent, per-user (or per-project) micro-VM that survives across agent sessions — the VM keeps installed packages, written files, long-running state — with first-class checkpoint / restore as the blast-radius-recovery mechanism instead of VM-replacement. Four load-bearing properties:

  1. Durable across sessions: apt-get install ffmpeg runs once, not once per session; the VM's state is the agent's memory.
  2. Checkpoint / restore as ordinary course: destructive mistakes (bad pip3 install, rm -rf, agent-induced breakage) are recovered by restoring a recent checkpoint, not by spinning up a fresh VM. "Like git, but for the whole system."
  3. Fast boot: 1-2 second create latency, so the VM feels ephemeral in creation even though it is durable in lifecycle; see concepts/fast-vm-boot-dx.
  4. Idle auto-stops metering (concepts/scale-to-zero applied to a durable substrate): the VM isn't de-allocated when the user walks away, just de-metered. On next access it is still there with state intact.
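
Properties 1 and 4 can be illustrated with a toy in-memory model. This is a minimal sketch, not any real Sprites API: the DurableVM class, its methods, and the session lengths are all hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DurableVM:
    """Toy stand-in for a durable micro-VM (hypothetical, not a real API)."""
    packages: set[str] = field(default_factory=set)
    running: bool = False
    metered_seconds: float = 0.0
    installs_performed: int = 0

    def wake(self) -> None:
        # Waking resumes metering; state from earlier sessions is untouched.
        self.running = True

    def idle(self) -> None:
        # Going idle stops metering but does not discard state.
        self.running = False

    def tick(self, seconds: float) -> None:
        # Elapsed time is only billed while the VM is running.
        if self.running:
            self.metered_seconds += seconds

    def ensure_package(self, name: str) -> None:
        # Install only if an earlier session has not already done so.
        if name not in self.packages:
            self.packages.add(name)
            self.installs_performed += 1


def run_session(vm: DurableVM, work_seconds: float) -> None:
    vm.wake()
    vm.ensure_package("ffmpeg")
    vm.tick(work_seconds)
    vm.idle()


vm = DurableVM()
run_session(vm, 600)
vm.tick(86_400)        # a day passes while idle: nothing is metered
run_session(vm, 300)
print(vm.installs_performed, vm.metered_seconds)  # 1 900.0
```

The second session finds ffmpeg already installed, and the idle day adds nothing to the metered total, which is the combination that makes keeping the VM around cheap.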

Canonical post

Thomas Ptacek's 2026-01-09 Sprites-launch manifesto, sources/2026-01-09-flyio-code-and-let-live:

Stop killing your sandboxes every time you use them. […] Someone asked me about this the other day and wanted to know if I was saying that agents needed sound cards and USB ports. And, maybe? I don't know. Not today. […] A computer doesn't necessarily vanish after a single job is completed, and it has durable storage. Since current agent sandboxes have neither of these, I can stop the definition right there.

Contrast with

disposable-VM-for-agentic-loop

The two patterns sit on opposite ends of the concepts/durable-vs-ephemeral-sandbox axis. They share the same micro-VM isolation substrate (systems/firecracker-class kernel-enforced boundary), the same agent-with-root-shell tenancy posture, the same fast-boot DX requirement. They differ on persistence:

| Axis | Disposable VM | Durable VM (this pattern) |
| --- | --- | --- |
| Lifetime | Per session | Indefinite (weeks, months) |
| Blast-radius recovery | VM replacement | Checkpoint restore |
| node_modules rebuild | Every session | Once |
| Installed packages | Session-ephemeral | Persist |
| Written files | Session-ephemeral | Persist |
| Clean-slate | By construction | Via checkpoint restore |
| Idle cost | Zero (VM dies) | Stops-metering-not-dies |
| Per-user storage | Minimal | ~100 GB default (Sprite) |
| Agent can see prior iteration state | No | Yes (via filesystem + logs) |
| Default compute model | Every invocation starts fresh | Picks up where left off |
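
The "node_modules rebuild" row is an amortisation argument, and a short calculation makes it concrete. This is a sketch under stated assumptions: the session count, setup time, and restore count below are hypothetical, and only the ~1s restore figure comes from the Sprites claims.

```python
def disposable_setup_seconds(sessions: int, setup_seconds: float) -> float:
    # Every session starts from a fresh VM, so setup is paid each time.
    return sessions * setup_seconds


def durable_setup_seconds(
    setup_seconds: float, restores: int, restore_seconds: float
) -> float:
    # Setup is paid once; each recovery costs one checkpoint restore.
    return setup_seconds + restores * restore_seconds


sessions = 40    # hypothetical: agent sessions over a month
setup = 90.0     # hypothetical: seconds to install dependencies
restores = 3     # hypothetical: times a checkpoint had to be restored
restore = 1.0    # the ~1s restore figure claimed for Sprites

print(disposable_setup_seconds(sessions, setup))         # 3600.0
print(durable_setup_seconds(setup, restores, restore))   # 93.0
```

The gap grows linearly with the number of sessions, so the comparison matters most for long-lived projects and least for one-shot runs.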

Ptacek's 2026-01-09 argument: when a durable VM with casual checkpointing is available as a product, the disposable-VM pattern's cost of ephemerality (node_modules rebuilds, out-of-sandbox S3/Redis for state, plan-file-as-key-value-store) becomes "unnecessary" for the typical coding-agent workload. Disposable-VM remains the right fit for specific workloads (CI-like test runs, safety-critical untrusted code, one-shot evaluations, security-first contexts).

Canonical wiki substrate

systems/fly-sprites: Fly.io's product, launched 2026-01-09, and the explicit canonical instance. Claims:

  • ~1-2s create latency
  • 100 GB default storage per Sprite
  • Anycast HTTPS URL per Sprite
  • ~1s checkpoint restore
  • "Go idle and stop metering automatically"
  • Not Fly Machines — "entirely new storage stack. Orchestrated differently. No Dockerfiles."
  • "I use dozens" (Ptacek's personal scale)
  • "Casually create hundreds of them" (capacity claim)

Galaxy-brain extension: "dev is prod, prod is dev"

Ptacek's 2026-01-09 post extends the pattern beyond coding-agent scratchpad VMs into a class of long-running personal-use apps:

I have kids. They have devices. I wanted some control over them. So I did what many of you would do in my situation: I vibe-coded an MDM. […] It's a SQLite-backed Go application running on a Sprite. The Anycast URL my Sprite exports works as an MDM registration URL. Claude also worked out all the APNS Push Certificate drama for me. It all just works. […] I've been running this for a month now, still on a Sprite, and see no reason ever to stop. […] For this app, dev is prod, prod is dev.

The durable-micro-VM pattern, at the "kept alive for a month" end, blurs into a personal-app hosting pattern for single-user-or-small-family applications that don't want the operational surface of real production deployment but do want "dev is prod" pragmatics. Extends concepts/vibe-coding from throw-away-prototype framing to long-running-personal-app framing.

Implementation ingredients

  • Fast-boot micro-VM substrate: see concepts/fast-vm-boot-dx; Firecracker-class boot latencies. Sprites claim 1-2s create, aligning with the DX bar.
  • Persistent storage attached per VM: the Sprite default is 100 GB; the substrate must support per-user volume mounting that survives idle/resume.
  • First-class checkpoint primitive: not a bolt-on snapshot API but a CLI command on the user-facing surface (sprite-env checkpoints create, sprite checkpoint restore v1). See concepts/first-class-checkpoint-restore.
  • Idle-stops-metering mechanism: makes keeping "dozens" of durable Sprites economically viable. Sprites "go idle and stop metering automatically"; wake-on-access is implicit in the UX.
  • Anycast + HTTPS URL per VM: makes durable VMs hostable at a stable network name without explicit port-exposure ceremony; essential for the "dev is prod" shape where the Sprite's URL is an integration endpoint (MDM registration, webhook receiver, personal app home).
  • Agent-with-root-shell tenancy (see concepts/agent-with-root-shell): the agent runs as a full root-shell tenant; same as Phoenix.new but durable.
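
One way an agent harness might use the checkpoint primitive is to checkpoint before each risky step and roll back on failure. This is a minimal sketch, assuming a substrate that exposes create and restore operations: the Checkpointable interface, FakeVM, and guarded_step are hypothetical names, and the in-memory fake stands in for a real VM so the example runs anywhere.

```python
import copy
from typing import Callable, Protocol


class Checkpointable(Protocol):
    """Hypothetical interface; a real substrate would expose its own."""
    def create_checkpoint(self) -> str: ...
    def restore_checkpoint(self, checkpoint_id: str) -> None: ...


class FakeVM:
    """In-memory fake so the sketch runs without any real VM."""
    def __init__(self) -> None:
        self.files: dict[str, str] = {}
        self._checkpoints: dict[str, dict[str, str]] = {}

    def create_checkpoint(self) -> str:
        checkpoint_id = f"v{len(self._checkpoints) + 1}"
        self._checkpoints[checkpoint_id] = copy.deepcopy(self.files)
        return checkpoint_id

    def restore_checkpoint(self, checkpoint_id: str) -> None:
        self.files = copy.deepcopy(self._checkpoints[checkpoint_id])


def guarded_step(vm: Checkpointable, step: Callable[[], None]) -> bool:
    """Checkpoint, run one agent step, and roll back if it raises."""
    checkpoint_id = vm.create_checkpoint()
    try:
        step()
        return True
    except Exception:
        vm.restore_checkpoint(checkpoint_id)
        return False


vm = FakeVM()
vm.files["app.py"] = "print('hello')"


def destructive_step() -> None:
    vm.files.clear()  # stands in for an agent-issued rm -rf
    raise RuntimeError("build broke")


ok = guarded_step(vm, destructive_step)
print(ok, vm.files)  # False {'app.py': "print('hello')"}
```

The harness never replaces the VM: recovery is a restore to the last known-good state, which is the blast-radius mechanism the pattern relies on.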

Workloads this pattern fits

  • Long-running coding-agent loops: Ptacek's Claude building the Sprites API documentation one endpoint at a time; compute + network time would have blown ephemeral-sandbox budgets.
  • Agents that need to iterate on the app's lifecycle: "an agent running on an actual computer can exploit the whole lifecycle of the application" (Phoenix.new's agent-reads-app-logs mechanism is the prior instance; Sprites generalise it).
  • Personal apps: MDMs, family tooling, home-lab adjacent use cases, single-user webhooks, APNS-certificate-bearing services; the scale ceiling is measured in users of one family, not users of a million.
  • Long PR iteration cycles: a per-PR Sprite that stays alive while the PR is in review and is picked back up for follow-up comments without re-provisioning.

Workloads this pattern does not fit

  • Safety-first untrusted-code evaluation: the clean-slate-by-construction property of ephemeral sandboxes is itself the safety story.
  • CI-style reproducible test runs: reset-by-default matches the workload.
  • Million-user production apps: in Ptacek's words, "you wouldn't want to ship an app to millions of people on a Sprite." The post gives no numeric ceiling, but it sits below production scale.
  • One-shot agent harness: "99th percentile sandboxed agent run probably needs less than 15 minutes"; for that case the cost of durability isn't earned.
  • Stateless request-per-invocation compute (Lambda-shape): the whole point of Lambda is to amortise a VM across many invocations by keeping the VM ephemeral-per-invocation; durable per-user doesn't fit.
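
The fit and non-fit lists above can be read as a rough decision rule. The sketch below encodes that reading and nothing more: the Workload fields and the suggest_substrate function are hypothetical names, and the rule ordering (disqualifiers first, durability only when state is reused) is an interpretation of the lists rather than anything the source states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    runs_untrusted_code: bool = False
    needs_reproducible_clean_slate: bool = False
    stateless_per_invocation: bool = False
    production_scale_users: bool = False
    reuses_state_across_sessions: bool = False


def suggest_substrate(w: Workload) -> str:
    # Production-scale apps are outside both sandbox patterns.
    if w.production_scale_users:
        return "neither: use a production deployment"
    # Clean-slate-by-construction is the point for these workloads.
    if (
        w.runs_untrusted_code
        or w.needs_reproducible_clean_slate
        or w.stateless_per_invocation
    ):
        return "disposable"
    # Durability is only earned when later sessions reuse earlier state.
    if w.reuses_state_across_sessions:
        return "durable"
    return "disposable"


print(suggest_substrate(Workload(reuses_state_across_sessions=True)))
print(suggest_substrate(Workload(runs_untrusted_code=True,
                                 reuses_state_across_sessions=True)))
```

The first call suggests a durable VM; the second falls back to disposable because untrusted code takes priority over the convenience of kept state.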

Seen in

  • sources/2026-01-09-flyio-code-and-let-live: canonical wiki source. Ptacek's "Fuck Ephemeral Sandboxes" manifesto announcing Sprites as the canonical durable-micro-VM-for-agent product. First-principles argument + product demo + personal-app use case in the same post.