CONCEPT Cited by 4 sources
Durable vs ephemeral sandbox¶
Definition¶
A design-axis distinction for VM/container sandboxes hosting coding agents: does the sandbox persist across agent sessions (durable) or vanish at session end (ephemeral)?
The axis crystallised when Fly.io shipped Sprites (2026-01-09) and Thomas Ptacek explicitly framed the industry's ephemeral-sandbox default as "obsolete": "Stop killing your sandboxes every time you use them." The page documents the trade-off — each end of the axis has a legitimate use case, and the Sprite-era re-framing clarifies rather than retires either choice.
The two ends¶
Ephemeral sandbox¶
Substrate: per-session micro-VM or container; read-only filesystem overlay or tmpfs-backed rootfs; session-end triggers VM-death. Prior wiki canonical instances: Phoenix.new (2025-06-20), the 2025-02-07 VSCode-SSH-bananas sketch of a clean-slate VM for agentic loops (patterns/disposable-vm-for-agentic-loop).
Properties:
- Clean slate every session — no drift, no residual state from yesterday's broken attempt.
- Bounded blast radius by construction — destructive commands can't escape the VM; the VM is going away anyway.
- Small per-agent storage footprint — nothing to persist, nothing to back up.
- Predictable resource footprint — no long-running VMs, no accumulated cruft.
Costs (per Ptacek's 2026-01-09 post):
- `node_modules` rebuilds every session — real dollar cost + real wall-clock cost. "The industry is spending tens of millions of dollars figuring out how to snapshot and restore ephemeral sandboxes" to mitigate this.
- State gets round-tripped through external services — agents can't trust a local file to persist, so infrastructure appears outside the sandbox (S3 / Redis / RDS) solely to give the agent a durable store. "They're building infrastructure to work around the fact that they can't just write a file and trust it to stay put. Gross."
- Time-limited by design — 15-minute sandbox windows work for token-bound tasks but collapse on compute-bound or network-bound workloads (API client integration, large data processing, long test suites with many external calls).
- "Plan files" as state-escape-hatch — agents "round-trip state through 'plan files' which are ostensibly prose but often really just egregiously-encoded key-value stores" because there's no durable VM to store it in.
Durable sandbox¶
Substrate: persistent per-user (or per-project) VM that survives session end; installed packages, written files, and long-running state all persist indefinitely; user explicitly terminates when done. Canonical wiki instance: Fly Sprites (2026-01-09).
Properties:
- Install once, keep forever — `apt-get install ffmpeg` runs once, not once per session.
- Lifecycle access for agents — agents can see logs from the last run, written-then-read files survive the gap, prior iterations inform current ones. "An agent running on an actual computer can exploit the whole lifecycle of the application" (Ptacek, citing Phoenix.new's agent-reads-app-logs mechanism).
- Long-running workload compatible — no 15-minute clock to race; Claude-on-a-Sprite building the Sprites API documentation one endpoint at a time is the canonical use case.
- Cost shape: pay-for-what-you-keep — viable only combined with idle-stops-metering (see scale-to-zero applied to a durable substrate).
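The install-once property can be contrasted with the per-session rebuild cost of an ephemeral sandbox in a few lines of arithmetic. The following is a minimal sketch, not a measurement: `SetupCost`, both function names, and the 90-second install time are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class SetupCost:
    """Hypothetical dependency-install cost for one project."""
    install_seconds: float  # time to install dependencies from scratch


def ephemeral_setup_seconds(cost: SetupCost, sessions: int) -> float:
    # Every session starts from a clean slate, so the install runs each time.
    return cost.install_seconds * sessions


def durable_setup_seconds(cost: SetupCost, sessions: int) -> float:
    # The install runs once; later sessions reuse the persisted filesystem.
    return cost.install_seconds if sessions > 0 else 0.0


cost = SetupCost(install_seconds=90.0)  # assumed value, not a measurement
sessions = 20
print(ephemeral_setup_seconds(cost, sessions))  # 1800.0
print(durable_setup_seconds(cost, sessions))    # 90.0
```

The sketch deliberately ignores storage and metering; it only shows why the setup term grows linearly with session count on the ephemeral end and stays constant on the durable end.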
Costs:
- Drift — yesterday's broken attempt is still there in the filesystem, biasing today's agent; the clean-slate property is forfeit by default.
- Blast-radius management requires a different mechanism — VM-replacement is no longer the safety story; concepts/first-class-checkpoint-restore is the canonical substitute.
- Secret / credential accumulation risk — durable VMs hold user credentials across sessions; compromise implications are bigger than ephemeral.
- Storage cost — 100 GB per Sprite (the Sprite default), multiplied by "dozens" of Sprites per user.
Ptacek's axis-collapsing claim¶
The 2026-01-09 post argues the default-to-ephemeral posture was load-bearing — but only because durable with cheap blast-radius recovery didn't exist as a product. Now that it does (Sprites with ~1s checkpoint restore), the argument for ephemeral as the default sandbox-for-agents collapses on three grounds:
- Clean-slate is purchasable via checkpoint/restore. Pre-damage checkpoint + `sprite checkpoint restore` gives you clean-slate UX on demand, without paying for it every session.
- Blast-radius is still bounded — not by VM replacement, but by VM restore.
- The `node_modules` / external-infrastructure / plan-file costs go away by construction.
"Instead of figuring [those problems] out, just use an actual computer."
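The checkpoint-then-restore recovery posture described above can be illustrated with a toy model. This is a sketch under stated assumptions: `ToySandbox` is a hypothetical in-memory stand-in for a durable filesystem, and `run_guarded` is an illustrative helper; neither reflects the actual Sprites API or CLI.

```python
import copy


class ToySandbox:
    """Hypothetical in-memory stand-in for a durable sandbox filesystem."""

    def __init__(self) -> None:
        self.files = {}         # path -> contents
        self._checkpoints = {}  # checkpoint name -> snapshot of files

    def checkpoint(self, name: str) -> None:
        # Snapshot the current state so it can be restored later.
        self._checkpoints[name] = copy.deepcopy(self.files)

    def restore(self, name: str) -> None:
        # Roll the filesystem back to the named snapshot.
        self.files = copy.deepcopy(self._checkpoints[name])


def run_guarded(sandbox: ToySandbox, action, name: str = "pre-damage") -> bool:
    """Checkpoint, run a risky action, and restore if the action raises."""
    sandbox.checkpoint(name)
    try:
        action(sandbox)
        return True
    except Exception:
        sandbox.restore(name)
        return False


def destructive(sb: ToySandbox) -> None:
    sb.files.clear()  # simulates a destructive command
    raise RuntimeError("build failed after wiping the tree")


sb = ToySandbox()
sb.files["app.py"] = "print('hello')"
ok = run_guarded(sb, destructive)
print(ok)                    # False
print("app.py" in sb.files)  # True: the pre-damage state is back
```

The point of the sketch is the shape of the safety story: the sandbox itself survives, and only its state is rolled back, rather than the whole VM being replaced.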
Decision shape (post-Sprite)¶
| Workload | Preferred sandbox shape |
|---|---|
| Agent evaluation harness (one-shot, safety-first) | Ephemeral — clean-slate-by-construction is the safety story |
| Coding agent iterating on a feature (hours-days) | Durable — lifecycle access, no node_modules rebuild |
| Long-running personal app ("dev is prod, prod is dev") | Durable — the Sprite is the app home |
| CI-style test runner (minutes, reproducible) | Ephemeral — reset-by-default matches the workload |
| API-client integration work (network-time-bound) | Durable — 15-min sandboxes won't fit |
| Security-boundary-critical untrusted-code execution | Ephemeral — the isolation posture is stronger |
The axis is orthogonal to other sandbox axes (capability-based vs ambient-authority; per-request vs per-session; micro-VM vs container-with-strong-isolation). A durable sandbox can still be capability-based; an ephemeral sandbox can still be a micro-VM.
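The decision table can be approximated as a small heuristic. The sketch below is one possible encoding, not a rule stated by any of the sources: the `Workload` fields, the ordering of the checks, and the fallback to ephemeral for short tasks are assumptions; only the 15-minute window comes from the 2026-01-09 post.

```python
from dataclasses import dataclass
from enum import Enum


class Shape(Enum):
    EPHEMERAL = "ephemeral"
    DURABLE = "durable"


@dataclass
class Workload:
    """Hypothetical workload descriptor; field names are illustrative."""
    untrusted_code: bool = False               # security-boundary-critical run
    needs_reproducible_reset: bool = False     # eval harness, CI-style runner
    needs_state_across_sessions: bool = False  # multi-day feature work, app home
    expected_minutes: float = 5.0


SANDBOX_WINDOW_MINUTES = 15  # sandbox window named in the 2026-01-09 post


def preferred_shape(w: Workload) -> Shape:
    # Safety-first and reset-by-default rows take priority.
    if w.untrusted_code or w.needs_reproducible_reset:
        return Shape.EPHEMERAL
    # Anything that needs yesterday's state or overshoots the window.
    if w.needs_state_across_sessions or w.expected_minutes > SANDBOX_WINDOW_MINUTES:
        return Shape.DURABLE
    # Short, stateless tasks: the common case stays ephemeral.
    return Shape.EPHEMERAL


ci_run = Workload(needs_reproducible_reset=True, expected_minutes=10)
feature_work = Workload(needs_state_across_sessions=True, expected_minutes=8 * 60)
api_client = Workload(expected_minutes=45)

print(preferred_shape(ci_run).value)        # ephemeral
print(preferred_shape(feature_work).value)  # durable
print(preferred_shape(api_client).value)    # durable
```

Putting the safety checks first mirrors the table's treatment of untrusted-code execution, where isolation posture outweighs the convenience of persistence.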
What doesn't change with Sprite-era framing¶
- Ephemeral is still the right fit for many workloads. Ptacek's argument is specifically about coding-agent loops, and even there he notes "the 99th percentile sandboxed agent run probably needs less than 15 minutes" — the ephemeral default is correct for the common case.
- The isolation boundary is orthogonal. Durable sandboxes still need kernel-enforced isolation between user VMs (Sprites don't relax the Firecracker-style isolation Fly Machines enforce — they change the persistence story, not the containment story).
- Per-invocation VM isolation (AWS Lambda) is still ephemeral-by-design. Lambda's per-invocation Firecracker-backed VM isolation is a different workload shape than agent loops; the durable-sandbox argument doesn't apply there.
Seen in¶
- sources/2026-01-09-flyio-code-and-let-live — canonical wiki source. Ptacek's "Fuck Ephemeral Sandboxes" manifesto announcing Sprites as the durable alternative. Frames ephemeral as "obsolete" for coding-agent loops; names the three hidden costs (node_modules rebuilds, external-infra for state, plan-file-as-KV-store); demos the durable+checkpoint combination as the recovery posture that replaces VM-replacement.
- sources/2025-02-07-flyio-vscodes-ssh-agent-is-bananas — the ephemeral-default wish-list Fly.io was arguing for ~11 months before the Sprites announcement. "A clean-slate Linux instance that spins up instantly." Historical context: before Sprites, clean-slate-by-VM-replacement was the best blast-radius story Fly.io had.
- sources/2025-06-20-flyio-phoenixnew-remote-ai-runtime-for-phoenix — the ephemeral-per-session product (Phoenix.new) that preceded Sprites. Sprites don't retire Phoenix.new; Phoenix.new occupies the ephemeral end of the axis, Sprites occupy the durable end.
- sources/2026-03-10-flyio-unfortunately-sprites-now-speak-mcp — "Fuck Stateless Sandboxes" reprise. Ptacek's 2026-03-10 post restates the 2026-01-09 manifesto under an evolved banner: "The industry is stuck on 'sandboxes' as a way of letting agents run code, and sandboxes aren't good enough anymore. What agents want is real computers, with real filesystems, connected to real networks, and there's no technical reason not to give them some." Semantic shift: ephemeral (lifetime — dies at session end) → stateless (reachability — no persistent filesystem, no real network). The two framings compose: both name ways the industry's default agent-compute abstraction is too impoverished. The example prompts in the 2026-03-10 post ("On a new Sprite, benchmark this function across 1000 runs", "run a load generator against this endpoint for 60 seconds", "update all the dependencies on this project to their newest versions and test") are all durable-favoring — all overshoot the 15-minute ephemeral-sandbox budget named in the 2026-01-09 post.
Related¶
- concepts/first-class-checkpoint-restore — the load-bearing property that makes durable-without-fragile work.
- concepts/agentic-development-loop — the workload that motivates the axis.
- concepts/agent-with-root-shell — the tenancy posture shared by both ends of the axis (Phoenix.new and Sprites both give the agent root).
- concepts/fast-vm-boot-dx — precondition for either end (ephemeral needs fast-boot to be viable per-session; durable needs fast-boot-from-checkpoint to make restore casual).
- concepts/capability-based-sandbox — orthogonal authority-layer axis; composes with both ephemeral and durable.
- patterns/durable-micro-vm-for-agentic-loop — the pattern on the durable end.
- patterns/disposable-vm-for-agentic-loop — the pattern on the ephemeral end.
- patterns/ephemeral-vm-as-cloud-ide — productised ephemeral variant.
- systems/fly-sprites — canonical durable product.
- systems/phoenix-new — canonical ephemeral product.
- systems/fly-machines — the substrate both products build on (Phoenix.new uses it directly; Sprites use a different orchestrator over a different storage stack).