SYSTEM Cited by 4 sources
Fly Sprites¶
Definition¶
Fly.io's durable, checkpointable, Anycast-addressed per-user micro-VM primitive for coding-agent tenancy, announced 2026-01-09. The product tagline: "BIC disposable cloud computers" — single-user cloud Linux boxes that create in 1-2 seconds, hold 100 GB by default, go idle (stop metering) but don't die, expose an HTTPS URL via Fly Anycast, and support first-class checkpoint/restore with ~1s restore latency.
Distinguishing property vs. Fly Machines: Sprites are designed around durability rather than horizontal-scaling-statelessness. The substrate is explicitly not Fly Machines — "They have an entirely new storage stack. They're orchestrated differently. No Dockerfiles."
Disclosed capabilities (2026-01-09)¶
| Property | Value |
|---|---|
| Create latency | 1-2 seconds ("about the same as ssh into an existing host") |
| Default storage | 100 GB |
| Durability | Indefinite — "don't die until I tell them to" — weeks / months |
| Idle | Auto-sleep; auto-stop-metering (scale-to-zero variant) |
| Network | Fly Anycast HTTPS URL per Sprite |
| Checkpoint create | "Completes instantly" (sprite-env checkpoints create) |
| Checkpoint restore | ~1 second (sprite checkpoint restore <name>) |
| CLI | sprite create, sprite console, sprite-env checkpoints create, sprite checkpoint restore |
| Shell access | Root by default; agent + user share the shell |
| Dockerfile | Not used — different image-composition story (unspecified) |
| Substrate | "Entirely new storage stack", "orchestrated differently" from Fly Machines |
| Product entry | sprites.dev |
| Target usage | "I use dozens", "casually create hundreds of them" |
| Workload ceiling | Not million-user production apps (explicit — "you wouldn't want to ship an app to millions of people on a Sprite") |
Design thesis — "Claude Doesn't Want A Stateless Container"¶
The product's argumentative core is that ephemeral read-only sandboxes are the wrong abstraction for coding agents. Stateless-container architectures buy "simplicity, flexible scale-out, and reduced failure blast radius" at the cost of forcing state into external services (databases, S3, Redis). This trade works for professional developers, who are trained to design around it. It fails for coding agents, which "work best when you find a way to let [them] zap [themselves]" — they want durable storage that survives across tasks, they want to install ffmpeg once and keep it, they want to trust a written file to still be there tomorrow.
See concepts/durable-vs-ephemeral-sandbox for the full axis. See sources/2026-01-09-flyio-code-and-let-live for Ptacek's canonical statement.
First-class checkpoint / restore¶
The property that rescues durable from fragile. Ptacek's demo: rm -rf $HOME/bin, dd if=/dev/random of=/dev/vdb, an ill-advised global pip3 install — "everything's broken. So: sprite checkpoint restore v1". One second later the Sprite is back to the pre-damage state. Ptacek's framing: not an escape hatch, but an intended part of the ordinary course of using a Sprite. Like git, but for the whole system.
The canonical wiki page is concepts/first-class-checkpoint-restore. The key claim is that cheap-to-create + cheap-to-restore + ordinary-course UX turns system-state into something you can version-control casually, rather than an escape-hatch-for-emergencies.
Anycast + HTTPS URL per Sprite¶
Each Sprite is addressable on Fly.io's Anycast network with its own HTTPS URL. Ptacek's kid-MDM Sprite exposes its Anycast URL as an MDM-registration endpoint for APNS-protected device enrollment — "It all just works." It is the same Anycast substrate underneath systems/fly-proxy, systems/flycast, and Phoenix.new's *.phx.run preview URLs, but exposed at per-user-durable-VM granularity.
Relationship to Fly Machines¶
Fly.io's self-critique in the post:
We built a platform for horizontal-scaling production applications with micro-VMs that boot so quickly that, if you hold them in exactly the right way, you can do a pretty decent code sandbox with them. But it's always been a square peg, round hole situation.
Sprites are the round-peg-for-the-round-hole. Named differences vs. Fly Machines:
- Storage stack — entirely new (details deferred to a follow-up post).
- Orchestrator — different (details deferred).
- Image composition — no Dockerfile (details deferred).
- Target workload shape — single-user durable dev/ prototype/personal-app vs. Fly Machines' many-replica stateless production service.
systems/fly-machines stays in Fly.io's lineup for horizontal-scaling apps; Sprites take over the sandbox / coding-agent / personal-app use case.
Relationship to Phoenix.new¶
systems/phoenix-new (2025-06-20) was Fly.io's first productised agent-tenant VM — ephemeral per-session, Elixir/Phoenix-specific, root shell shared between user and agent, full Chrome + *.phx.run preview URLs. Sprites generalise the shape along two axes:
- Beyond Elixir — Sprites are language-agnostic (no pre-installed framework); the sprite create / apt-get flow is the composition surface.
- Beyond per-session — Sprites are indefinite-duration; session-end doesn't imply VM-death.
The canonical pattern distinction is patterns/durable-micro-vm-for-agentic-loop (Sprites shape) vs. patterns/disposable-vm-for-agentic-loop (Phoenix.new shape).
Target usage shape¶
The post discloses a personal-scale usage pattern that Ptacek explicitly normalises:
- "I use dozens" — one Sprite per task / project / agent session kept alive for as long as useful.
- "Casually create hundreds of them" — capacity implicit.
- Claude-driven Sprites "building and testing examples for the API one at a time" (documentation site for the Sprites API was built this way — compute + network time would have blown ephemeral-sandbox budgets).
- "Vibe-coded MDM" running for a month on a single Sprite handling APNS Push certificates for Ptacek's kids' devices.
The per-user fleet scale claim ("dozens" / "hundreds") depends on Sprites' "go idle and stop metering automatically" behaviour — if idle Sprites weren't metered down, keeping dozens alive wouldn't be economically viable.
Operational numbers not disclosed¶
- Kernel-level isolation primitive (Firecracker? A different VMM? Hardware virt vs. container-with-strong-isolation?) — the post is silent.
- Checkpoint storage location, eviction, retention defaults.
- Wake-up latency from idle (gestured at, but never quantified).
- Idle-detection criteria (CPU? Network? TTY?).
- Per-user or per-org storage ceiling beyond the 100 GB default.
- Concurrent-user / throughput ceiling on a single Sprite.
- Pricing structure (idle rate, active rate, storage rate, network rate, checkpoint rate).
- Comparison numbers vs. E2B, Modal, Daytona, Runloop, Replit Agent VMs, Cloudflare Sandbox SDK, AWS Firecracker-based offerings.
All deferred to the promised follow-up post ("we wrote another 1000 words about how they work, but I cut them out").
Seen in¶
- sources/2026-01-09-flyio-code-and-let-live — launch post, Ptacek's "Fuck Ephemeral Sandboxes" manifesto, canonical wiki source for Sprites.
- sources/2026-01-14-flyio-the-design-implementation-of-sprites — implementation deep-dive, 5 days after launch. Discloses the three orchestration decisions (no-user-container-image + warm pool; JuiceFS-derived object-store-rooted storage with SQLite+Litestream metadata; inside-out orchestration with inner-container-hosted user code). Confirms Sprites run on Fly Machines today, with an open-source local runtime in progress. The orchestrator is an Elixir/Phoenix app using object storage for account metadata plus per-account SQLite-via-Litestream. Anycast URL propagation uses the existing Corrosion system.
- sources/2026-02-04-flyio-litestream-writable-vfs — block-map disclosure refinement. Names the Sprite metadata DB specifically "the block map" — a (file → chunks → object-store keys) map, "low tens of megabytes worst case". Discloses that the storage stack must serve writes milliseconds after Sprite boot during request-driven Sprite wake. Introduces the two Litestream VFS mechanisms that make the boot budget feasible: writable mode (LITESTREAM_WRITE_ENABLED=true — single-writer, ~1 s buffered sync, eventual durability) and background hydration (LITESTREAM_HYDRATION_PATH=... — dm-clone-style local-file fill while serving remote reads). Also confirms the global Sprites orchestrator runs Litestream with per-org SQLite databases directly on S3-compatible object storage ("unlike our flagship Fly Machines product, which relies on a centralized Postgres cluster"). The block-map use is the canonical production consumer of patterns/writable-vfs-with-buffered-sync and patterns/background-hydration-to-local-file.
Implementation highlights (from 2026-01-14)¶
The launch post promised a follow-up on "how they work"; [[sources/2026-01-14-flyio-the-design-implementation-of-sprites|this is it]]. Three load-bearing decisions:
1. No user-facing container image + warm pool¶
"Sprites get rid of the user-facing container. Literally: problem solved. […] Every physical worker knows exactly what container the next Sprite is going to start with, so it's easy for us to keep pools of 'empty' Sprites standing by. The result: a Sprite create doesn't have any heavy lifting to do; it's basically just doing the stuff we do when we start a Fly Machine."
The 1-2 s create latency advertised at launch is a warm-pool dequeue, not an honest cold boot. See concepts/no-container-image-sprite, concepts/warm-sprite-pool, patterns/warm-pool-zero-create-path.
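The warm-pool mechanism above can be sketched in a few lines. This is an illustrative model, not Fly.io's implementation: the class, its fields, and the dict-shaped "sprite" records are all invented for the sketch; only the idea (pre-boot identical empty VMs, make create a dequeue) comes from the post.

```python
import collections
import itertools


class WarmPool:
    """Sketch of a warm-pool zero-create path: workers pre-boot 'empty'
    Sprites from the one base image they all know about, so the
    user-facing create is just a dequeue plus a background refill."""

    def __init__(self, target_size: int):
        self.target = target_size
        self.pool = collections.deque()
        self._ids = itertools.count()
        self.refill()

    def _boot_empty_sprite(self) -> dict:
        # Stands in for the slow path: actually booting a micro-VM.
        # Because there is no user-facing image, every pooled VM is
        # interchangeable until someone claims it.
        return {"id": next(self._ids), "state": "idle", "owner": None}

    def refill(self) -> None:
        while len(self.pool) < self.target:
            self.pool.append(self._boot_empty_sprite())

    def create(self, owner: str) -> dict:
        # The user-visible "create": no heavy lifting, just claim a
        # pre-booted VM, then top the pool back up.
        sprite = self.pool.popleft()
        sprite.update(owner=owner, state="running")
        self.refill()
        return sprite
```

The point of the sketch is that `create` never touches the slow boot path in the request's critical section; the pool absorbs it.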
2. Object storage as disk root¶
The 100 GB durable storage is backed by S3-compatible object storage, not attached NVMe. NVMe is a sparse, disposable read-through cache in front of immutable chunks:
"Our stack sports a dm-cache-like feature that takes advantage of attached storage. A Sprite has a sparse 100GB NVMe volume attached to it, which the stack uses to cache chunks to eliminate read amplification. Importantly (I can feel my resting heart rate lowering) nothing in that NVMe volume should matter; stored chunks are immutable and their true state lives on the object store."
Storage-stack shape: a hacked-up JuiceFS with a rewritten SQLite metadata backend kept durable via Litestream. See concepts/object-storage-as-disk-root, concepts/metadata-data-split-storage, concepts/read-through-nvme-cache, patterns/read-through-object-store-volume, patterns/metadata-plus-chunk-storage-stack.
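The read-through-cache relationship between NVMe and the object store can be modelled with two dicts. This is a toy under stated assumptions (dicts standing in for an NVMe volume and an S3-compatible store; the class and method names are invented); the load-bearing properties it demonstrates are the ones the quote names: chunks are immutable, the cache only eliminates read amplification, and losing it loses nothing.

```python
class ReadThroughChunkCache:
    """Sketch: the object store is the source of truth for immutable
    chunks; local NVMe (here, a dict) is a disposable read-through
    cache in front of it."""

    def __init__(self, object_store: dict):
        self.object_store = object_store  # chunk key -> bytes (durable)
        self.nvme = {}                    # chunk key -> bytes (disposable)
        self.remote_reads = 0

    def read_chunk(self, key: str) -> bytes:
        if key not in self.nvme:          # miss: fetch from the store once
            self.remote_reads += 1
            self.nvme[key] = self.object_store[key]
        return self.nvme[key]             # repeat reads: no amplification

    def drop_cache(self) -> None:
        # "Nothing in that NVMe volume should matter": chunks are
        # immutable and their true state lives on the object store,
        # so discarding the cache is always safe.
        self.nvme.clear()
```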
Durable state of a Sprite is a URL — migration and physical-failure recovery are pointer-move operations. Checkpoint and restore are metadata-shuffle operations: "like a git restore, not a system restore". "(our pre-installed Claude Code will checkpoint aggressively for you without asking)." See concepts/durable-state-as-url, concepts/fast-checkpoint-via-metadata-shuffle, patterns/checkpoint-as-metadata-clone, concepts/fleet-drain-operation.
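Checkpoint-as-metadata-shuffle follows directly from that split: if the disk is a small map over immutable chunks, a checkpoint is a copy of the map, not of the data. A minimal sketch, with invented names and an append-only dict standing in for the chunk store:

```python
class SpriteDisk:
    """Sketch: the disk is a block map (file -> object-store chunk keys)
    over immutable chunks. Checkpoint clones the small map; restore
    swaps the live map back. No chunk data moves either way, which is
    why create "completes instantly" and restore is ~1 s."""

    def __init__(self):
        self.chunks = {}      # object store: key -> bytes, append-only
        self.block_map = {}   # file path -> ordered list of chunk keys
        self.checkpoints = {} # name -> frozen copy of a block map

    def write(self, path: str, data: bytes) -> None:
        key = f"chunk-{len(self.chunks)}"
        self.chunks[key] = data           # new immutable chunk
        self.block_map[path] = [key]      # re-point the file at it

    def read(self, path: str) -> bytes:
        return b"".join(self.chunks[k] for k in self.block_map[path])

    def checkpoint(self, name: str) -> None:
        # "Like a git restore, not a system restore": only metadata
        # is cloned, so this is O(map size), not O(disk size).
        self.checkpoints[name] = {p: list(ks) for p, ks in self.block_map.items()}

    def restore(self, name: str) -> None:
        # Old chunks still exist in the store; restoring is a pointer swap.
        self.block_map = {p: list(ks) for p, ks in self.checkpoints[name].items()}
```

Under this model, Ptacek's demo is: write, `checkpoint("v1")`, clobber everything, `restore("v1")`, and the pre-damage reads come back.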
3. Inside-out orchestration¶
"In the cloud hosting industry, user applications are managed by two separate, yet equally important components: the host, which orchestrates workloads, and the guest, which runs them. Sprites flip that on its head: the most important orchestration and management work happens inside the VM. Here's the trick: user code running on a Sprite isn't running in the root namespace. We've slid a container between you and the kernel."
The VM's root namespace hosts the storage stack, service manager, log pipeline, port-forwarding proxy, and platform-API handler. The user sees an inner container; the platform operates from the outer namespace. Key properties: bounce without kernel reboot (inner container restart is cheap), platform-changes blast-radius = new VMs (existing VMs keep running old code until they bounce), and per-Sprite global API ("you're talking directly to your own VM"). See concepts/inside-out-orchestration, patterns/inside-out-vm-orchestration, patterns/blast-radius-in-vm-not-host.
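The bounce-without-kernel-reboot property can be illustrated at process granularity. In this sketch (an analogy, not the real mechanism) the supervisor process stands in for the VM's root namespace and a child process stands in for the inner container holding user code; all class and method names are invented:

```python
import subprocess
import sys


class OuterSupervisor:
    """Sketch of the inside-out shape: platform machinery lives in the
    outer scope (this process), user code runs one level down (a child
    process standing in for the inner container). Bouncing user code
    never takes down the outer scope."""

    def __init__(self, user_cmd: list):
        self.user_cmd = user_cmd
        self.inner = None
        self.boots = 0

    def start_inner(self) -> None:
        self.inner = subprocess.Popen(self.user_cmd)
        self.boots += 1

    def bounce(self) -> None:
        # Cheap restart of the inner container; the outer services
        # (storage stack, proxy, platform API) keep running throughout.
        self.inner.terminate()
        self.inner.wait()
        self.start_inner()


# Usage: a long-running stand-in for user code, bounced once.
sup = OuterSupervisor([sys.executable, "-c", "import time; time.sleep(60)"])
sup.start_inner()
sup.bounce()
sup.inner.terminate()
sup.inner.wait()
```

The analogy also shows the blast-radius claim: upgrading the supervisor only affects inner processes it starts afterwards, just as platform changes only reach new VMs until existing ones bounce.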
Global orchestrator (2026-01-14 disclosure)¶
Not a VM, not a worker-side service — an Elixir/Phoenix app:
"The global orchestrator for Sprites is an Elixir/Phoenix app that uses object storage as the primary source of metadata for accounts. We then give each account an independent SQLite database, again made durable on object storage with Litestream."
The per-account SQLite pattern extends Litestream's wildcard-replication story from [[sources/2025-05-20-flyio-litestream-revamped|Litestream revamped]] into a multi-tenant orchestrator.
Integration with Fly.io's existing substrate¶
Sprites plug into Fly.io's existing Corrosion SWIM-gossip state-distribution system for Anycast HTTPS URLs:
"When you ask the Sprite API to make a public URL for your Sprite, we generate a Corrosion update that propagates across our fleet instantly. Your application is then served, with an HTTPS URL, from our proxy edges."
The public URLs are served via systems/fly-proxy + Corrosion — the same substrate Fly Machines use. No new state-distribution system was built for Sprites.
Sprites as contract, not substrate¶
"Sprites are a contract with user code: an API and a set of expectations about how the execution environment works. Today, they run on top of Fly Machines. But they don't have to. Jerome's working on an open-source local Sprite runtime. We'll find other places to run them, too."
Today Sprites = Fly Machines underneath. Tomorrow could be a different substrate. The contract is the product.
Positioning vs. Fly Machines (2026-01-14)¶
Sprites and Fly Machines are different optimisation points:
- LSVD's (disk-off-object-storage for Fly Machines) performance ceiling is inadequate for a hot Postgres; LSVD stays experimental, while Sprites take the object-store route for their own use case.
- "Professional production apps ship out of CI/CD systems as OCI containers; that's a big part of what makes orchestrating Fly Machines so hard."
- "Most (though not all) Sprite usage is interactive, and Sprite users benefit from their VMs aggressively sleeping themselves to keep costs low; e-commerce apps measure responsiveness in milliseconds and want their workloads kept warm."
Prescribed workflow: "prototype and acceptance-test an application on Sprites. Then, when you're happy with it, containerize it and ship it as a Fly Machine to scale it out. An automated workflow for that will happen."
2026-02-04 disclosure: the Sprite "block map" is Litestream VFS¶
sources/2026-02-04-flyio-litestream-writable-vfs fills in the previously-vague "rewritten SQLite metadata backend… kept durable with Litestream" phrasing from the 2026-01-14 design post with the actual shape of the Sprite storage stack's metadata tier:
"The system that does this is JuiceFS, and the database — let's call it 'the block map' — is a rewritten metadata store, based (you guessed it) on BoltDB. I kid! It's Litestream SQLite, of course."
Concretely: the Sprite disk stack is a forked JuiceFS where the metadata tier is SQLite + Litestream VFS, running in writable + hydration mode (the two new 2026-02-04 VFS features). The block map is the file → chunks → object-store keys lookup table JuiceFS normally keeps in Redis / MySQL / Postgres / TiKV, reimplemented with:
- Object storage as durable root — the block map, like Sprite storage generally, roots at S3-compatible object store.
- Page-level reads via HTTP Range GETs against LTX files during cold boot.
- Background hydration to a local temp file for steady-state read throughput (concepts/async-clone-hydration at SQLite-database granularity).
- Writable VFS with single-writer semantics and ~1 s buffered sync to object storage — matching the "eventual durability" property the rest of the Sprite storage stack already exhibits.
Size profile: "Block maps aren't huge, but they're not tiny; maybe low tens of megabytes worst case."
Gating latency constraint: "If the Fly Machine underneath a Sprite bounces, we might need to reconstitute the block map from object storage… this is happening while the Sprite boots back up… that's something that can happen in response to an incoming web request; that is, we have to finish fast enough to generate a timely response to that request. The time budget is small." Read-only VFS cleared cold-boot but "not fast enough for steady state" — hydration closes that gap.
This is the first wiki instance of patterns/metadata-plus-chunk-storage-stack where the metadata tier is itself object-store-rooted (via Litestream VFS) rather than running on traditional local-disk storage, a recursive application of the metadata/data split pattern. It also nails down the BoltDB-vs-SQLite storage-choice framing Fly has repeatedly discussed — the Sprite block map went SQLite (Johnson's "I kid! It's Litestream SQLite, of course" is the punchline).
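The two VFS mechanisms combine into one write/read discipline: writes land locally and immediately, dirty pages are flushed to object storage on a ~1 s cadence (the "eventual durability" window), and cold reads fall through to the remote copy while the local file fills. A deliberately simplified sketch, with invented names, dicts standing in for the object store and local file, and an explicit clock instead of wall time:

```python
class WritableVFS:
    """Sketch of the writable + hydration VFS shape: a single writer
    buffers dirty pages locally and syncs them to object storage on a
    short interval (~1 s for Litestream), so writes never wait on the
    store; reads of pages not yet hydrated fall through to the store."""

    def __init__(self, object_store: dict, sync_interval: float = 1.0):
        self.object_store = object_store  # durable root (page -> bytes)
        self.local = {}                   # hydrating local file (fast path)
        self.dirty = set()
        self.sync_interval = sync_interval
        self.last_sync = 0.0

    def write_page(self, page: int, data: bytes, now: float) -> None:
        self.local[page] = data           # returns immediately, local only
        self.dirty.add(page)
        if now - self.last_sync >= self.sync_interval:
            self.sync(now)                # periodic buffered sync

    def sync(self, now: float) -> None:
        for page in self.dirty:           # flush the buffered batch out
            self.object_store[page] = self.local[page]
        self.dirty.clear()
        self.last_sync = now

    def read_page(self, page: int) -> bytes:
        # Cold boot: page not hydrated yet -> fetch the remote copy
        # (a Range GET against LTX files in the real system) and keep it.
        if page not in self.local:
            self.local[page] = self.object_store[page]
        return self.local[page]
```

Note the trade the sketch makes visible: between syncs, acknowledged writes exist only locally, which is exactly the eventual-durability window the post accepts for the block map.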
Global orchestrator: per-org Litestream SQLite¶
The 2026-02-04 post also sharpens the global-orchestrator description from 2026-01-14:
"Litestream SQLite is the core of our global Sprites orchestrator. Unlike our flagship Fly Machines product, which relies on a centralized Postgres cluster, our Elixir Sprites orchestrator runs directly off S3-compatible object storage. Every organization enrolled in Sprites gets their own SQLite database, synchronized by Litestream."
One SQLite DB per organization, object-store-rooted, Litestream-synced. Explicitly contrasted with Fly Machines' centralized Postgres. "It takes advantage of the 'many SQLite databases' pattern, which is under-appreciated. It's got nice scaling characteristics. Keeping that Postgres cluster happy as Fly.io grew has been a major engineering challenge." Second wiki instance of the wildcard-replication shape (the first was /data/*.db in the 2025-05-20 revamp post).
2026-03-10 disclosure: Sprites speak MCP (sprites.dev/mcp)¶
sources/2026-03-10-flyio-unfortunately-sprites-now-speak-mcp ships an official remote MCP server for Sprites at sprites.dev/mcp. It plugs into Claude Desktop or any MCP-speaking client, authenticates into a single Fly.io organization, and exposes Sprite lifecycle operations (create, manage, checkpoint, delete) as MCP tool calls. See systems/sprites-mcp for the system-page treatment.
Three-axis agent-creation guardrail¶
On authentication, the operator sets three quotas that bound the MCP session's blast radius:
- Org scope. A single Fly.io organization per MCP session. Injected instructions cannot cross org boundaries.
- Sprite-count cap. Operator-set maximum on spawned-Sprite count.
- Name prefix. Operator-set string prefix on spawned Sprites — "so you can easily spot the robots and disassemble them."
First wiki instance of a three-axis agent-creation quota guardrail at the VM-lifecycle altitude (prior concepts/ai-agent-guardrails instances were at code-review, CLI-refusal, or allowlist altitudes).
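The three axes compose into a single admission check on every create call. A minimal sketch, assuming nothing about the actual Sprites API (class, parameter names, and denial strings are all invented; only the three quota axes come from the post):

```python
from typing import Optional


class SpriteCreationGuardrail:
    """Sketch of the three-axis quota bounding an MCP session's blast
    radius: org scope, spawned-sprite count cap, and a mandatory name
    prefix that makes agent-created VMs easy to spot and delete."""

    def __init__(self, org: str, max_sprites: int, name_prefix: str):
        self.org = org                  # axis 1: single org per session
        self.max_sprites = max_sprites  # axis 2: operator-set count cap
        self.name_prefix = name_prefix  # axis 3: operator-set prefix
        self.created = []

    def check_create(self, org: str, name: str) -> Optional[str]:
        """Return a denial reason, or None if the create is allowed."""
        if org != self.org:
            # Injected instructions cannot cross org boundaries.
            return "denied: outside session org"
        if len(self.created) >= self.max_sprites:
            return "denied: sprite-count cap reached"
        if not name.startswith(self.name_prefix):
            # "So you can easily spot the robots and disassemble them."
            return "denied: name must carry prefix"
        self.created.append(name)
        return None
```

Each axis bounds a different failure mode: the org scope contains prompt injection, the cap contains runaway spawning, and the prefix keeps cleanup tractable after the fact.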
Ptacek's MCP-is-aesthetically-wrong framing¶
The post's other half is a canonical Fly.io statement of the MCP-vs-CLI-skills thesis: "In 2026, MCP is the wrong way to extend the capabilities of an agent. The emerging Right Way to do this is command line tools and discoverable APIs." For shell-capable agents (Claude Code, Codex, Gemini CLI), Fly.io's preferred surface is the sprite CLI + a one-sentence skill ("Use this skill whenever users want to create a new VM to run a task on, or to manage the VMs already available"), not the MCP server. See concepts/progressive-capability-disclosure and concepts/context-as-importance-signal for the two load-bearing costs motivating the CLI-first stance.
The MCP server is shipped as a fallback for shell-less agents (Claude Desktop and similar) — see patterns/mcp-as-fallback-for-shell-less-agents. Fly.io ships both surfaces because the audience mix genuinely needs both; the positional recommendation is that shell-capable agent users should prefer the CLI.
"Fuck Stateless Sandboxes" reprise¶
The post closes with a reprise of the 2026-01-09 "Fuck Ephemeral Sandboxes" manifesto under a reframed banner: "Fuck Stateless Sandboxes." Semantic shift: ephemeral (lifetime — dies at session end) → stateless (reachability — no persistent filesystem, no real network). The arguments are parallel: the industry default for agent compute is too impoverished; agents want real computers with real filesystems connected to real networks. See concepts/durable-vs-ephemeral-sandbox.
Example prompts (2026-03-10)¶
Ptacek's "On a new Sprite, do…" prompt template, with representative tasks:
- "On a new Sprite, reproduce this bug from issues/913, capturing logs."
- "On a new Sprite, benchmark this function across 1000 runs."
- "On 3 new Sprites, change this service to use each of these 3 query libraries."
- "On a new Sprite, run a load generator against this endpoint for 60 seconds."
All would overshoot a 15-minute ephemeral sandbox. The durable-VM property continues to be load-bearing.
Related¶
- concepts/durable-vs-ephemeral-sandbox — the axis Sprites canonicalise.
- concepts/first-class-checkpoint-restore — the property Sprites canonicalise.
- concepts/fast-vm-boot-dx — composed with durability in the Sprite claim.
- concepts/scale-to-zero — Sprites' idle-not-dead metering posture.
- concepts/anycast — Sprites' HTTPS-URL substrate.
- concepts/agentic-development-loop — target workload.
- concepts/agent-with-root-shell — root-shell tenancy continued from Phoenix.new.
- concepts/vibe-coding — long-running personal-app extension.
- patterns/durable-micro-vm-for-agentic-loop — the pattern Sprites realise.
- patterns/disposable-vm-for-agentic-loop — the prior posture this product revises.
- systems/fly-machines — sibling substrate Sprites explicitly diverge from.
- systems/firecracker — candidate kernel-isolation primitive (not confirmed).
- systems/phoenix-new — Fly.io's earlier per-session agent-VM product; Sprites generalise it.
- systems/fly-proxy — Anycast-routing substrate.
- systems/juicefs — storage-stack model Sprites fork.
- systems/litestream — metadata-DB durability substrate.
- systems/litestream-vfs — the block map runs on Litestream VFS (writable + hydration modes, per 2026-02-04).
- systems/sqlite — metadata-DB backend.
- systems/boltdb — Johnson's "based on BoltDB. I kid!" misdirect — SQLite was chosen; tied to concepts/bolt-vs-sqlite-storage-choice framing.
- systems/corrosion-swim — Anycast-URL propagation.
- systems/phoenix-framework — global orchestrator substrate.
- systems/aws-s3 — object-store shape for chunks.
- systems/lsvd — the Fly-Machine-side object-storage experiment Sprites route around.
- concepts/no-container-image-sprite
- concepts/warm-sprite-pool
- concepts/object-storage-as-disk-root
- concepts/read-through-nvme-cache
- concepts/metadata-data-split-storage
- concepts/inside-out-orchestration
- concepts/inner-container-vm
- concepts/fast-checkpoint-via-metadata-shuffle
- concepts/durable-state-as-url
- concepts/fleet-drain-operation
- patterns/warm-pool-zero-create-path
- patterns/read-through-object-store-volume
- patterns/metadata-plus-chunk-storage-stack
- patterns/vfs-range-get-from-object-store — the page-level read shape the block map relies on at Sprite cold boot.
- concepts/async-clone-hydration — the hydration-design shape Litestream VFS imports for the block map's steady-state path.
- concepts/bolt-vs-sqlite-storage-choice — the storage-choice question Johnson name-checks in the "Litestream SQLite, of course" punchline.
- patterns/inside-out-vm-orchestration
- patterns/checkpoint-as-metadata-clone
- patterns/blast-radius-in-vm-not-host
- systems/sprites-mcp — the 2026-03-10 vendor-hosted remote MCP server for Sprite CRUD.
- systems/model-context-protocol — the protocol sprites.dev/mcp uses.
- concepts/ai-agent-guardrails — the three-axis org/cap/prefix guardrail triple.
- concepts/progressive-capability-disclosure — the CLI-skills alternative the 2026-03-10 post argues for.
- concepts/context-as-importance-signal — the aesthetic-cost axis Ptacek names.
- concepts/local-mcp-server-risk — the local-MCP risk that sprites.dev/mcp remotes away.
- patterns/mcp-as-fallback-for-shell-less-agents — the positional pattern sprites.dev/mcp instantiates.
- patterns/remote-mcp-server-via-platform-launcher — sibling remote-MCP shape.
- companies/flyio — source company.