Fly.io — The Design & Implementation of Sprites¶
Summary¶
Thomas Ptacek's 2026-01-14 implementation-deep-dive on
Sprites, five days after the
[[sources/2026-01-09-flyio-code-and-let-live|2026-01-09
launch-plus-manifesto]]. Where the launch post argued the
thesis (ephemeral sandboxes are the wrong abstraction for
coding agents; "Sprites are BIC disposable computers"), this
post discloses the three big orchestration decisions that
let Sprites ship the durable-plus-instant-create shape without
Fly Machines' baggage: (1)
no user-facing container images — Sprites run from a single
standard container, so every worker holds a pre-created warm
pool and sprite create is "basically just doing the stuff
we do when we start a Fly Machine"; (2) object storage as
root of disk — 100 GB durable storage backed by an S3-
compatible object store via a JuiceFS-derived stack that splits
storage into data chunks (object store) and metadata (a
local SQLite DB kept durable via
Litestream); a sparse 100 GB NVMe volume is a
dm-cache-style read-through cache where "nothing in that
NVMe volume should matter; stored chunks are immutable and their
true state lives on the object store"; durable state of a
Sprite is a URL — migration and failed-physical recovery are
"trivial"; checkpoint and restore are metadata-shuffle
operations ("Checkpoints are so fast we want you to use them
as a basic feature of the system and not as an escape hatch …
like a git restore, not a system restore"); (3) inside-out
orchestration — user code runs in an inner container
slid between the user and the kernel; the root namespace of the
VM hosts the storage stack, service manager, log pipeline, and
port-forwarding proxy; "When you talk to the global API,
chances are you're talking directly to your own VM." The
post's meta-argument: platform-team velocity scales with
blast-radius containment of changes, and the inside-out
model puts most orchestration changes inside the VM — "The
blast radius is just new VMs that pick up the change. We sleep
on how much platform work doesn't get done not because the code
is hard to write, but because it's so time-consuming to ensure
benign-looking changes don't throw the whole fleet into
metastable failure." Also discloses that Sprites' global
orchestrator is an Elixir/Phoenix app using object storage
as the primary metadata store for accounts, with each account
getting its own SQLite DB made durable via Litestream.
Sprites plug into Fly.io's existing
Corrosion gossip service-discovery for Anycast HTTPS URLs:
"When you ask the Sprite API to make a public URL … we
generate a Corrosion update that propagates across our fleet
instantly." Today Sprites run on top of Fly Machines; Ptacek
explicitly frames Sprites as a contract ("an API and a set
of expectations about how the execution environment works")
with an open-source local runtime in progress ("Jerome's
working on an open-source local Sprite runtime"). Closing
position statement: Sprites and Fly Machines are different
optimisation points — "prototype and acceptance-test an
application on Sprites. Then, when you're happy with it,
containerize it and ship it as a Fly Machine to scale it out."
The post lightly gestures at storage-stack internals ("I could
easily do another 1500-2000 words here on the Cronenberg film
Kurt came up with for the actual storage stack, but because
it's in flux, let's keep it simple") — flagged as a caveat.
Key takeaways¶
- Warm-pool create. Because every Sprite starts from the same standard container, "every physical worker knows exactly what container the next Sprite is going to start with, so it's easy for us to keep pools of 'empty' Sprites standing by. The result: a Sprite create doesn't have any heavy lifting to do; it's basically just doing the stuff we do when we start a Fly Machine." This is the implementation of the 1-2 second create advertised in the launch post — see patterns/warm-pool-zero-create-path. Reframes Fly Machines' [[concepts/fly-machine-start-vs-create|start-vs-create]] split: Sprites collapse user-visible create onto the platform's pre-paid create.
- The user-facing container image is the enemy. "Fly Machines are approximately OCI containers repackaged as KVM micro-VMs. […] We only murdered user containers because we wanted them dead. Most of what's slow about creating a Fly Machine is containers. I say this with affection: your containers are crazier than a soup sandwich. Huge and fussy, they take forever to pull and unpack." "A truly heartbreaking amount of engineering work has gone into just allowing our OCI registry to keep up with this system." Sprites get rid of the user-facing container image and "literally: problem solved." Canonical wiki statement of concepts/no-container-image-sprite.
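The warm-pool create path described above can be sketched in a few lines. This is a minimal illustration under assumed names (WarmPool and boot_blank_sandbox are hypothetical, not Fly.io's code): the expensive boot is pre-paid once per slot, so create reduces to a dequeue plus a refill.

```python
import queue
import itertools

_ids = itertools.count(1)

def boot_blank_sandbox():
    """Stand-in for the expensive work: booting a micro-VM from the one
    standard container. In the sketch it is just a dict."""
    return {"vm": f"vm-{next(_ids)}", "owner": None}

class WarmPool:
    """Every worker pre-creates identical blank Sprites; create() is a dequeue."""
    def __init__(self, depth):
        self.pool = queue.Queue()
        for _ in range(depth):               # pre-pay the expensive creates
            self.pool.put(boot_blank_sandbox())

    def create(self, owner):
        sandbox = self.pool.get_nowait()     # O(1): no image pull, no boot
        sandbox["owner"] = owner
        self.pool.put(boot_blank_sandbox())  # refill (a real pool would do
        return sandbox                       # this asynchronously)

pool = WarmPool(depth=4)
sprite = pool.create(owner="acct_123")
```

Because every slot in the pool is identical, the only per-create work is attaching an identity; that is what lets a user-visible create cost roughly what a Fly Machine start costs.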
- Object storage as disk root. "Every Sprite comes with 100GB of durable storage. We're able to do that because the root of storage is S3-compatible object storage." Contrast with Fly Volumes: "That storage is NVMe attached to the physical server your Fly Machine is on. […] Attached storage is fast, but can lose data — if a physical blows up, there's no magic that rescues its stored bits." "Worse, from our perspective, is that attached storage anchors workloads to specific physicals. […] It took 3 years to get workload migration right with attached storage, and it's still not 'easy'." Canonical wiki statement of concepts/object-storage-as-disk-root.
- Metadata/data split, JuiceFS lineage. "The Sprite storage stack is organized around the JuiceFS model (in fact, we currently use a very hacked-up JuiceFS, with a rewritten SQLite metadata backend). It works by splitting storage into data ('chunks') and metadata (a map of where the 'chunks' are). Data chunks live on object stores; metadata lives in fast local storage. In our case, that metadata store is kept durable with Litestream. Nothing depends on local storage." This is [[patterns/metadata-plus-chunk-storage-stack]] instantiated — the same meta-pattern as Tigris' FoundationDB-metadata + NVMe-byte-cache + S3-origin, with SQLite + Litestream in the metadata slot.
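The metadata/chunk split can be reduced to a toy model. Assuming a 4-byte chunk size and an in-memory dict standing in for the object store (the real stack is a JuiceFS-derived layer with a Litestream-replicated SQLite backend; everything named here is illustrative), the shape is:

```python
import hashlib
import sqlite3

CHUNK = 4  # toy chunk size; real chunks are far larger

object_store = {}  # stand-in for the S3-compatible chunk tier

# Metadata tier: a small local SQLite map of file -> ordered chunk digests.
meta = sqlite3.connect(":memory:")
meta.execute("CREATE TABLE chunks (path TEXT, seq INTEGER, digest TEXT)")

def write_file(path, data):
    for seq in range(0, len(data), CHUNK):
        piece = data[seq:seq + CHUNK]
        digest = hashlib.sha256(piece).hexdigest()
        object_store[digest] = piece  # data: immutable, content-addressed
        meta.execute("INSERT INTO chunks VALUES (?,?,?)", (path, seq, digest))

def read_file(path):
    rows = meta.execute(
        "SELECT digest FROM chunks WHERE path=? ORDER BY seq", (path,))
    return b"".join(object_store[d] for (d,) in rows)

write_file("/home/user/notes.txt", b"hello sprite")
assert read_file("/home/user/notes.txt") == b"hello sprite"
```

The division of labor is the point: bulk bytes live on the object store, while the only state that must be fast and transactional is the small digest map — which is exactly the piece SQLite + Litestream is suited to.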
- Sparse NVMe as dm-cache-style read-through cache. "Our stack sports a dm-cache-like feature that takes advantage of attached storage. A Sprite has a sparse 100GB NVMe volume attached to it, which the stack uses to cache chunks to eliminate read amplification. Importantly (I can feel my resting heart rate lowering) nothing in that NVMe volume should matter; stored chunks are immutable and their true state lives on the object store." Canonical wiki statement of concepts/read-through-nvme-cache. Reverses concepts/bus-hop-storage-tradeoff: NVMe stays on the hot path (read latency) but loses its durability role.
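A read-through cache over immutable, content-addressed chunks can be sketched as below; nvme_cache is a dict standing in for the sparse NVMe volume, and the property the quote insists on — "nothing in that NVMe volume should matter" — is what the final two lines demonstrate. All names are illustrative.

```python
import hashlib

object_store = {}   # durable truth: immutable, content-addressed chunks
nvme_cache = {}     # stand-in for the sparse local NVMe volume; safe to lose

def put_chunk(data):
    digest = hashlib.sha256(data).hexdigest()
    object_store[digest] = data
    return digest

def get_chunk(digest):
    if digest in nvme_cache:        # hit: no round trip to the object store
        return nvme_cache[digest]
    data = object_store[digest]     # miss: read through, then populate
    nvme_cache[digest] = data
    return data

d = put_chunk(b"immutable bytes")
assert get_chunk(d) == b"immutable bytes"  # miss path populates the cache
nvme_cache.clear()                         # simulated NVMe loss: no data lost
assert get_chunk(d) == b"immutable bytes"  # transparently refetched
```

Because chunks are immutable, the cache never needs invalidation and never holds the only copy — which is why losing a physical's NVMe costs latency, not data.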
- Durable state of a Sprite is a URL. "In a real sense, the durable state of a Sprite is simply a URL. Wherever he lays his hat is his home! They migrate (or recover from failed physicals) trivially." The 3-year saga Fly.io went through to ship [[patterns/async-block-clone-for-stateful-migration|async block clone]]-based Machine migration evaporates. Drain becomes trivial again: "Before we did Fly Volumes, that was as simple as pushing a 'drain' button on a server. Imagine losing a capability like that." Canonical wiki statement of concepts/durable-state-as-url.
- Checkpoint / restore as metadata shuffle. "This also buys Sprites fast checkpoint and restore. Checkpoints are so fast we want you to use them as a basic feature of the system and not as an escape hatch when things go wrong; like a git restore, not a system restore. That works because both checkpoint and restore merely shuffle metadata around." The ~1-second restore number from the launch post is implementation-explained: immutable content-addressed chunks + cheap metadata clone = O(metadata) restore. See concepts/fast-checkpoint-via-metadata-shuffle and patterns/checkpoint-as-metadata-clone. "(our pre-installed Claude Code will checkpoint aggressively for you without asking)."
- Inside-out orchestration. "In the cloud hosting industry, user applications are managed by two separate, yet equally important components: the host, which orchestrates workloads, and the guest, which runs them. Sprites flip that on its head: the most important orchestration and management work happens inside the VM." "We've slid a container between you and the kernel. You see an inner environment, managed by a fleet of services running in the root namespace of the VM." The root namespace hosts the storage stack, service manager, log pipeline, and port-forwarding proxy. "When you talk to the global API, chances are you're talking directly to your own VM." Canonical wiki statement of concepts/inside-out-orchestration and concepts/inner-container-vm. The bounce-without-reboot property: "The inner container allows us to bounce a Sprite without rebooting the whole VM, even on checkpoint restores."
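Why checkpoint and restore are O(metadata) can be shown with a toy model: since chunks are immutable, a checkpoint copies only the path-to-chunk-list map, never the bytes, and restore re-points the live map at a saved snapshot. Every name and chunk ID below is illustrative.

```python
# Live metadata: which chunks make up each file. Chunk bytes (c1, c2, ...)
# live immutably on the object store and are never copied by a checkpoint.
live_meta = {"/app/main.py": ["c1", "c2"], "/app/data.db": ["c3"]}
checkpoints = {}

def checkpoint(name):
    # Snapshot is a deep copy of the (small) map only: O(metadata).
    checkpoints[name] = {p: list(chunks) for p, chunks in live_meta.items()}

def restore(name):
    # Restore re-points the live map at the snapshot: also O(metadata).
    global live_meta
    live_meta = {p: list(chunks) for p, chunks in checkpoints[name].items()}

checkpoint("before-refactor")
live_meta["/app/main.py"] = ["c9"]   # an agent rewrites a file (new chunks)
restore("before-refactor")           # "like a git restore, not a system restore"
assert live_meta["/app/main.py"] == ["c1", "c2"]
```

This is the same trick that makes git branches cheap: snapshots share all unchanged content by reference, so the cost of a checkpoint scales with the size of the map, not the size of the disk.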
- Blast-radius argument for inside-out. "Changes to Sprites don't restart host components or muck with global state. The blast radius is just new VMs that pick up the change. We sleep on how much platform work doesn't get done not because the code is hard to write, but because it's so time-consuming to ensure benign-looking changes don't throw the whole fleet into metastable failure." The platform-team-velocity argument for inside-out orchestration. See patterns/blast-radius-in-vm-not-host. Complements the 2025-10-22 Corrosion post's regionalization argument: both are about shrinking the unit of change-correlation.
- Orchestrator is Phoenix on object storage, per-account SQLite via Litestream. "The global orchestrator for Sprites is an Elixir/Phoenix app that uses object storage as the primary source of metadata for accounts. We then give each account an independent SQLite database, again made durable on object storage with Litestream." Phoenix appears here as control-plane substrate, not product veneer. The per-account SQLite pattern extends Litestream's wildcard-replication story from [[sources/2025-05-20-flyio-litestream-revamped]] into a multi-tenant orchestrator.
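The per-account-SQLite shape can be sketched with the stdlib sqlite3 module. In this sketch ":memory:" stands in for a per-account database file that Litestream would replicate to object storage out of band; the table schema and account IDs are invented for illustration.

```python
import sqlite3

def account_db(account_id, _open_dbs={}):
    """One independent SQLite database per account. Durability comes from
    replicating each DB file to object storage (e.g. via Litestream),
    which is outside this sketch."""
    if account_id not in _open_dbs:
        # ":memory:" stands in for a file like /data/accounts/{id}.db
        db = sqlite3.connect(":memory:")
        db.execute("CREATE TABLE sprites (id TEXT PRIMARY KEY, state TEXT)")
        _open_dbs[account_id] = db
    return _open_dbs[account_id]

account_db("acct_a").execute("INSERT INTO sprites VALUES ('s1', 'running')")
rows = account_db("acct_b").execute("SELECT * FROM sprites").fetchall()
assert rows == []  # tenants are isolated: acct_b never sees acct_a's rows
```

The appeal of the shape is that tenant isolation falls out of the file layout: there is no shared schema to migrate carefully and no cross-tenant query surface, and each DB is small enough for streaming replication to object storage to be cheap.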
- Sprites keep Fly.io's existing Anycast + Corrosion substrate. "Sprites plug directly into Corrosion, our gossip-based service discovery system. When you ask the Sprite API to make a public URL for your Sprite, we generate a Corrosion update that propagates across our fleet instantly. Your application is then served, with an HTTPS URL, from our proxy edges." The Anycast + *.sprites.dev URL feature from the launch post reuses the same state-distribution system that underpins Fly Machines routing. No new wiki primitive needed on this axis.
- Sprites are a contract, not a substrate. "Sprites are a contract with user code: an API and a set of expectations about how the execution environment works. Today, they run on top of Fly Machines. But they don't have to. Jerome's working on an open-source local Sprite runtime. We'll find other places to run them, too." The substrate is swappable; the contract is the product. Parallels the Macaroon-design framing from [[sources/2025-03-27-flyio-operationalizing-macaroons|operationalizing Macaroons]] ("a token format is a contract").
- Explicit positioning vs. Fly Machines. "Sprites live alongside Fly Machines in our architecture. They include some changes that are pure wins, but they're mostly tradeoffs." Named tradeoffs: (a) "We've always wanted to run Fly Machine disks off object storage (LSVD experimental feature), but the performance isn't adequate for a hot Postgres node in production." (b) "Professional production apps ship out of CI/CD systems as OCI containers; that's a big part of what makes orchestrating Fly Machines so hard." (c) "Most (though not all) Sprite usage is interactive, and Sprite users benefit from their VMs aggressively sleeping themselves to keep costs low; e-commerce apps measure responsiveness in milliseconds and want their workloads kept warm." Prescribes a prototype-on-Sprites, ship-on-Machines workflow: "An automated workflow for that will happen."
Operational numbers¶
- Create latency: "a second or two" (reaffirms the launch-post 1-2 s claim, now explained as warm-pool dequeue).
- Per-Sprite durable storage: 100 GB default.
- Billing: "only for storage blocks you actually write, not the full 100GB capacity" — thin-provisioned storage charged on write. Complements the idle-auto-sleep metering from the launch post.
- Checkpoint create / restore: both framed as O(metadata) — "merely shuffle metadata around"; exact number not restated but the launch post's "completes instantly" / "~1 second restore" align.
- No numbers disclosed for: NVMe-cache hit rate, read / write amplification on miss, checkpoint storage size, metadata DB size per account, Corrosion-update-propagation time for a new Anycast URL, pool depth per worker, pool re-fill lead time.
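The charged-on-write billing shape above is easy to model: under thin provisioning the bill tracks distinct written blocks, not the 100 GB ceiling. The chunk granularity and arithmetic below are illustrative assumptions, not disclosed numbers.

```python
CHUNK_MB = 4                 # assumed billing granularity (illustrative)
CAPACITY_MB = 100 * 1024     # the 100 GB thin-provisioned ceiling

written_chunks = set()       # only blocks the user actually writes

def write(chunk_index):
    assert chunk_index * CHUNK_MB < CAPACITY_MB  # stay under the ceiling
    written_chunks.add(chunk_index)

def billable_mb():
    # Charged on write, not on capacity: distinct written blocks only.
    return len(written_chunks) * CHUNK_MB

for i in (0, 1, 1, 7):       # rewriting chunk 1 doesn't grow the bill
    write(i)
assert billable_mb() == 12   # 3 distinct chunks * 4 MB, far below 100 GB
```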
Systems extracted¶
- systems/fly-sprites — implementation layer added (warm-pool-create, JuiceFS-derived storage stack, inside-out orchestration).
- systems/fly-machines — contrasted substrate; extended with Sprites-as-sibling framing.
- systems/juicefs — new page (JuiceFS metadata+chunk model; Fly.io's hacked-up fork with SQLite metadata).
- systems/litestream — extended: Sprite storage metadata DB + per-account orchestrator metadata DB.
- systems/corrosion-swim — extended: Sprite Anycast URL propagation.
- systems/sqlite — extended: Sprite storage metadata + per-account orchestrator metadata.
- systems/aws-s3 — extended: Sprite disk-root substrate.
- systems/phoenix-framework — extended: Sprites orchestrator.
- systems/lsvd — extended: named as the disk-off-object-storage-for-Fly-Machines experiment whose perf ceiling Sprites route around.
- systems/fly-proxy — extended: Sprites Anycast URL serving.
Concepts extracted¶
- concepts/no-container-image-sprite — new — canonical wiki statement of the "murder user containers" design.
- concepts/warm-sprite-pool — new — warm pool of pre-created Sprites on every worker; create = dequeue.
- concepts/object-storage-as-disk-root — new — disk substrate is S3, not local NVMe.
- concepts/read-through-nvme-cache — new — sparse NVMe volume as dm-cache-style read-through cache over immutable object-store chunks.
- concepts/metadata-data-split-storage — new — JuiceFS-lineage metadata/chunk split with SQLite + Litestream in the metadata slot.
- concepts/inside-out-orchestration — new — most orchestration code inside the VM's root namespace; user in an inner container.
- concepts/inner-container-vm — new — the Linux container slid between the user and the VM kernel, letting the VM bounce user code without rebooting the VM.
- concepts/fast-checkpoint-via-metadata-shuffle — new — checkpoint/restore reduces to a metadata clone because data chunks are immutable.
- concepts/durable-state-as-url — new — workload state is a URL; migration and recovery are "mount elsewhere".
- concepts/fleet-drain-operation — new — the platform-operator capability Fly Volumes broke and object-store-rooted disks restore.
- concepts/fast-vm-boot-dx — extended: warm-pool-create path as the implementation arm of the fast-boot DX promise.
- concepts/scale-to-zero — extended: Sprites' auto-sleep-while-idle shape made economically viable by object-store-rooted disks (no attached-storage anchoring cost).
Patterns extracted¶
- patterns/warm-pool-zero-create-path — new — canonical pattern for instant-create-by-dequeue.
- patterns/read-through-object-store-volume — new — user-facing block device backed by immutable object-store chunks with local NVMe read-through cache.
- patterns/metadata-plus-chunk-storage-stack — new — JuiceFS / Tigris / Sprites-storage meta-pattern: small transactional metadata tier + immutable byte/chunk tier.
- patterns/inside-out-vm-orchestration — new — orchestration services run in the VM root namespace, not on the host; host is dumb scheduler of VMs.
- patterns/checkpoint-as-metadata-clone — new — checkpoint = snapshot of the metadata DB; restore = re-point to that snapshot.
- patterns/blast-radius-in-vm-not-host — new — platform-change deploys go out as new VMs picking up the change, not as live host-component restarts.
- patterns/metadata-db-plus-object-cache-tier — extended: third canonical wiki instance (after Tigris + generic).
- patterns/checkpoint-backup-to-object-storage — extended: per-account SQLite orchestrator DBs backed up via Litestream to object storage.
- patterns/async-block-clone-for-stateful-migration — extended: the 3-year-to-ship feature this post explicitly routes around for Sprites ("Imagine losing a capability like that").
Caveats¶
- Architecture-deferral. Ptacek: "I could easily do another 1500-2000 words here on the Cronenberg film Kurt came up with for the actual storage stack, but because it's in flux, let's keep it simple." Explicit signal that the storage-stack implementation is actively evolving; claims here may be superseded by a follow-up.
- No disclosed numbers for: warm-pool depth per worker, pool-refill lead time, NVMe cache hit rate, cache-miss read amplification, storage-stack write path latency, checkpoint restore wall-clock under load, Corrosion-update propagation time for Sprite URL creation, per-account SQLite DB size ceiling, metadata-DB IO ceiling per worker.
- Under the hood, Sprites are still Fly Machines today. "Now, today, under the hood, Sprites are still Fly Machines." The inside-out story is a Fly-Machine-tenant architecture; the promised local-runtime swap ("Jerome's working on…") is not shipped.
- Kernel-isolation primitive still unnamed. Launch-post caveat unchanged — the post doesn't say whether the VM primitive is Firecracker, Intel Cloud Hypervisor, or something else. "Sprites are still Fly Machines" implies the existing Fly Machines KVM / micro-VM layer, but isn't explicitly stated.
- dm-cache-like, not dm-cache. "Our stack sports a dm-cache-like feature" — the cache stack is custom, not a straight dm-cache deployment. Eviction / write-policy not disclosed.
- No threat-model disclosure. The inside-out model moves orchestration code into the VM alongside user code; the post sketches the mechanism ("We've slid a container between you and the kernel") but doesn't enumerate the attack surface (container escape, namespace-privilege leaks, root-namespace-service authentication between inner and outer).
- Tigris not named as the S3-compatible object store, though Fly.io's regional Tigris is the obvious candidate; S3-compatibility is the only property named.
- Prototype-then-ship workflow is aspirational. "An automated workflow for that will happen." Not shipped.
Source¶
- Original: https://fly.io/blog/design-and-implementation/
- Raw markdown:
raw/flyio/2026-01-14-the-design-implementation-of-sprites-b747d372.md
Related¶
- sources/2026-01-09-flyio-code-and-let-live — direct predecessor, 5 days earlier — the thesis / launch. This post is the implementation companion.
- systems/fly-sprites — canonical system page.
- systems/fly-machines — contrasted substrate.
- systems/juicefs — metadata+chunk storage model Sprites fork.
- systems/litestream — metadata-DB durability substrate.
- systems/corrosion-swim — Anycast URL propagation.
- systems/sqlite — metadata-DB form factor.
- systems/aws-s3 — S3-compatible origin tier.
- systems/phoenix-framework — orchestrator substrate.
- systems/lsvd — named predecessor experiment (Fly-Machine disks on object storage; performance ceiling).
- concepts/no-container-image-sprite
- concepts/warm-sprite-pool
- concepts/object-storage-as-disk-root
- concepts/read-through-nvme-cache
- concepts/metadata-data-split-storage
- concepts/inside-out-orchestration
- concepts/inner-container-vm
- concepts/fast-checkpoint-via-metadata-shuffle
- concepts/durable-state-as-url
- concepts/fleet-drain-operation
- patterns/warm-pool-zero-create-path
- patterns/read-through-object-store-volume
- patterns/metadata-plus-chunk-storage-stack
- patterns/inside-out-vm-orchestration
- patterns/checkpoint-as-metadata-clone
- patterns/blast-radius-in-vm-not-host
- patterns/metadata-db-plus-object-cache-tier
- companies/flyio