FLYIO 2026-01-14 Tier 3


Fly.io — The Design & Implementation of Sprites

Summary

Thomas Ptacek's 2026-01-14 implementation deep-dive on Sprites, five days after the [[sources/2026-01-09-flyio-code-and-let-live|2026-01-09 launch-plus-manifesto]]. Where the launch post argued the thesis (ephemeral sandboxes are the wrong abstraction for coding agents; "Sprites are BIC disposable computers"), this post discloses the three big orchestration decisions that let Sprites ship the durable-plus-instant-create shape without Fly Machines' baggage: (1) no user-facing container images — Sprites run from a single standard container, so every worker holds a pre-created warm pool and sprite create is "basically just doing the stuff we do when we start a Fly Machine"; (2) object storage as root of disk — 100 GB durable storage backed by an S3-compatible object store via a JuiceFS-derived stack that splits storage into data chunks (object store) and metadata (a local SQLite DB kept durable via Litestream); a sparse 100 GB NVMe volume is a dm-cache-style read-through cache where "nothing in that NVMe volume should matter; stored chunks are immutable and their true state lives on the object store"; the durable state of a Sprite is a URL — migration and failed-physical recovery are "trivial"; checkpoint and restore are metadata-shuffle operations ("Checkpoints are so fast we want you to use them as a basic feature of the system and not as an escape hatch … like a git restore, not a system restore"); (3) inside-out orchestration — user code runs in an inner container slid between the user and the kernel; the root namespace of the VM hosts the storage stack, service manager, log pipeline, and port-forwarding proxy; "When you talk to the global API, chances are you're talking directly to your own VM."

The post's meta-argument: platform-team velocity scales with blast-radius containment of changes, and the inside-out model puts most orchestration changes inside the VM: "The blast radius is just new VMs that pick up the change. We sleep on how much platform work doesn't get done not because the code is hard to write, but because it's so time-consuming to ensure benign-looking changes don't throw the whole fleet into metastable failure." Also discloses that Sprites' global orchestrator is an Elixir/Phoenix app using object storage as the primary metadata store for accounts, with each account getting its own SQLite DB made durable via Litestream. Sprites plug into Fly.io's existing Corrosion gossip service-discovery for Anycast HTTPS URLs: "When you ask the Sprite API to make a public URL … we generate a Corrosion update that propagates across our fleet instantly."

Today Sprites run on top of Fly Machines; Ptacek explicitly frames Sprites as a contract ("an API and a set of expectations about how the execution environment works") with an open-source local runtime in progress ("Jerome's working on an open-source local Sprite runtime"). Closing position statement: Sprites and Fly Machines are different optimisation points: "prototype and acceptance-test an application on Sprites. Then, when you're happy with it, containerize it and ship it as a Fly Machine to scale it out." The post lightly gestures at storage-stack internals ("I could easily do another 1500-2000 words here on the Cronenberg film Kurt came up with for the actual storage stack, but because it's in flux, let's keep it simple") — flagged as a caveat.

Key takeaways

  1. Warm-pool create. Because every Sprite starts from the same standard container, "every physical worker knows exactly what container the next Sprite is going to start with, so it's easy for us to keep pools of 'empty' Sprites standing by. The result: a Sprite create doesn't have any heavy lifting to do; it's basically just doing the stuff we do when we start a Fly Machine." This is the implementation of the 1-2 second create advertised in the launch post — see patterns/warm-pool-zero-create-path. Reframes Fly Machines' [[concepts/fly-machine-start-vs-create|start-vs-create]] split: Sprites collapse user-visible create onto the platform's pre-paid create.
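
The warm-pool shape can be sketched in a few lines. This is an illustrative toy, not Fly.io's implementation: `Worker`, `POOL_DEPTH`, and the dict fields are invented (the post discloses neither pool depth nor refill strategy).

```python
import queue
import uuid

POOL_DEPTH = 8  # illustrative; actual pool depth per worker is undisclosed

class Worker:
    def __init__(self):
        self.pool = queue.Queue()
        self.refill()

    def refill(self):
        # Every Sprite boots the same standard container, so a worker can
        # pre-create "empty" Sprites before anyone asks for one.
        while self.pool.qsize() < POOL_DEPTH:
            self.pool.put({"vm_id": uuid.uuid4().hex, "state": "warm"})

    def create_sprite(self, account):
        # The user-visible create is a dequeue plus identity assignment;
        # the heavy lifting (pull, unpack, boot) was prepaid at refill time.
        sprite = self.pool.get_nowait()
        sprite.update(owner=account, state="running")
        self.refill()  # top the pool back up
        return sprite

w = Worker()
s = w.create_sprite("acme")
assert s["state"] == "running" and w.pool.qsize() == POOL_DEPTH
```

The point of the sketch: because the image is fixed, create-time cost is moved off the user-visible path entirely.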

  2. The user-facing container image is the enemy. "Fly Machines are approximately OCI containers repackaged as KVM micro-VMs. […] We only murdered user containers because we wanted them dead. Most of what's slow about creating a Fly Machine is containers. I say this with affection: your containers are crazier than a soup sandwich. Huge and fussy, they take forever to pull and unpack." "A truly heartbreaking amount of engineering work has gone into just allowing our OCI registry to keep up with this system." Sprites get rid of the user-facing container image and "literally: problem solved." Canonical wiki statement of concepts/no-container-image-sprite.

  3. Object storage as disk root. "Every Sprite comes with 100GB of durable storage. We're able to do that because the root of storage is S3-compatible object storage." Contrast with Fly Volumes: "That storage is NVMe attached to the physical server your Fly Machine is on. […] Attached storage is fast, but can lose data — if a physical blows up, there's no magic that rescues its stored bits." "Worse, from our perspective, is that attached storage anchors workloads to specific physicals. […] It took 3 years to get workload migration right with attached storage, and it's still not 'easy'." Canonical wiki statement of concepts/object-storage-as-disk-root.

  4. Metadata/data split, JuiceFS-lineage. "The Sprite storage stack is organized around the JuiceFS model (in fact, we currently use a very hacked-up JuiceFS, with a rewritten SQLite metadata backend). It works by splitting storage into data ('chunks') and metadata (a map of where the 'chunks' are). Data chunks live on object stores; metadata lives in fast local storage. In our case, that metadata store is kept durable with Litestream. Nothing depends on local storage." This is [[patterns/metadata-plus-chunk-storage-stack]] instantiated — the same meta-pattern as Tigris' FoundationDB-metadata + NVMe-byte-cache + S3-origin, with SQLite + Litestream in the metadata slot.
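
A toy rendering of the split, under loud assumptions: the schema, chunk size, and content-addressed keys are invented for illustration (the post does not disclose JuiceFS' actual chunk layout or Fly's rewritten schema). A dict stands in for the object store; in-memory SQLite stands in for the Litestream-backed metadata DB.

```python
import hashlib
import sqlite3

CHUNK = 4  # bytes; absurdly small so the demo splits data into chunks

object_store = {}                   # stands in for S3-compatible storage
meta = sqlite3.connect(":memory:")  # in prod: local SQLite, durable via Litestream
meta.execute("CREATE TABLE chunks (path TEXT, idx INT, key TEXT)")

def write(path, data):
    # Data goes to the object store as immutable, content-addressed chunks;
    # only the (small) chunk map is written to the metadata DB.
    for i in range(0, len(data), CHUNK):
        piece = data[i:i + CHUNK]
        key = hashlib.sha256(piece).hexdigest()
        object_store[key] = piece
        meta.execute("INSERT INTO chunks VALUES (?,?,?)", (path, i // CHUNK, key))

def read(path):
    # A read is a metadata lookup followed by chunk fetches.
    rows = meta.execute(
        "SELECT key FROM chunks WHERE path=? ORDER BY idx", (path,)).fetchall()
    return b"".join(object_store[k] for (k,) in rows)

write("/etc/motd", b"hello sprites")
assert read("/etc/motd") == b"hello sprites"
```

The "nothing depends on local storage" property falls out of the split: the dict (object store) holds truth, and the SQLite map is itself streamed to object storage.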

  5. Sparse NVMe as dm-cache-style read-through. "Our stack sports a dm-cache-like feature that takes advantage of attached storage. A Sprite has a sparse 100GB NVMe volume attached to it, which the stack uses to cache chunks to eliminate read amplification. Importantly (I can feel my resting heart rate lowering) nothing in that NVMe volume should matter; stored chunks are immutable and their true state lives on the object store." Canonical wiki statement of concepts/read-through-nvme-cache. Reverses concepts/bus-hop-storage-tradeoff: NVMe stays on the hot path (read latency) but loses its durability role.
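
The read-through property — cache loss costs latency, never data — can be shown in miniature. Everything here is illustrative (the post says only "dm-cache-like"; eviction and write policy are undisclosed), with dicts standing in for the object store and the sparse NVMe volume.

```python
object_store = {"k1": b"chunk-one", "k2": b"chunk-two"}  # durable truth
nvme_cache = {}                                          # disposable copies

def get_chunk(key):
    if key in nvme_cache:        # hit: served from local NVMe
        return nvme_cache[key]
    data = object_store[key]     # miss: fetch from the object store...
    nvme_cache[key] = data       # ...and populate the cache for next time
    return data

assert get_chunk("k1") == b"chunk-one"
nvme_cache.clear()               # "nothing in that NVMe volume should matter"
assert get_chunk("k1") == b"chunk-one"  # still correct, just a cold read
```

Because chunks are immutable, the cache never needs invalidation logic: a key either holds the one true value or is absent.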

  6. Durable state of a Sprite is a URL. "In a real sense, the durable state of a Sprite is simply a URL. Wherever he lays his hat is his home! They migrate (or recover from failed physicals) trivially." The 3-year saga Fly.io went through to ship [[patterns/async-block-clone-for-stateful-migration|async block clone]]-based Machine migration evaporates. Drain becomes trivial again: "Before we did Fly Volumes, that was as simple as pushing a 'drain' button on a server. Imagine losing a capability like that." Canonical wiki statement of concepts/durable-state-as-url.

  7. Checkpoint / restore as metadata shuffle. "This also buys Sprites fast checkpoint and restore. Checkpoints are so fast we want you to use them as a basic feature of the system and not as an escape hatch when things go wrong; like a git restore, not a system restore. That works because both checkpoint and restore merely shuffle metadata around." The ~1-second restore number from the launch post is implementation-explained: immutable content-addressed chunks + cheap metadata clone = O(metadata) restore. See concepts/fast-checkpoint-via-metadata-shuffle and patterns/checkpoint-as-metadata-clone. "(our pre-installed Claude Code will checkpoint aggressively for you without asking)."

  8. Inside-out orchestration. "In the cloud hosting industry, user applications are managed by two separate, yet equally important components: the host, which orchestrates workloads, and the guest, which runs them. Sprites flip that on its head: the most important orchestration and management work happens inside the VM." "We've slid a container between you and the kernel. You see an inner environment, managed by a fleet of services running in the root namespace of the VM." The root namespace hosts the storage stack, service manager, log pipeline, and port-forwarding proxy. "When you talk to the global API, chances are you're talking directly to your own VM." Canonical wiki statement of concepts/inside-out-orchestration and concepts/inner-container-vm. The bounce without reboot property: "The inner container allows us to bounce a Sprite without rebooting the whole VM, even on checkpoint restores."
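
The bounce-without-reboot property can be modeled as state: the inner (user-facing) container restarts while the root-namespace services never do. This is a toy under stated assumptions — class and field names are invented, and the real mechanism is namespaces and a service manager, not Python objects.

```python
class SpriteVM:
    def __init__(self):
        # Root namespace: long-lived platform services living beside the user.
        self.root_services = {"storage": "up", "logs": "up", "proxy": "up"}
        # Inner container: the environment the user actually sees.
        self.inner = {"generation": 0, "state": "running"}

    def bounce_inner(self):
        # Restart only the inner container (e.g. on a checkpoint restore);
        # the VM and its root-namespace services stay up throughout.
        self.inner = {"generation": self.inner["generation"] + 1,
                      "state": "running"}

vm = SpriteVM()
vm.bounce_inner()
assert vm.inner["generation"] == 1
assert vm.root_services["storage"] == "up"  # never restarted
```

The separation is what lets most orchestration changes ship as new in-VM services rather than host-fleet changes.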

  9. Blast-radius argument for inside-out. "Changes to Sprites don't restart host components or muck with global state. The blast radius is just new VMs that pick up the change. We sleep on how much platform work doesn't get done not because the code is hard to write, but because it's so time-consuming to ensure benign-looking changes don't throw the whole fleet into metastable failure." The platform-team-velocity argument for inside-out orchestration. See patterns/blast-radius-in-vm-not-host. Complements the 2025-10-22 Corrosion post's regionalization argument: both are about shrinking the unit of change-correlation.

  10. Orchestrator is Phoenix on object storage, per-account SQLite via Litestream. "The global orchestrator for Sprites is an Elixir/Phoenix app that uses object storage as the primary source of metadata for accounts. We then give each account an independent SQLite database, again made durable on object storage with Litestream." Phoenix appears here as control-plane substrate, not product veneer. The per-account SQLite pattern extends Litestream's wildcard-replication story from [[sources/2025-05-20-flyio-litestream-revamped]] into a multi-tenant orchestrator.
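
The per-account shape maps onto Litestream's standard config model, roughly like the hypothetical sketch below. Paths and bucket names are invented; Fly's actual deployment (and whether it uses Litestream's wildcard-path syntax rather than one entry per DB) is not disclosed in the post.

```yaml
# litestream.yml — illustrative sketch of the per-account pattern:
# each account's independent SQLite DB is continuously replicated
# to S3-compatible object storage.
dbs:
  - path: /data/accounts/acme.db
    replicas:
      - url: s3://sprites-orchestrator-metadata/accounts/acme
  - path: /data/accounts/initech.db
    replicas:
      - url: s3://sprites-orchestrator-metadata/accounts/initech
```

One DB per tenant keeps blast radius and restore granularity at the account level, the same durability trick the storage stack uses for its own metadata.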

  11. Sprites keep Fly.io's existing Anycast + Corrosion substrate. "Sprites plug directly into Corrosion, our gossip-based service discovery system. When you ask the Sprite API to make a public URL for your Sprite, we generate a Corrosion update that propagates across our fleet instantly. Your application is then served, with an HTTPS URL, from our proxy edges." The Anycast + *.sprites.dev URL feature from the launch post reuses the same state-distribution system that underpins Fly Machines routing. No new wiki primitive needed on this axis.

  12. Sprites are a contract, not a substrate. "Sprites are a contract with user code: an API and a set of expectations about how the execution environment works. Today, they run on top of Fly Machines. But they don't have to. Jerome's working on an open-source local Sprite runtime. We'll find other places to run them, too." The substrate is swappable; the contract is the product. Parallels the Macaroon-design framing from [[sources/2025-03-27-flyio-operationalizing-macaroons|operationalizing Macaroons]] ("a token format is a contract").

  13. Explicit positioning vs. Fly Machines. "Sprites live alongside Fly Machines in our architecture. They include some changes that are pure wins, but they're mostly tradeoffs." Named tradeoffs: (a) "We've always wanted to run Fly Machine disks off object storage (LSVD experimental feature), but the performance isn't adequate for a hot Postgres node in production." (b) "Professional production apps ship out of CI/CD systems as OCI containers; that's a big part of what makes orchestrating Fly Machines so hard." (c) "Most (though not all) Sprite usage is interactive, and Sprite users benefit from their VMs aggressively sleeping themselves to keep costs low; e-commerce apps measure responsiveness in milliseconds and want their workloads kept warm." Prescribes a prototype-on-Sprites, ship-on-Machines workflow: "An automated workflow for that will happen."

Operational numbers

  • Create latency: "a second or two" (reaffirms the launch-post 1-2 s claim, now explained as warm-pool dequeue).
  • Per-Sprite durable storage: 100 GB default.
  • Billing: "only for storage blocks you actually write, not the full 100GB capacity" — thin-provisioned storage charged on write. Complements the idle-auto-sleep metering from the launch post.
  • Checkpoint create / restore: both framed as O(metadata) — "merely shuffle metadata around"; exact number not restated but the launch post's "completes instantly" / "~1 second restore" align.
  • No numbers disclosed for: NVMe-cache hit rate, read / write amplification on miss, checkpoint storage size, metadata DB size per account, Corrosion-update-propagation time for a new Anycast URL, pool depth per worker, pool re-fill lead time.
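
The thin-provisioned billing model reduces to simple arithmetic: charge for blocks written, never for unwritten capacity. Block size and the cap behavior below are invented for illustration; the post discloses only "only for storage blocks you actually write, not the full 100GB capacity."

```python
BLOCK_MB = 1        # illustrative block size; not disclosed in the post
CAPACITY_GB = 100   # per-Sprite durable storage default

def billed_gb(blocks_written):
    # Only blocks actually written are charged, capped at provisioned capacity.
    return min(blocks_written * BLOCK_MB / 1024, CAPACITY_GB)

assert billed_gb(0) == 0        # an idle Sprite pays nothing for unwritten space
assert billed_gb(2048) == 2.0   # 2 GB written -> 2 GB billed, not 100 GB
```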

Systems extracted

  • systems/fly-sprites — implementation layer added (warm-pool-create, JuiceFS-derived storage stack, inside-out orchestration).
  • systems/fly-machines — contrasted substrate; extended with Sprites-as-sibling framing.
  • systems/juicefs — new page (JuiceFS metadata+chunk model; Fly.io's hacked-up fork with SQLite metadata).
  • systems/litestream — extended: Sprite storage metadata DB + per-account orchestrator metadata DB.
  • systems/corrosion-swim — extended: Sprite Anycast URL propagation.
  • systems/sqlite — extended: Sprite storage metadata + per-account orchestrator metadata.
  • systems/aws-s3 — extended: Sprite disk-root substrate.
  • systems/phoenix-framework — extended: Sprites orchestrator.
  • systems/lsvd — extended: named as the disk-off-object-storage-for-Fly-Machines experiment whose perf ceiling Sprites route around.
  • systems/fly-proxy — extended: Sprites Anycast URL serving.

Concepts extracted

  • concepts/no-container-image-sprite — canonical statement (takeaway 2).
  • concepts/object-storage-as-disk-root — canonical statement (takeaway 3).
  • concepts/read-through-nvme-cache — canonical statement (takeaway 5).
  • concepts/durable-state-as-url — canonical statement (takeaway 6).
  • concepts/fast-checkpoint-via-metadata-shuffle — canonical statement (takeaway 7).
  • concepts/inside-out-orchestration — canonical statement (takeaway 8).
  • concepts/inner-container-vm — canonical statement (takeaway 8).
  • concepts/fly-machine-start-vs-create — reframed: Sprites collapse user-visible create onto the platform's pre-paid create (takeaway 1).
  • concepts/bus-hop-storage-tradeoff — reversed: NVMe stays on the hot path but loses its durability role (takeaway 5).

Patterns extracted

  • patterns/warm-pool-zero-create-path — instantiated by per-worker pools of "empty" Sprites (takeaway 1).
  • patterns/metadata-plus-chunk-storage-stack — instantiated with SQLite + Litestream in the metadata slot (takeaway 4).
  • patterns/checkpoint-as-metadata-clone — instantiated by O(metadata) checkpoint/restore (takeaway 7).
  • patterns/blast-radius-in-vm-not-host — canonical statement (takeaway 9).
  • patterns/async-block-clone-for-stateful-migration — referenced: the Machine-migration saga that durable-state-as-URL renders unnecessary (takeaway 6).

Caveats

  • Architecture-deferral. Ptacek: "I could easily do another 1500-2000 words here on the Cronenberg film Kurt came up with for the actual storage stack, but because it's in flux, let's keep it simple." Explicit signal that the storage-stack implementation is actively evolving; claims here may be superseded by a follow-up.
  • No disclosed numbers for: warm-pool depth per worker, pool-refill lead time, NVMe cache hit rate, cache-miss read amplification, storage-stack write path latency, checkpoint restore wall-clock under load, Corrosion-update propagation time for Sprite URL creation, per-account SQLite DB size ceiling, metadata-DB IO ceiling per worker.
  • Under the hood, Sprites are still Fly Machines today. "Now, today, under the hood, Sprites are still Fly Machines." The inside-out story is a Fly-Machine-tenant architecture; the promised local-runtime swap ("Jerome's working on…") is not shipped.
  • Kernel-isolation primitive still unnamed. Launch-post caveat unchanged — the post doesn't say whether the VM primitive is Firecracker, Cloud Hypervisor, or something else. "Sprites are still Fly Machines" implies the existing Fly Machines KVM / micro-VM layer, but this isn't explicitly stated.
  • dm-cache-like, not dm-cache. "Our stack sports a dm-cache-like feature" — the cache stack is custom, not a straight dm-cache deployment. Eviction / write-policy not disclosed.
  • No threat-model disclosure. The inside-out model moves orchestration code into the VM alongside user code; the post sketches the mechanism ("We've slid a container between you and the kernel") but doesn't enumerate the attack surface (container escape, namespace-privilege leaks, root-namespace-service authentication between inner and outer).
  • Tigris not named as the S3-compatible object store, though Fly.io's regional Tigris is the obvious candidate; S3-compatibility is the only property named.
  • Prototype-then-ship workflow is aspirational. "An automated workflow for that will happen." Not shipped.

Source
