CONCEPT Cited by 1 source

Stateful incremental VM build¶

Definition¶

A VM lifecycle where the running-VM filesystem is the unit of build, mutated iteratively after boot — packages installed, source files edited, systemd units added — rather than the container-image-before-boot unit where all such changes are baked into an immutable OCI image. LLM-driven coding agents overwhelmingly favour this shape: they boot a Machine from a minimal base image and then "build up" the environment through trial-and-error tool calls.

Canonical wiki statement¶

Fly.io, 2025-04-08:

Another weird thing that robot workflows do is to build Fly Machines up incrementally. This feels really wrong to us. Until we discovered our robot infestation, we'd have told you not to do to this. Ope!

A typical vibe coding session boots up a Fly Machine out of some minimal base image, and then, once running, adds packages, edits source code, and adds systemd units (robots understand systemd; it's how they're going to replace us). This is antithetical to normal container workflows, where all this kind of stuff is baked into an immutable static OCI container. But that's not how LLMs work: the whole process of building with an LLM is stateful trial-and-error iteration.

(Source: sources/2025-04-08-flyio-our-best-customers-are-now-robots)

Why LLMs do it this way¶

The build model an LLM can execute has to match the feedback signal the LLM can observe:

Immutable-image builds need a separate diagnosis loop. An OCI-image build fails at image-build time; the LLM has to read the build log, guess the fix, re-build. Build latency is minutes per iteration (image rebuild, push, pull, run).
Incremental-in-VM builds fail at the step that failed. The LLM does apt install …, reads stderr, fixes the package name, re-runs. Iteration latency is seconds. Each successful step accumulates in the live filesystem.

The combination — fast per-step feedback + accumulated state on the disk between steps — is what lets the LLM converge on a working Machine through trial-and-error. It's the agentic development loop applied to machine construction, not just code.
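The per-step loop described above can be sketched as follows. This is a minimal illustration, not anything taken from the Fly.io post: `StepResult`, `Session`, and `converge` are hypothetical names, and the list of candidate commands stands in for an LLM that would normally generate the next attempt after reading stderr.

```python
import subprocess
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StepResult:
    command: str
    returncode: int
    stderr: str


@dataclass
class Session:
    """Hypothetical session log: every attempt is recorded, pass or fail."""
    history: list = field(default_factory=list)

    def run(self, command: str) -> StepResult:
        # Run one step in the live environment and capture the feedback signal.
        proc = subprocess.run(command, shell=True, capture_output=True, text=True)
        result = StepResult(command, proc.returncode, proc.stderr)
        self.history.append(result)
        return result


def converge(session: Session, candidates: list) -> Optional[StepResult]:
    """Try candidate commands in order and stop at the first that succeeds.

    Assumption: in a real agent the next candidate would be produced by the
    model from result.stderr; here the candidates are supplied up front.
    """
    for command in candidates:
        result = session.run(command)
        if result.returncode == 0:
            return result  # the step's side effects now persist on disk
    return None
```

Each call to `run` costs one process launch rather than an image rebuild, and whatever a successful command wrote stays on the filesystem for the next step.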

Why Fly.io initially told people not to do this¶

Immutable-container dogma is deliberate: it buys reproducibility, rollback, dense packing, cache friendliness, and blast-radius containment. A running-VM filesystem drifts. Two Machines booted from the same base image and run through the same build script land in nearly the same state, but never exactly the same one. For a fleet of human-built services this is an operational problem: failures don't reproduce.
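To make "nearly the same state but never exactly" concrete, drift between two filesystem trees can be measured by hashing every file and diffing the results. The sketch below is illustrative only; `manifest` and `drift` are hypothetical helper names, and it compares file contents alone (not permissions, ownership, or timestamps).

```python
import hashlib
from pathlib import Path


def manifest(root: str) -> dict:
    """Map each regular file's relative path to the SHA-256 of its contents."""
    base = Path(root)
    result = {}
    for path in sorted(base.rglob("*")):
        # Skip directories and symlinks; only file contents are compared.
        if path.is_file() and not path.is_symlink():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            result[path.relative_to(base).as_posix()] = digest
    return result


def drift(a: dict, b: dict) -> dict:
    """Report paths that exist on one side only, or whose contents differ."""
    return {
        "only_in_a": sorted(a.keys() - b.keys()),
        "only_in_b": sorted(b.keys() - a.keys()),
        "changed": sorted(p for p in a.keys() & b.keys() if a[p] != b[p]),
    }
```

For two trees built from the same immutable image, all three lists would be empty; for two incrementally built trees, leftover files and divergent configs would show up under `only_in_*` and `changed`.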

Fly.io's retrospective framing ("This feels really wrong to us. Until we discovered our robot infestation, we'd have told you not to do to this") is a concession that the shape the LLM workload wants is the shape Fly.io recommended against. The concession is load-bearing because it forces the platform to treat filesystem storage (Fly Volumes) as first-class, even though Fly calls the filesystem "the one form of storage we sort of wish we hadn't done."

Storage primitive consequences¶

The shape drives storage-primitive demand:

Not Postgres. The dogma answer for humans ("just give people Postgres") misses. LLMs don't want a structured store; they want file-shape storage they can apt install into.
Filesystem. Fly Volumes (local NVMe, attached to one worker, bus-hop from the Machine).
Object storage. For larger / shareable artefacts — datasets, weights, build caches — that don't belong on the per-Machine volume. Tigris is Fly's answer here.

Fly's summary:

As product thinkers, our intuition about storage is "just give people Postgres". And that's the right answer, most of the time, for humans. But because LLMs are doing the Cursed and Defiled Root Chalice Dungeon version of app construction, what they really need is a filesystem, the one form of storage we sort of wish we hadn't done. That, and object storage.

(Source: sources/2025-04-08-flyio-our-best-customers-are-now-robots)

Pause / resume shape¶

Because the filesystem is the build artefact, the start/stop lifecycle matters: stop preserves the in-progress build state on the worker's NVMe; start resumes with that state intact. The LLM can pause for hours between bursts and resume where it left off. A create-from-image-every-time lifecycle would erase the build on every pause.
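The difference between the two lifecycles can be shown with a toy in-memory model. Nothing here calls a real Fly.io API: `ToyMachine`, `create_from_image`, and `BASE_IMAGE` are hypothetical stand-ins, and the root filesystem is reduced to a dictionary.

```python
from dataclasses import dataclass, field

# Hypothetical contents of a minimal base image.
BASE_IMAGE = {"/etc/os-release": "minimal"}


@dataclass
class ToyMachine:
    # Each new machine gets its own copy of the base image contents.
    rootfs: dict = field(default_factory=lambda: dict(BASE_IMAGE))
    running: bool = True

    def write(self, path: str, content: str) -> None:
        if not self.running:
            raise RuntimeError("machine is stopped")
        self.rootfs[path] = content

    def stop(self) -> None:
        # Stopping leaves the filesystem untouched.
        self.running = False

    def start(self) -> None:
        # Starting resumes with whatever was on disk at stop time.
        self.running = True


def create_from_image() -> ToyMachine:
    """Model of the create-every-time lifecycle: always back to the base image."""
    return ToyMachine()
```

In this model a `stop()` followed by `start()` keeps every file the session wrote, while `create_from_image()` returns a machine that contains only the base image.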

Operational trade-offs¶

Plus: fast iteration. Each step takes seconds, versus minutes per iteration for a rebuild-image-per-iteration loop, so the LLM converges far sooner.
Plus: no per-step push overhead. No registry round-trips per failed attempt.
Minus: not reproducible. Two LLMs iterating from the same base image land in different places.
Minus: not easily re-deployable. The output is a live Machine's filesystem, not a container image. To redeploy somewhere else, either snapshot the Machine or reconstruct the build as a Dockerfile, an Ansible playbook, or similar.
Minus: filesystem drift. Orphan files accumulate from failed-then-abandoned tool calls; the final filesystem has vestigial state from iteration history.

The post does not engage with the minuses; it's bullish on the shape because the shape is the workload.
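One way the re-deployability minus might be softened is to keep a log of the session's shell steps and replay only the successful ones as a Dockerfile. The sketch below assumes such a log exists; `Step` and `to_dockerfile` are hypothetical names, and the post describes no such mechanism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    command: str
    returncode: int


def to_dockerfile(base_image: str, steps: list) -> str:
    """Emit a Dockerfile that replays only the steps that exited with 0.

    Assumption: every step was a shell command. Edits made through other
    tools, and partial side effects of failed steps, are not captured.
    """
    lines = [f"FROM {base_image}"]
    lines += [f"RUN {step.command}" for step in steps if step.returncode == 0]
    return "\n".join(lines) + "\n"
```

The result is only an approximation of the live Machine: package versions may have moved since the session, and anything the failed attempts left behind is absent from the replay.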

Open questions¶

Snapshotting. At what point (if ever) does the LLM's Machine get snapshotted into a clean image for deployment? The post doesn't say. The implied answer may be "never; deploy the Machine itself."
Determinism. If the LLM's iteration pattern includes randomness (different tool picks, different package versions), the build is non-deterministic. The post does not discuss how to reproduce a session's final state.
Rollback. Fly Volumes are snapshotted on a schedule (daily by default), not after each iteration step; what happens when iteration step N+1 destroys something that was working at step N? The post doesn't address this.
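As an illustration of what per-step rollback could look like, the sketch below copies a working directory before each step and restores the copy if the step raises or a health check fails afterwards. It is an assumption-laden toy: `run_with_checkpoint` and the `still_works` callback are hypothetical, and a full directory copy stands in for whatever block-level snapshot a real platform would use.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def run_with_checkpoint(
    workdir: str,
    step: Callable[[Path], None],
    still_works: Callable[[Path], bool],
) -> bool:
    """Apply `step` to workdir; restore the pre-step copy if things break."""
    work = Path(workdir)
    with tempfile.TemporaryDirectory() as tmp:
        backup = Path(tmp) / "checkpoint"
        shutil.copytree(work, backup, symlinks=True)  # state as of step N
        try:
            step(work)                # step N+1 mutates the live tree
            ok = still_works(work)    # hypothetical health check
        except Exception:
            ok = False
        if not ok:
            # Discard the damaged tree and put the checkpoint back.
            shutil.rmtree(work)
            shutil.copytree(backup, work, symlinks=True)
        return ok
```

Copying the whole tree on every step is far too slow for a real root filesystem; the point is only that rollback needs a checkpoint taken at step granularity.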

Seen in¶

sources/2025-04-08-flyio-our-best-customers-are-now-robots — Fly.io's 2025-04-08 post names stateful incremental VM build as the storage-shape claim behind robots-as-customers; canonical wiki datum.

concepts/vibe-coding — the session shape this is the storage consequence of.
concepts/fly-machine-start-vs-create — the lifecycle primitive that makes the pause/resume pattern possible.
concepts/robot-experience-rx — the product-design axis this is an RX data point on.
concepts/agentic-development-loop — the loop structure this storage shape serves.
systems/fly-volumes — the filesystem primitive this shape demands.
systems/fly-machines — the compute primitive whose filesystem is the unit of build.
companies/flyio — canonical wiki source.