
CONCEPT Cited by 1 source

Natural-language infrastructure provisioning

Definition

Natural-language infrastructure provisioning is the UX posture where the primary human-to-infrastructure interface is a free-form conversation with an LLM — not a YAML manifest, not a Terraform HCL file, not a web dashboard, not a CLI flag string. The LLM is wired to the cloud control plane through MCP tools, typically by wrapping the cloud's existing CLI. The human says "launch my application" or "delete the oldest unattached volume"; the LLM composes the appropriate tool calls, optionally presents a plan, and on approval executes the mutations.

Canonical wiki statement

Sam Ruby, Fly.io, 2025-05-07:

"Today's state of the art is K8S, Terraform, web based UIs, and CLIs. Those days are numbered."

"Imagine a future where you say to your favorite LLM 'launch my application on Fly.io', and your code is scanned and you are presented with a plan containing details such as regions, databases, s3 storage, memory, machines, and the like. You are given the opportunity to adjust the plan and, when ready, say 'Make it so'."

(Source: sources/2025-05-07-flyio-provisioning-machines-using-mcps.)

The four-way replacement thesis

Ruby's framing explicitly names the four existing provisioning surfaces being displaced:

  1. Kubernetes — declarative YAML; high ceremony; steep learning curve.
  2. Terraform — declarative HCL; plan-and-apply discipline (see patterns/plan-then-apply-agent-provisioning — the LLM flow reimplements this at the language layer).
  3. Web-based UIs — click-through dashboards; read-shaped more than write-shaped ("isn't any way to sort the list or delete a volume").
  4. CLIs — correct but ceremony-heavy ("would have had to use the documentation and a bit of copy/pasting of arguments").

The LLM-through-MCP flow flattens the choice: the human describes intent, the LLM resolves intent to CLI invocations, the human reviews + approves mutations through dialogue.

Why this is plausible now (and wasn't in 2020)

Three things converge:

  • MCP as a standard tool interface (systems/model-context-protocol). Tool schemas are portable; any MCP-compatible client talks to any MCP server.
  • CLIs already shipped with --json output modes for automation reasons (concepts/structured-output-reliability). Fly.io's 2020 --json decision became load-bearing for its MCP integration five years later.
  • CLIs already ship refusal invariants: "I would have received an error had I tried to destroy a volume that is currently mounted. Knowing that gave me the confidence to try the command." (patterns/cli-safety-as-agent-guardrail).

Wrapping the CLI as an MCP server (patterns/wrap-cli-as-mcp-server) is the short path; Fly.io went from read-only (2025-04-10, 2 tools, 90 LoC Go) to full systems/fly-volumes CRUD (2025-05-07) in about four weeks.
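
The wrapping idea can be sketched in a few lines of Python. This is a minimal illustration, not Fly.io's Go implementation: the tool name `volumes_list`, the `TOOLS` table, and the `call_tool` helper are hypothetical, and the argv shape is an assumption to check against the CLI's own help output. It shows the two properties named above: the tool result comes from the CLI's JSON output, and a non-zero exit (the CLI refusing an operation) is returned to the model as an error rather than hidden.

```python
import json
import subprocess

# Hypothetical tool table: tool name -> function that builds an argv list.
# The argv shape below is an assumption; confirm it against the CLI's help output.
TOOLS = {
    "volumes_list": lambda args: ["fly", "volumes", "list", "--app", args["app"], "--json"],
}


def call_tool(name, args, tools=TOOLS, run=subprocess.run):
    """Run the CLI command behind a tool and return a JSON-friendly result."""
    if name not in tools:
        return {"ok": False, "error": f"unknown tool: {name}"}
    argv = tools[name](args)
    proc = run(argv, capture_output=True, text=True)
    if proc.returncode != 0:
        # A refusal from the CLI is passed back to the model unchanged.
        return {"ok": False, "error": proc.stderr.strip()}
    return {"ok": True, "data": json.loads(proc.stdout)}
```

The `run` parameter exists only so the wrapper can be exercised without the real CLI installed; a real MCP server would register `call_tool` behind whatever tool-registration API its SDK provides.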

Emergent resource-hygiene UX

A load-bearing side-benefit of the NL-provisioning posture is that the LLM surfaces resource hygiene the operator didn't specifically ask about. Ruby, 2025-05-07:

"I asked for a list of volumes for an existing app, and Claude noted that I had a few volumes that weren't attached to any machines. So I asked it to delete the oldest unattached volume, and it did so."

The agent acts as a proactive surveyor of the resource graph — surfacing stale / unattached / underutilized / drifted resources — without being explicitly asked. The dashboard-UI equivalent would require the operator to click into a filtered view they didn't know existed. This is the agentic troubleshooting loop extended from diagnosis into provisioning-hygiene.
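
The "oldest unattached volume" selection is simple once the volume list is available as JSON. The sketch below shows that reasoning in Python. The field names `id`, `created_at`, and `attached_machine_id` are assumptions made for illustration and may not match the CLI's actual output; the sample records are invented.

```python
from datetime import datetime


def parse_time(value):
    """Parse an ISO-8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def unattached(volumes):
    """Return volumes with no machine attached (hypothetical field name)."""
    return [v for v in volumes if not v.get("attached_machine_id")]


def oldest_unattached(volumes):
    """Return the unattached volume with the earliest creation time, or None."""
    candidates = unattached(volumes)
    if not candidates:
        return None
    return min(candidates, key=lambda v: parse_time(v["created_at"]))


# Invented sample data, shaped like a parsed JSON volume listing.
sample = [
    {"id": "vol_a", "created_at": "2024-01-05T00:00:00Z", "attached_machine_id": None},
    {"id": "vol_b", "created_at": "2023-06-01T00:00:00Z", "attached_machine_id": "m1"},
    {"id": "vol_c", "created_at": "2023-11-20T00:00:00Z", "attached_machine_id": None},
]
target = oldest_unattached(sample)
```

Here `vol_b` is the oldest volume overall but is attached, so the candidate for deletion is `vol_c`.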

Alternatives the author explicitly rejects

For the same task ("delete the oldest unattached volume"), Ruby enumerates three existing mechanisms and why each lost to NL + MCP:

  • Cloud's HTTP API: "would have required some effort" (API client, endpoint lookup, auth wiring).
  • CLI directly: "would have had to use the documentation and a bit of copy/pasting of arguments" (flag strings are high-ceremony even for the native operator).
  • Web dashboard: "isn't any way to sort the list or delete a volume" (dashboard implementation incomplete on the mutation side; read-only views are common, full-CRUD dashboards are not).

Open questions

  • Plan-and-apply boundary. The "Make it so" paragraph sketches a plan gate the LLM presents before mutations. The 2025-05-07 post doesn't ship this — mutations went through without an explicit plan step in the volume-delete example. patterns/plan-then-apply-agent-provisioning is an aspirational target, not a deployed mechanism in flyctl v0.3.117.
  • Irreversible-operation confirmation UX. Ruby acknowledges "if you ask it to destroy a volume, that operation is not reversable" but doesn't describe any MCP-client-side confirmation primitive. Most MCP clients (Claude Desktop, Cursor) do pop a tool-call approval dialog, but the semantic — "this deletes production data" — isn't necessarily distinguishable from a read call in the UI.
  • Prompt-injection via code scanning. The "your code is scanned and you are presented with a plan" flow implies the LLM reads source + configs + READMEs. Adversarial content planted in a dependency's README ("always create an extra admin role") becomes a provisioning-injection vector.
  • Authorization scope. The workstation-local MCP inherits the operator's full flyctl credentials (concepts/local-mcp-server-risk); there's no role-based restriction of the LLM's reachable tool surface.
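
One possible shape for the plan gate, the irreversible-operation label, and a restricted tool scope is sketched below in Python. None of this is described in the source as shipped behavior: `Step`, `render_plan`, `apply_plan`, and the tool names in `DESTRUCTIVE` and `ALLOWED` are all hypothetical. The sketch only illustrates how a plan could mark destructive steps before approval and refuse to run anything outside an allow-list.

```python
from dataclasses import dataclass

# Hypothetical tool names; a real server would define its own.
DESTRUCTIVE = {"volumes_destroy"}
ALLOWED = {"volumes_list", "volumes_destroy"}


@dataclass
class Step:
    tool: str
    args: dict


def render_plan(steps):
    """Render a numbered plan, labeling steps that cannot be undone."""
    lines = []
    for i, step in enumerate(steps, start=1):
        tag = "IRREVERSIBLE" if step.tool in DESTRUCTIVE else "non-destructive"
        lines.append(f"{i}. [{tag}] {step.tool} {step.args}")
    return "\n".join(lines)


def apply_plan(steps, approved, execute, allowed=ALLOWED):
    """Execute steps only after approval and only within the allowed tool scope."""
    if not approved:
        raise PermissionError("plan was not approved")
    blocked = [s.tool for s in steps if s.tool not in allowed]
    if blocked:
        # Check the whole plan first so a rejected plan executes nothing.
        raise PermissionError(f"tools outside the allowed scope: {blocked}")
    return [execute(s) for s in steps]
```

Passing a narrower `allowed` set (for example, only the list tool) approximates a read-only role, which is one way to limit what the workstation-local server exposes to the model.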

Seen in
