PATTERN Cited by 2 sources
Wrap CLI as MCP server¶
Pattern¶
Expose an existing CLI as an LLM tool surface by writing a thin MCP server that:
- picks a small subset of CLI subcommands (typically read-only) and registers each as an MCP tool,
- invokes the CLI in a subprocess per tool call, passing through any LLM-supplied arguments,
- captures stdout (usually with --json or an equivalent structured-output flag; see concepts/structured-output-reliability) and returns it verbatim as the MCP tool response,
- uses MCP stdio transport so the MCP server itself is launched by the client (Claude Desktop, Claude Code, Cursor, Goose, …) as a subprocess, no HTTP/auth layer required,
- inherits the operator's existing CLI credentials: whatever's already in ~/.config/<cli>/, the environment, or the CLI's login state becomes the auth boundary.
The canonical wiki instance is Fly.io's flymcp
— "the most basic MCP server for flyctl", ~90 lines of Go
(using mark3labs/mcp-go),
built in ~30 minutes, exposing exactly two flyctl
subcommands: fly logs + fly status. (Source:
sources/2025-04-10-flyio-30-minutes-with-mcp-and-flyctl.)
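The subprocess-per-call core of the pattern can be sketched in a few lines. This is a minimal illustration in stdlib Python, not the flymcp source (which is Go on mark3labs/mcp-go). The TOOLS table, the run_tool name, and the exact flags (--app, --no-tail, --json) are assumptions to check against the wrapped CLI's own help output.

```python
import subprocess

# Hypothetical tool table: MCP tool name -> read-only subcommand argv.
# Flag names are assumptions; verify them against the CLI's --help.
TOOLS = {
    "fly_logs": ["logs", "--no-tail"],
    "fly_status": ["status"],
}


def run_tool(name, app, cli=("fly",), json_flag="--json", timeout=60):
    """Spawn the CLI once for this tool call and return its stdout verbatim."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    argv = [*cli, *TOOLS[name], "--app", app, json_flag]
    # env is not overridden, so the child inherits the operator's environment
    # and, with it, whatever login state the CLI already has.
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    if proc.returncode != 0:
        raise RuntimeError(proc.stderr.strip() or f"{argv[0]} exited {proc.returncode}")
    return proc.stdout
```

The cli parameter exists so the function can be exercised against a stand-in executable; an MCP library's tool handler would call run_tool and hand the returned string back as the tool response.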
Why it's viable¶
Four things converge to make this pattern a one-afternoon move rather than a quarter-long project:
- Mature MCP libraries exist per language. mark3labs/mcp-go, Python MCP SDK, TypeScript MCP SDK — each handles the protocol plumbing so the CLI-wrap author just declares tools + handlers.
- Stdio transport eliminates the distributed-systems burden. No session affinity (concepts/mcp-long-lived-sse), no auth token exchange, no rate-limiting tier, no multitenancy. The operator's desktop MCP client launches the wrapper as a subprocess; when the client exits, the wrapper exits.
- JSON-mode-as-already-done. If the CLI has a --json or -o json flag, the structured-output problem is solved. See concepts/structured-output-reliability. Fly.io's 2020 decision to give most flyctl commands a --json mode was load-bearing for flymcp five years later: "I don't know how much of a difference it made" (it made all the difference).
- LLM planners compose read-only observability primitives well. Two tools (logs + status) turned out to be enough to reproduce an experienced SRE's incident-diagnosis flow against a Fly-hosted CDN app.
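To make the stdio point concrete, here is a rough sketch of the loop an MCP library runs on the wrapper's behalf: newline-delimited JSON-RPC 2.0 messages read from stdin and answered on stdout. The message shapes follow my reading of the MCP specification for tools/list and tools/call. The initialize handshake is deliberately omitted, and fake_status is a stand-in for a handler that would really spawn the CLI. Treat it as an illustration of why no network layer is involved, not as a replacement for an SDK.

```python
import json
import sys


def fake_status(arguments):
    # Stand-in handler; a real one would spawn the CLI in a subprocess.
    return json.dumps({"app": arguments.get("app"), "status": "deployed"})


HANDLERS = {"status": fake_status}


def _error(request_id, code, message):
    return {"jsonrpc": "2.0", "id": request_id,
            "error": {"code": code, "message": message}}


def handle(request):
    """Map one JSON-RPC request dict to a response dict (None for notifications)."""
    if "id" not in request:
        return None  # notifications carry no id and get no reply
    request_id = request["id"]
    method = request.get("method")
    params = request.get("params") or {}
    if method == "tools/list":
        tools = [{"name": name, "inputSchema": {"type": "object"}} for name in HANDLERS]
        return {"jsonrpc": "2.0", "id": request_id, "result": {"tools": tools}}
    if method == "tools/call":
        handler = HANDLERS.get(params.get("name"))
        if handler is None:
            return _error(request_id, -32602, "unknown tool")
        text = handler(params.get("arguments") or {})
        result = {"content": [{"type": "text", "text": text}], "isError": False}
        return {"jsonrpc": "2.0", "id": request_id, "result": result}
    return _error(request_id, -32601, "method not found")


def serve(stdin=sys.stdin, stdout=sys.stdout):
    """Answer one JSON message per line until the client closes stdin."""
    for line in stdin:
        if not line.strip():
            continue
        response = handle(json.loads(line))
        if response is not None:
            stdout.write(json.dumps(response) + "\n")
            stdout.flush()
```

When the client closes the pipe, the for loop ends and the process exits, which is the whole lifecycle story for a stdio wrapper.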
Canonical instance — flymcp + unpkg¶
Claude, given only fly logs and fly status as tools, produced
without further prompting:
- the global topology of unpkg (10 Fly Machines across 11 regions: lax, atl, ewr, lhr, cdg, ams, sin, nrt, hkg, bog, syd),
- criticality classification of 2 machines in non-healthy status ("context deadline exceeded", "gone"),
- oom_killed event correlation across multiple machines,
- on the follow-up prompt "try getting logs for one of the critical machines", a per-second incident timeline from OOM kill → SIGKILL → reboot → health-check fail → listener up → health-check pass, ~43 seconds end-to-end,
- the specific kernel OOM line with RSS + process numbers: "Out of memory: Killed process 641 (bun) total-vm:85950964kB, anon-rss:3744352kB, …" and the memory ceiling diagnosis: Bun at ~3.7 GB of 4 GB allocated.
Ptacek's read: "annoyingly useful … faster than I find problems in apps." See concepts/agentic-troubleshooting-loop for the planner-executor loop shape this instantiates.
Pattern elements¶
- Tool picker. Choose 1–5 read-only subcommands initially. Fewer tools = more accurate LLM tool selection (patterns/tool-surface-minimization) and smaller context-window footprint.
- Read-only posture. Gate mutations behind a second tier (patterns/allowlisted-read-only-agent-actions) or leave them out entirely for v1. Blast radius of LLM hallucination should be bounded to "wrong conclusion", not "destroyed machine".
- Structured-output flag. --json / -o json / --format json. The wrapper should pass this flag unconditionally; the LLM never sees pretty-printed human tables.
- Subprocess-per-call. No need for a long-running CLI daemon; spawn fresh per tool invocation. Keeps the wrapper stateless and easy to reason about.
- Pass-through credentials. Don't reinvent auth. The operator already ran flyctl auth login / aws configure / gcloud auth / kubectl config; the wrapper inherits it by inheriting env and ~/.config.
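One way to combine the tool-picker and read-only elements is to build the argv from an explicit allowlist, so that LLM-supplied values can only fill known flags. The sketch below is an assumption-laden illustration: the ALLOWED table, the build_argv name, and the --app / --region flags are hypothetical choices for the example, and the value pattern is deliberately strict.

```python
import re

# Hypothetical allowlist: subcommand -> {parameter name: CLI flag}.
# Subcommands and flags not listed here cannot be reached by the LLM.
ALLOWED = {
    "logs": {"app": "--app", "region": "--region"},
    "status": {"app": "--app"},
}

# Values must start with an alphanumeric, so they can never look like a flag.
SAFE_VALUE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def build_argv(tool, params, cli="fly", json_flag="--json"):
    """Translate LLM-supplied parameters into an argv list, or raise ValueError."""
    if tool not in ALLOWED:
        raise ValueError(f"tool not allowlisted: {tool}")
    argv = [cli, tool]
    for key, value in params.items():
        flag = ALLOWED[tool].get(key)
        if flag is None:
            raise ValueError(f"parameter not allowlisted for {tool}: {key}")
        if not isinstance(value, str) or not SAFE_VALUE.fullmatch(value):
            raise ValueError(f"rejected value for {key}")
        argv += [flag, value]
    argv.append(json_flag)  # always request structured output
    return argv
```

Because the result is an argv list passed straight to the subprocess call rather than a shell string, there is no shell quoting to get wrong; the allowlist only has to stop unexpected subcommands and flag-shaped values.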
Generalisation¶
The pattern clearly extends beyond flyctl. Any CLI with
--json mode is a candidate: kubectl, aws, gcloud, gh,
doctl, linode-cli, heroku, pulumi, terraform, fastly,
netlify, vercel. Fly.io doesn't claim generality in the
post, but the 90-LoC-Go-wrapper shape obviously ports.
Limiting factor: the CLI's JSON output quality. Some CLIs
have partial or inconsistent JSON support; some wrap everything
in a single top-level blob that's hard for an LLM to navigate
without a further unwrap tool; some interleave log lines into
stdout alongside the JSON result. The smoother the --json, the
smaller the wrapper.
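For CLIs that interleave log lines with the JSON result, a wrapper can attempt a best-effort extraction before giving up. The following is a heuristic sketch, not something the flymcp post describes: extract_json is a hypothetical helper that returns the first object or array it can decode, and it can be fooled by log prefixes that happen to be valid JSON (for example a line starting with [1]).

```python
import json


def extract_json(stdout):
    """Return the first JSON object or array embedded in noisy CLI output."""
    decoder = json.JSONDecoder()
    for index, char in enumerate(stdout):
        if char in "{[":
            try:
                # raw_decode parses one value starting at index and ignores the rest.
                value, _end = decoder.raw_decode(stdout, index)
            except json.JSONDecodeError:
                continue  # e.g. a "[info] ..." log prefix; keep scanning
            return value
    raise ValueError("no JSON value found in output")
```

When the CLI's --json output is clean, none of this is needed and the wrapper can return stdout unchanged.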
Trade-offs vs alternatives¶
vs. OpenAPI-spec-based MCP (Cloudflare's cf-cli framing — expose API directly as MCP tools): OpenAPI gives full-surface exposure automatically but explodes the tool count and the context-window cost. Wrap-CLI gives manual selection + built-in read-only cultural default, at the cost of only covering what the CLI already exposes.
vs. Code Mode (CF Code Mode — fit thousands of operations into one tool by giving the LLM a programming environment): Code Mode is the right answer at ~3000-op scale. Wrap-CLI is the right answer at <10-op scale with a <1-hour budget.
vs. HTTP/SSE MCP server: stdio wrappers are single-tenant, don't outlive the client process that launched them, and inherit the operator's full CLI credentials. For operator-driven troubleshooting this is a feature, not a bug. For shared team or CI use, HTTP/SSE with patterns/session-affinity-for-mcp-sse is necessary.
Risks¶
- Local MCP server security (concepts/local-mcp-server-risk). The operator is giving a cloud LLM instance the ability to run native binaries on their workstation. Even a nominally read-only tool surface is one "let me try one more thing" prompt-injection away from misbehaviour. Ptacek's explicit caveat: "Local MCP servers are scary. I don't like that I'm giving a Claude instance in the cloud the ability to run a native program on my machine."
- Natural mitigation: patterns/disposable-vm-for-agentic-loop — run the wrapped CLI inside a throwaway Fly Machine / Cloud Hypervisor micro-VM / Firecracker sandbox, not on the operator's laptop. The Fly.io 2025-02-07 VSCode-SSH post sketches exactly this shape.
- LLM hallucination on novel incidents. The OOM-on-Bun case is nicely demonstrated but self-evidently well-represented in the training corpus. Accuracy on rarer failure shapes is not measured.
- No tool-call rate limiting in stdio mode. A poorly prompted agent can spin on fly logs of different machines; nothing in the wrapper caps cost.
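If the missing rate limit matters, a cap can live inside the wrapper itself. The sketch below is one possible shape, assuming a simple rolling-window budget; CallBudget and its parameters are hypothetical and not part of flymcp or any MCP library.

```python
import time


class CallBudget:
    """Allow at most `limit` tool calls per rolling `window` seconds."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the budget can be tested without sleeping
        self.calls = []     # timestamps of recently allowed calls

    def allow(self):
        now = self.clock()
        # Forget calls that have aged out of the window.
        self.calls = [t for t in self.calls if now - t < self.window]
        if len(self.calls) >= self.limit:
            return False
        self.calls.append(now)
        return True
```

A tool handler would check allow() before spawning the CLI and, when it returns False, reply with an error result telling the model to slow down instead of running the subprocess.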
Seen in¶
- sources/2025-04-10-flyio-30-minutes-with-mcp-and-flyctl — canonical instance (flymcp / 2 tools / 90 LoC Go / 30 min / unpkg incident-diagnosis demo).
- sources/2025-05-07-flyio-provisioning-machines-using-mcps
— mutation transition (~27 days later): the same flyctl
MCP server now exposes the full fly volumes subcommand family (create / list / extend / fork / snapshots / destroy), shipped in flyctl v0.3.117. First wiki instance of the pattern crossing the read-only → production-mutation boundary. Load-bearing safety claim: CLI-level refusal invariants ("can't destroy a mounted volume") become the agent guardrail at zero cost; see patterns/cli-safety-as-agent-guardrail. Pair-post to the 2025-04-10 instance.
Related¶
- systems/fly-flyctl — the CLI being wrapped.
- systems/model-context-protocol — the protocol.
- concepts/agent-ergonomic-cli — Cloudflare's parallel framing for the same upstream design pressure.
- concepts/structured-output-reliability — what --json buys that makes the wrap trivial.
- concepts/agentic-troubleshooting-loop — the usage pattern the wrapped tool surface enables.
- concepts/local-mcp-server-risk — the security posture this pattern runs into.
- patterns/tool-surface-minimization — sibling design rule.
- patterns/allowlisted-read-only-agent-actions — sibling blast-radius-bounding rule.
- patterns/disposable-vm-for-agentic-loop — the sandbox answer to the local-MCP security concern.
- patterns/plan-then-apply-agent-provisioning — the mutation-surface UX complement (the 2025-05-07 "Make it so" flow).
- patterns/cli-safety-as-agent-guardrail — the zero-cost guardrail that lets the wrapped CLI safely cross into mutation authority ("can't destroy a mounted volume").
- concepts/natural-language-infrastructure-provisioning — the parent UX posture the wrap enables on the mutation side.
- companies/flyio.