
SYSTEM Cited by 1 source

Phoenix.new

Phoenix.new is Fly.io's in-browser coding agent tailored to Elixir's Phoenix framework. It was introduced on 2025-06-20 in a blog post by Chris McCord (Phoenix's creator) after a months-long skunkworks effort at Fly.io. Every session runs in its own ephemeral Fly Machine (a Firecracker micro-VM) and provides: a root shell shared between the developer and the agent; an integrated full (not headless-only) Chrome browser that the agent drives to verify its own front-end changes; preview URLs under phx.run, produced automatically from any bound port via integrated port-forwarding; and the GitHub gh CLI, pre-installed so the agent can clone repos, browse issues, and open PRs.

Architecture (from the 2025-06-20 post)

  • Ephemeral per-session VM. A browser client opens a session → a Fly Machine boots → the user enters a VSCode-style interface with a "shell button" that drops them onto the same VM the agent runs in. "You just open up the VSCode interface, push the shell button, and there you are, on the isolated machine you share with the Phoenix.new agent."
  • Root-shell co-tenancy. The developer and the agent share the shell. The agent can run mix deps.get, mix phx.server, and mix test, and if it wants to apt install a base-OS package, it does that too. Containment comes from the Firecracker micro-VM boundary, not from a narrow per-tool permission system. See concepts/agent-with-root-shell.
  • Full browser as agent tool. Phoenix.new's VM runs a real Chrome the agent drives headlessly for its own iteration; the browser is simultaneously exposed in the UI as a live preview the developer watches. "Because it's a full browser, instead of trying to iterate on screenshots, the agent sees real page content and JavaScript state – with or without a human present." Canonical concepts/agent-driven-browser instance.
  • Closed log + browser + test loop. "When Phoenix.new boots an app, it watches the logs, and tests the application. When an action triggers an error, Phoenix.new notices and gets to work." Three-signal closed loop (server logs + browser DOM/JS state + test exit codes); canonical three-way concepts/agentic-development-loop instance.
  • *.phx.run preview URLs. "We detect anything the agent generates with a bound port and give it a preview URL underneath phx.run, with integrated port-forwarding." Any port bound inside the VM becomes a shareable preview URL on a platform-owned subdomain without a deploy step. Canonical concepts/ephemeral-preview-url instance. Other .phx.run tabs the developer has open update live as the agent works.
  • GitHub integration via gh CLI. gh is pre-installed; authorising it for internal repos gives the agent reach into the user's existing team projects, dependencies, issues, and PR workflows.
  • Platform substrate. Apps inherit "all the infrastructure guardrails of Fly.io: hardware virtualization, WireGuard, and isolated networks" — the session VM is a normal Fly Machine on Fly's normal network posture.
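
The three-signal closed loop described above (server logs, browser state, test exit codes) can be sketched roughly as follows. This is a minimal illustration, not Phoenix.new's implementation: the post does not disclose the agent loop, so the Signals type, the first_failure and run_loop functions, the "[error]" log marker check, and the max_rounds cap are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Signals:
    """One observation of the running app. All field names are hypothetical."""
    server_log_lines: List[str]        # recent lines from the app server's log
    browser_console_errors: List[str]  # JS errors seen in the agent's browser
    test_exit_code: int                # exit status of the last test run


def first_failure(signals: Signals) -> Optional[str]:
    """Return a description of the first failing signal, or None if all are clean."""
    for line in signals.server_log_lines:
        # Assumption: error-level log lines carry an "[error]" marker.
        if "[error]" in line:
            return f"server log: {line}"
    if signals.browser_console_errors:
        return f"browser: {signals.browser_console_errors[0]}"
    if signals.test_exit_code != 0:
        return f"tests: exit code {signals.test_exit_code}"
    return None


def run_loop(observe: Callable[[], Signals],
             fix: Callable[[str], None],
             max_rounds: int = 5) -> bool:
    """Observe, fix the first failure, and repeat until clean or out of rounds."""
    for _ in range(max_rounds):
        failure = first_failure(observe())
        if failure is None:
            return True   # all three signals are clean
        fix(failure)      # hand the failure to the (hypothetical) agent step
    return False          # gave up after max_rounds attempts
```

In this sketch, observe would gather the three signals from the session VM and fix would be whatever the agent does in response; both are passed in as callables precisely because the real mechanism is not documented.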

What Phoenix.new ships on top of the substrate

Phoenix.new itself is the VM image plus the agent (system prompt, tool set, and loop). The post doesn't disclose the agent-loop internals (which model family, prompt chain, tool schema, or retry policy). What it does disclose:

  • Elixir/Phoenix expertise in the system prompt: the agent handles Channels, Phoenix's Presence, LiveView, Ecto schemas, and "real databases".
  • Database-aware onboarding. Given a DATABASE_URL, the agent uses psql to explore schemas, proposes apps based on what it finds, and models Ecto schemas off the live DB. "If MySQL is your thing, the agent will just apt install a MySQL client and go to town."
  • Language-agnostic under the tuning. "Our system prompt is tuned for Phoenix today, but all languages you care about are already installed." Rails / Expo React Native / Svelte / Go work out of the box; new language / framework tuning is on the roadmap.
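
The database-aware onboarding step above implies some mapping from a DATABASE_URL to a command-line client (psql for Postgres, a MySQL client otherwise). A minimal sketch of that decision, assuming the URL scheme identifies the database; the CLIENTS table and the client_for function are hypothetical names for illustration, and the post does not say how the agent actually makes this choice:

```python
from urllib.parse import urlparse

# Hypothetical mapping from URL scheme to the CLI client used to explore the DB.
CLIENTS = {
    "postgres": "psql",
    "postgresql": "psql",
    "mysql": "mysql",
}


def client_for(database_url: str) -> str:
    """Pick a CLI client from the scheme of a DATABASE_URL-style string."""
    scheme = urlparse(database_url).scheme.lower()
    try:
        return CLIENTS[scheme]
    except KeyError:
        # Schemes outside the table are rejected rather than guessed at.
        raise ValueError(f"no known client for scheme {scheme!r}")
```

For example, client_for("postgres://user:pw@db.internal:5432/app") selects psql, while a mysql:// URL selects the mysql client, which per the post the agent would install with apt if it were missing.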

How Phoenix.new fits on this wiki

Contrast with adjacent shapes

  • vs screenshot-iterating coding agents (early Cursor agents, Devin-like surfaces): Phoenix.new's agent sees DOM + JS state + network + server logs, not rasterised snapshots.
  • vs capability-sandboxed runtimes (Cloudflare Project Think Dynamic Workers): Phoenix.new reaches containment through a whole-OS disposable perimeter (root shell, ephemeral VM); Project Think reaches it through a capability manifest scoping a long-lived runtime. Both are valid answers to "how do you let an agent run arbitrary code safely".
  • vs long-lived cloud IDEs (GitHub Codespaces, Gitpod): both are browser-delivered remote-VM IDEs (concepts/cloud-ide), but Codespaces / Gitpod are primarily human-centered with a persistent workspace; Phoenix.new is agent-centered with a per-session ephemeral VM. The primary user is the agent.

Caveats

  • Product-launch post; no operational numbers (VM boot time, per-session cost, concurrency ceiling, PR-merge rate).
  • Agent-loop internals (model, prompt chain, tool-schema, retry policy, context-window management) not disclosed.
  • Preview-URL routing path (how *.phx.run maps to session Machines via fly-proxy + port-forward) not sketched.
  • The security model for the agent-with-root-shell posture is described at the VM-boundary level only. There is no threat model for what an attacker who subverts the agent can reach over WireGuard during the session window, and no disclosure of gh token scoping.
  • Language-agnostic claim is about toolchains installed, not agent competence across stacks.
