
CONCEPT Cited by 1 source

Agent-driven browser

Definition

Agent-driven browser is an agent-tooling pattern where an LLM agent controls a full browser (DOM + JavaScript state + navigation + event injection + network + console) as a first-class tool, rather than iterating on rasterised screenshots of a UI. The agent sees the same page-state surface a developer would see in Chrome DevTools, programmatically.

Canonical wiki statement

Fly.io, 2025-06-20, on Phoenix.new:

Phoenix.new includes, in both its UI and its agent tools, a full browser. The Phoenix.new agent uses that browser 'headlessly' to check its own front-end changes and interact with the app. Because it's a full browser, instead of trying to iterate on screenshots, the agent sees real page content and JavaScript state – with or without a human present.

(Source: sources/2025-06-20-flyio-phoenixnew-remote-ai-runtime-for-phoenix)

The load-bearing clause is "instead of trying to iterate on screenshots, the agent sees real page content and JavaScript state."

What the signal difference buys

A screenshot is a rasterisation: pixels. An agent reading one is effectively running image-to-text on a rendering of the DOM and inferring structure from it. That works for simple pages but fails on:

  • Dynamic content — an ambient loading spinner looks the same pixel-wise whether the network request succeeded or failed.
  • Form state — whether an input is validated, whether a button is disabled, whether a dropdown is open, is all attribute-level DOM state that doesn't necessarily show up visually.
  • JavaScript errors — an exception logged to console but not surfaced to the user is invisible in a screenshot; it's one console-log access away in a full browser.
  • Network state — XHR / WebSocket traffic is a key signal for LiveView / Phoenix Channels-heavy apps and lives entirely outside the screenshot.
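The form-state point can be illustrated with a minimal sketch. It assumes a hypothetical page fragment and uses Python's standard-library HTML parser as a stand-in for a live DOM query; the source does not say how the Phoenix.new agent actually reads this state.

```python
from html.parser import HTMLParser

# Hypothetical markup: a disabled submit button and an input flagged invalid.
# Neither fact is guaranteed to be visible in a screenshot.
PAGE = """
<form id="signup">
  <input name="email" value="not-an-email" aria-invalid="true">
  <button id="submit" disabled>Sign up</button>
</form>
"""


class StateCollector(HTMLParser):
    """Collects attribute-level state for form controls."""

    def __init__(self):
        super().__init__()
        self.controls = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("input", "button", "select", "textarea"):
            return
        attr_map = dict(attrs)
        self.controls.append({
            "tag": tag,
            "id": attr_map.get("id") or attr_map.get("name"),
            # A boolean attribute such as `disabled` parses with value None,
            # so test for presence rather than truthiness.
            "disabled": "disabled" in attr_map,
            "invalid": attr_map.get("aria-invalid") == "true",
        })


def collect_form_state(html):
    collector = StateCollector()
    collector.feed(html)
    return collector.controls


state = collect_form_state(PAGE)
for control in state:
    print(control)
```

The output is a handful of structured facts that can go straight into a prompt, with no inference from pixels required.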

Three-signal fusion

On Phoenix.new specifically, the agent fuses three signal streams from the same running session:

  1. Browser DOM + JS state (this concept) — via a CDP-driven full Chrome the agent operates.
  2. Server-side application logs — the Phoenix app running on the same VM streams logs to the agent.
  3. Test runner output: mix test exit codes and assertion diffs.

The 2025-06-20 post: "When Phoenix.new boots an app, it watches the logs, and tests the application. When an action triggers an error, Phoenix.new notices and gets to work." The browser + logs + tests triangulate a failure in a way any single signal can't.
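As a rough illustration of the triangulation idea, the sketch below records which of the three streams report a failure. The `Signals` type and the `triage` function are hypothetical names invented for this example; the post does not describe how Phoenix.new combines the streams internally.

```python
from dataclasses import dataclass, field


@dataclass
class Signals:
    """One snapshot of the three streams (hypothetical structure)."""
    console_errors: list = field(default_factory=list)     # browser side
    server_log_errors: list = field(default_factory=list)  # application logs
    test_exit_code: int = 0                                 # test runner


def triage(signals):
    """Return the names of the streams that currently report a failure."""
    failing = []
    if signals.console_errors:
        failing.append("browser")
    if signals.server_log_errors:
        failing.append("server_logs")
    if signals.test_exit_code != 0:
        failing.append("tests")
    return failing


# A front-end exception with clean logs and passing tests points at client code.
print(triage(Signals(console_errors=["TypeError: x is undefined"])))
```

Knowing which subset of streams is failing narrows where to look, which is the benefit a single signal cannot provide.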

Contrast with adjacent shapes

  • Screenshot-iterating agents (early Cursor agents, some Devin-like surfaces) — rasterised feedback; lower signal density.
  • CDP-over-network agents (Cloudflare Browser Rendering — systems/cloudflare-browser-rendering — via patterns/cdp-proxy-for-headless-browser) — same signal surface; the browser lives in a tenant-scoped service rather than colocated in the session VM. Cloudflare's MoltWorker (2026-01-29) is the canonical proxied instance; Phoenix.new (2025-06-20) is the colocated instance.
  • Playwright-MCP agents — same CDP signal surface, delivered via an MCP tool. Narrower interface (specific high-level actions) vs. Phoenix.new's "agent drives CDP directly in the same VM".
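To make "drives CDP directly" concrete, here is a hedged sketch of what raw Chrome DevTools Protocol commands look like on the wire: JSON objects with an `id`, a `method`, and `params`. The methods shown (`Runtime.enable`, `Network.enable`, `Runtime.evaluate`) are real CDP methods, but the choice of them, the `#submit` selector, and the `cdp_command` helper are assumptions for illustration; the post does not enumerate which calls the Phoenix.new agent makes.

```python
import json
from itertools import count

_ids = count(1)  # each CDP command carries a unique integer id


def cdp_command(method, params=None):
    """Serialise one CDP command frame (hypothetical helper)."""
    return json.dumps({
        "id": next(_ids),
        "method": method,
        "params": params or {},
    })


commands = [
    # Subscribe to console and exception events.
    cdp_command("Runtime.enable"),
    # Subscribe to network request and response events.
    cdp_command("Network.enable"),
    # Read one piece of JavaScript-visible state from the page.
    cdp_command("Runtime.evaluate", {
        "expression": "document.querySelector('#submit').disabled",
        "returnByValue": True,
    }),
]

for frame in commands:
    print(frame)
```

In a live session these frames would be sent over the browser's DevTools WebSocket. A higher-level MCP tool would typically hide this layer behind named actions.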

Caveats

  • The post says "agent tools" plural without enumerating which specific browser APIs (DOM query? eval? network interception? console tail?) the agent has access to. The disclosure is at the capability-category level.
  • Context-window cost. A DOM dump usually costs far more context than a screenshot: a full-page DOM on a rich app easily runs to tens of kilobytes of markup, which is thousands of tokens. Real agent prompts likely sample ("just the button's disabled attribute") rather than dumping.
  • CDP driving works great for happy-path CSS / HTML assertions but is clumsy for visual regressions (pixel-level diff of a chart rendering) — screenshots still matter for that slice.
  • Phoenix.new's UI explicitly exposes the browser as a live preview, so the human can watch the agent drive it. That's a UX affordance on top of the agent-tooling posture.
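The context-window caveat can be sketched by comparing a full dump against a single sampled attribute. The page below is synthetic, the `sample` helper is hypothetical, and the standard-library parser stands in for a browser-side query; none of this is taken from the Phoenix.new implementation.

```python
from html.parser import HTMLParser

# Synthetic page: 200 table rows plus the one button the agent cares about.
ROWS = "".join(f"<tr><td>row {i}</td><td>value {i}</td></tr>" for i in range(200))
PAGE = f'<table>{ROWS}</table><button id="save" disabled>Save</button>'


class AttributeProbe(HTMLParser):
    """Looks for one element by id and records whether an attribute is present."""

    def __init__(self, element_id, attribute):
        super().__init__()
        self.element_id = element_id
        self.attribute = attribute
        self.found = False
        self.present = False

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if attr_map.get("id") == self.element_id:
            self.found = True
            self.present = self.attribute in attr_map


def sample(html, element_id, attribute):
    """Return a one-line summary instead of the whole document."""
    probe = AttributeProbe(element_id, attribute)
    probe.feed(html)
    if not probe.found:
        return f"#{element_id} not found"
    return f"#{element_id} {attribute}={probe.present}"


full_dump = PAGE
sampled = sample(PAGE, "save", "disabled")
print(len(full_dump), "characters in the full dump")
print(len(sampled), "characters in the sample:", sampled)
```

The sampled line answers the question the agent actually asked at a small fraction of the size, which is why targeted queries are the likelier pattern in real prompts.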

Seen in
