Introducing Agent Lee - a new interface to the Cloudflare stack¶
Summary¶
Cloudflare launched Agent Lee, an in-dashboard AI assistant that understands a user's Cloudflare account and can both troubleshoot and apply changes across the entire platform (DNS, Workers, SSL/TLS, R2, Registrar, Cache, Cloudflare Tunnel, API Shield, and more) from a single natural-language prompt. At launch the post reports ~18,000 daily users and ~250,000 tool calls per day after an active beta, positioning it as a production retrospective on agentic-UI architecture, not a vaporware announcement. The architectural reveal is three-part: (1) instead of handing MCP tool definitions to the model, Agent Lee uses Code Mode to convert the tools into a TypeScript API and asks the model to write code against it, which yields better accuracy (LLMs have seen far more TypeScript than tool-call examples) and fewer round trips for multi-step tasks; (2) generated code executes against Cloudflare's own MCP server but is routed through a Durable Object that acts as a credentialed proxy: it inspects the generated code, classifies it as read or write, proxies reads directly, and blocks writes until the user approves via an elicitation gate (API keys never appear in the generated code; they are held inside the DO and injected server-side); (3) the UI itself is generated: responses come back not only as text but as interactive charts, tables, architecture maps, and more, rendered on an adaptive grid the user can carve up and fill with natural-language component requests. Built on the same Agents SDK / Workers AI / Durable Objects / MCP infrastructure Cloudflare ships to customers — an explicit Cloudflare dogfooding shape alongside the 2026-04-20 internal AI engineering stack post.
Key takeaways¶
- Agent Lee is in production at measurable scale: ~18,000 daily users and ~250,000 tool calls per day across DNS, Workers, SSL/TLS, R2, Registrar, Cache, Cloudflare Tunnel, API Shield, and more. Free-plan users can access it from Ask AI in the Cloudflare dashboard. (Source: "~18,000 daily users, executing ~250k tool calls per day across DNS, Workers, SSL/TLS, R2, Registrar, Cache, Cloudflare Tunnel, API Shield, and more.")
- Code Mode replaces direct MCP-tool prompting. "Rather than presenting MCP tool definitions directly to the model, Agent Lee uses Code Mode to convert the tools into a TypeScript API and asks the model to write code that calls it instead." Two rationales are explicit in the post: (a) LLMs have seen much more real-world TypeScript than tool-call examples → higher accuracy; (b) multi-step tasks collapse into a single generated script whose intermediate results stay inside the sandbox — the model skips the tool-call-and-observe round-trips (see patterns/code-generation-over-tool-calls).
- A Durable Object is the enforcement layer, not the sandbox. The generated code goes to an upstream Cloudflare MCP server for sandboxed execution, but it travels through a DO that acts as a credentialed proxy. Before any call leaves, the DO classifies the generated code as read or write by inspecting the method and body. Read operations are proxied directly. Write operations are blocked until explicit approval through an elicitation gate. "The security boundary isn't just a sandbox that gets thrown away; it's a permission architecture that structurally prevents writes from happening without your approval." (See patterns/credentialed-proxy-sandbox, concepts/elicitation-gate.)
- API keys are never present in the generated code. Keys are held inside the Durable Object and injected server-side when the upstream call is made. The sandbox that executes the code never sees the credentials — it couldn't exfiltrate them if it wanted to. This is the structural difference between "sandboxed execution" (a container that gets torn down) and "credentialed proxy" (a credential boundary that code physically can't cross).
- The elicitation step is the gate, not a UX courtesy. "Write operations go through an elicitation system that surfaces the approval step before any code executes. Agent Lee cannot skip this step. The permission model is the enforcement layer, and the confirmation prompt you see is not a UX courtesy. It's the gate." The model cannot skip it by construction because the DO makes the read/write classification in proxy code the sandbox cannot see or modify (see concepts/elicitation-gate, patterns/credentialed-proxy-sandbox).
- The MCP surface is two tools, not 3,000. Cloudflare's own MCP server exposes just two tools to Agent Lee: "a search tool for querying API endpoints and an execute tool for writing code that performs API requests." Consistent with patterns/tool-surface-minimization — all ~3,000 Cloudflare HTTP API operations are reachable through the execute tool via Code Mode without inflating the model's context window with per-op tool schemas. Same context-cost logic surfaced in the 2026-04-13 CLI post (~3,000 ops fit into <1,000 tokens of MCP Code Mode context).
- Responses are rich UI, not just text. "As your dialogue evolves, the platform dynamically generates UI components alongside textual responses." Supported blocks include dynamic tables, interactive charts, architecture maps, and more. Users carve out space on an adaptive grid and describe what they want to see; the agent renders real data (e.g. an interactive 24-hour error-rate chart from the user's actual traffic) instead of navigating to a separate Analytics page. See patterns/dynamic-ui-generation.
- Dogfooded on customer primitives. Built on Agents SDK, Workers AI, Durable Objects, and Cloudflare's own MCP infrastructure — "We didn't build internal tools that aren't available to you — instead we built it with the same Cloudflare lego blocks that you have access to." Every limitation hit becomes a platform fix; every pattern that works becomes documentation. Companion to the internal AI stack post — that one covers Cloudflare's internal developer-agent stack, this one covers Cloudflare's public customer-facing agent stack.
- Quality + safety posture. Continuous measurement via: (a) evals for conversation success rate + information accuracy; (b) thumbs up/down signals; (c) tool-call-execution-success + LLM hallucination scorers; (d) per-product conversation-quality breakdown. "These systems help us improve Agent Lee over time while keeping users in control."
- Named roadmap beyond the dashboard. The post articulates three future axes: (a) surface agnosticism — Agent Lee as the interface to Cloudflare from dashboard, CLI, phone; (b) proactivity — "rather than waiting to be asked, it watches what matters to you… and reaches out when something warrants attention"; (c) accumulated context — "what you've asked before, what page you're on, what you were debugging last week." "An agent that only responds is useful. One that notices things first is something different."
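The Code Mode takeaway above is easier to see in code. A minimal sketch, assuming a hypothetical generated TypeScript facade (none of these type or method names are Cloudflare's actual API): a multi-step task becomes one script, and the intermediate record list never leaves the sandbox instead of being round-tripped through tool-call/observe cycles.

```typescript
// Hypothetical shape of the TypeScript API that Code Mode might generate
// from MCP tool definitions. All names here are illustrative.
type DnsRecord = { id: string; name: string; type: string; content: string };

interface CloudflareApi {
  dns: {
    listRecords(zoneId: string): Promise<DnsRecord[]>;
    updateRecord(zoneId: string, id: string, patch: Partial<DnsRecord>): Promise<DnsRecord>;
  };
}

// A multi-step task the model can express as ONE generated script.
// The intermediate result (`records`) stays inside the sandbox; with
// per-operation tool calls, each step would be a model round trip.
async function pointAllARecordsTo(api: CloudflareApi, zoneId: string, ip: string): Promise<number> {
  const records = await api.dns.listRecords(zoneId);
  const targets = records.filter(r => r.type === "A" && r.content !== ip);
  for (const r of targets) {
    await api.dns.updateRecord(zoneId, r.id, { content: ip });
  }
  return targets.length; // number of records changed
}
```

In the direct-tool-calling style, the filter step alone would force the full record list back through the model's context; here it is just a local variable.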
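The credentialed-proxy and elicitation-gate bullets describe one control flow, sketched below with entirely hypothetical names. The post says classification inspects both method and body; this sketch keys off the HTTP method only. The point is structural: the key lives on the proxy side, writes cannot reach upstream without approval, and the sandboxed code never holds either the key or the gate logic.

```typescript
// Minimal sketch of a credentialed proxy with an elicitation gate.
// Not Cloudflare's implementation; every name here is an assumption.
type Verb = "GET" | "HEAD" | "POST" | "PUT" | "PATCH" | "DELETE";

// Read/write classification happens in proxy code the sandbox cannot
// see or modify. (The real system also inspects the request body.)
function classify(method: Verb): "read" | "write" {
  return method === "GET" || method === "HEAD" ? "read" : "write";
}

interface ProxyDeps {
  apiKey: string; // held inside the DO, never present in generated code
  fetchUpstream: (method: Verb, path: string, auth: string) => Promise<string>;
  askUser: (summary: string) => Promise<boolean>; // the elicitation gate
}

async function proxyCall(deps: ProxyDeps, method: Verb, path: string): Promise<string> {
  if (classify(method) === "write") {
    const approved = await deps.askUser(`${method} ${path}`);
    if (!approved) throw new Error("write rejected by user");
  }
  // Credential injection happens here, on the proxy side of the boundary:
  // the sandbox that produced this call never sees the key.
  return deps.fetchUpstream(method, path, `Bearer ${deps.apiKey}`);
}
```

Because `classify` and `askUser` run in the proxy, a model that "decides" to skip confirmation has no code path to do so — the gate is enforcement, not UX.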
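The two-tool surface from the takeaways can be sketched as follows (illustrative shapes only, not Cloudflare's actual MCP server): only `search` and `execute` need tool schemas in the model's context, while the ~3,000 operations live in a catalog the model queries on demand.

```typescript
// Sketch of a two-tool surface: the model's context pays for two tool
// schemas, not one per API operation. Names are hypothetical.
type Endpoint = { op: string; method: string; path: string; doc: string };

class TwoToolServer {
  constructor(private catalog: Endpoint[]) {}

  // Tool 1: query the API catalog. Only matching endpoint descriptions
  // ever enter the context window, and only when asked for.
  search(query: string): Endpoint[] {
    const q = query.toLowerCase();
    return this.catalog.filter(
      e => e.op.toLowerCase().includes(q) || e.doc.toLowerCase().includes(q)
    );
  }

  // Tool 2: run model-written code with the full catalog in scope
  // (stand-in for the Code Mode sandbox).
  async execute<T>(script: (catalog: Endpoint[]) => Promise<T>): Promise<T> {
    return script(this.catalog);
  }
}
```

This is the same context-cost logic the CLI post describes: the catalog can grow without growing the prompt.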
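The generated-UI bullet reduces to the agent returning typed blocks alongside text, which the adaptive grid then renders. A minimal sketch with hypothetical block kinds (the post names dynamic tables, interactive charts, and architecture maps; the schema below is invented for illustration):

```typescript
// Hypothetical discriminated union for generated UI blocks. A renderer
// can switch on `kind` and TypeScript checks the cases are exhaustive.
type UiBlock =
  | { kind: "text"; markdown: string }
  | { kind: "table"; columns: string[]; rows: string[][] }
  | { kind: "chart"; title: string; points: { t: number; v: number }[] };

// Stand-in for a grid renderer: summarize what each block would show.
function describe(block: UiBlock): string {
  switch (block.kind) {
    case "text": return "text block";
    case "table": return `table (${block.columns.length} cols, ${block.rows.length} rows)`;
    case "chart": return `chart "${block.title}" (${block.points.length} points)`;
  }
}
```

The "interactive 24-hour error-rate chart" example from the post would arrive as a `chart` block populated with the user's real traffic data, not as a link to the Analytics page.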
Systems / concepts / patterns extracted¶
- Systems: systems/agent-lee, systems/code-mode, systems/model-context-protocol, systems/cloudflare-durable-objects, systems/cloudflare-workers, systems/cloudflare-ai-gateway
- Concepts: concepts/elicitation-gate, concepts/agent-context-window, concepts/tool-selection-accuracy
- Patterns: patterns/code-generation-over-tool-calls, patterns/credentialed-proxy-sandbox, patterns/dynamic-ui-generation, patterns/tool-surface-minimization
Operational numbers¶
- ~18,000 daily users on Agent Lee during the active beta.
- ~250,000 tool calls / day executed by the agent on behalf of users.
- 2 MCP tools exposed to the agent (search + execute) instead of the full ~3,000-operation Cloudflare API surface directly.
- Free-plan availability — Agent Lee ships to every Cloudflare customer tier.
Caveats¶
- The post is a launch + architecture article, not a deep postmortem. Latency numbers, per-tool success rates, and approval-prompt abandonment rates are not disclosed.
- The Code Mode link in the post points at a separate earlier Cloudflare Code Mode write-up (blog.cloudflare.com/code-mode/) — treated in this wiki as the canonical Code Mode source; details of the code-sandbox substrate (isolate reuse, timeout model, cross-request contamination) come from that post, not this one.
- The post carefully distinguishes the sandbox boundary (what the generated code is allowed to observe) from the credential boundary (what it can ever access) — the credential boundary is load-bearing, the sandbox is additional defence-in-depth. Readers who collapse the two misread the security posture.
- "Beta … you may encounter unexpected limitations or edge cases" — the team expects production calibration to continue; treat today's numbers as a snapshot.
- The post notes that Cloudflare plans a CLI surface for Agent Lee next; as of publication the agent lives only in the dashboard.
Source¶
- Original: https://blog.cloudflare.com/introducing-agent-lee/
- Raw markdown:
raw/cloudflare/2026-04-15-introducing-agent-lee-a-new-interface-to-the-cloudflare-stac-5174dae2.md
Related¶
- sources/2026-04-20-cloudflare-internal-ai-engineering-stack — internal-developer-agent counterpart; same primitives (AI Gateway + Workers AI + MCP) but with a different enforcement layer (proxy Worker + AGENTS.md + multi-agent code reviewer) targeted at employee code-authoring vs customer dashboard ops.
- sources/2026-04-13-cloudflare-building-a-cli-for-all-of-cloudflare — explains how Cloudflare's ~3,000-operation API surface fits into <1,000 MCP tokens via Code Mode; Agent Lee is the consuming application of that compression.
- sources/2026-01-29-cloudflare-moltworker-self-hosted-ai-agent — sibling reference-architecture-as-blog-post: Moltworker ports an external Docker AI agent to Cloudflare primitives; Agent Lee is Cloudflare's own first-party agent built on the same substrate.