PATTERN Cited by 1 source

Code Mode MCP for data agent

A pattern for exposing a data agent's tool surface to its LLM that replaces the standard "30 individual MCP tools, one round-trip per call" shape with two meta-tools (search, execute) plus a JavaScript runtime. The model writes a JavaScript snippet that calls the data agent's full toolset programmatically; the snippet runs in a sandboxed isolate and returns the final result.

This is the application of Cloudflare's Code Mode to the data-agent domain specifically — turning multi-step analytical workflows (search datasets → describe → start query → fetch results → render chart) into a single snippet, executed in one round-trip.

Cloudflare Skipper is the canonical wiki instance, from the Town Lake / Skipper launch post.

The architectural argument

Standard MCP-tool pattern:

"This is fine, but it is chatty: a five-tool workflow is five model round-trips, each of which has to re-establish context."

Code Mode pattern:

"For our MCP server, we use Code Mode. Instead of defining 30 individual tools, we expose two: search and execute. The model writes a JavaScript snippet that calls our entire toolset programmatically:

const datasets = await skipper.search_datasets({ query: "billing product revenue" })
const queryId = await skipper.start_query({ sql: "SELECT ..." })
const results = await skipper.fetch_results({ queryId, mode: "inject" })
return skipper.create_chart({ chartType: "bar", data: results.rows, ... })

That JavaScript runs in a sandboxed Dynamic Worker isolate via WorkerLoader. The model gets to express complex multi-step workflows in a single round-trip, in a language it already knows extremely well. It's faster, it's cheaper, and the workflows it produces are auditable as code."
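To make the quoted snippet concrete, the following is a minimal sketch that runs the same four-step chain against a hypothetical in-memory `skipper` stub. The method names come from the quote; the return shapes, the dataset id, and the SQL text are assumptions made only so the example executes.

```javascript
// Hypothetical in-memory stand-in for the data agent's tool API.
// Method names follow the quoted snippet; return shapes are assumptions.
const skipper = {
  async search_datasets({ query }) {
    return [{ id: "billing.revenue", matched: query }];
  },
  async start_query({ sql }) {
    return "q-1"; // assumed: the call resolves to a query id
  },
  async fetch_results({ queryId, mode }) {
    return {
      queryId,
      mode,
      rows: [
        { product: "A", revenue: 120 },
        { product: "B", revenue: 80 },
      ],
    };
  },
  async create_chart({ chartType, data }) {
    return { chartType, points: data.length };
  },
};

// The whole workflow is one async function body: each step feeds the next.
async function workflow(skipper) {
  const datasets = await skipper.search_datasets({ query: "billing product revenue" });
  const queryId = await skipper.start_query({
    sql: `SELECT product, revenue FROM "${datasets[0].id}"`, // illustrative SQL only
  });
  const results = await skipper.fetch_results({ queryId, mode: "inject" });
  return skipper.create_chart({ chartType: "bar", data: results.rows });
}
```

Calling `workflow(skipper)` resolves to the chart descriptor produced by the last step; the intermediate values never leave the function, which is the property the quote relies on.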

Three structural benefits

| Benefit | Mechanism |
| --- | --- |
| Faster | Five-step workflow → one model round-trip |
| Cheaper | Token cost: two-tool schema vs 30-tool schemas; round-trip cost: 1× vs N× |
| Auditable | The full workflow is captured as a single code artefact, not split across N tool-call traces |

The auditability claim is non-trivial. With per-tool MCP, each call is logged separately and reconstructing the agent's full intent requires stitching the trace. With Code Mode, the snippet is the intent — read it once, see the entire plan.
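The round-trip arithmetic can be illustrated with a small simulation. This is a sketch under simplifying assumptions: the tool stubs are hypothetical, and a "round-trip" is counted as one consultation of the model before a tool call (a real agent loop may add further turns).

```javascript
// Hypothetical tool stubs; each returns a value the next step consumes.
const tools = {
  search_datasets: async () => "billing.revenue",
  start_query: async (dataset) => `q:${dataset}`,
  fetch_results: async (queryId) => [{ queryId, revenue: 120 }],
  create_chart: async (rows) => ({ chartType: "bar", points: rows.length }),
};
const steps = ["search_datasets", "start_query", "fetch_results", "create_chart"];

// Per-tool shape: the model is consulted before every call,
// so N steps cost N round-trips.
async function perToolRun() {
  let roundTrips = 0;
  let value;
  for (const name of steps) {
    roundTrips += 1; // the model picks the next tool and its arguments
    value = await tools[name](value);
  }
  return { roundTrips, value };
}

// Code Mode shape: the model is consulted once and emits the whole chain as code.
async function codeModeRun() {
  const roundTrips = 1; // one model turn produces the snippet below
  const snippet = async (t) => {
    const dataset = await t.search_datasets();
    const queryId = await t.start_query(dataset);
    const rows = await t.fetch_results(queryId);
    return t.create_chart(rows);
  };
  return { roundTrips, value: await snippet(tools) };
}
```

Both functions produce the same final value; only the number of model consultations differs, and in the second case the `snippet` function is itself the single artefact an auditor would read.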

Two-tool meta-surface

| Tool | Role |
| --- | --- |
| search | Browse / find: returns identifiers the model can pass to execute |
| execute | Run code: accepts a JavaScript snippet that calls the data agent's underlying tool API |

The search / execute decomposition mirrors the discover-then-act shape of analytical work. The data agent's real tool surface (search_datasets, get_entity_details, start_query, fetch_results, create_chart, build_dashboard, check_access, etc.) is reachable from inside the execute runtime.
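One way to picture the two-tool surface is a small registry with a `search` function over it and an `execute` function that binds the registry to a `skipper` object. This is a sketch, not the Skipper implementation: the source does not say what `search` indexes, so tool names are assumed here, and the registry entries, descriptions, and tool bodies are hypothetical. `AsyncFunction` stands in for the sandboxed isolate and provides no isolation.

```javascript
// Hypothetical registry: names come from the list above, descriptions and bodies are made up.
const registry = {
  search_datasets: {
    description: "find datasets by keyword",
    run: async ({ query }) => [`dataset:${query}`],
  },
  start_query: {
    description: "start a SQL query",
    run: async ({ sql }) => `q-${sql.length}`,
  },
  check_access: {
    description: "check dataset access",
    run: async () => true,
  },
};

// Meta-tool 1: search returns identifiers (here, tool names) the snippet can call.
function search(keyword) {
  return Object.entries(registry)
    .filter(([name, tool]) => name.includes(keyword) || tool.description.includes(keyword))
    .map(([name]) => name);
}

// Meta-tool 2: execute runs a snippet with the full toolset bound to `skipper`.
// AsyncFunction is NOT a sandbox; it only stands in for the isolate in this sketch.
const AsyncFunction = Object.getPrototypeOf(async function () {}).constructor;

async function execute(code) {
  const skipper = Object.fromEntries(
    Object.entries(registry).map(([name, tool]) => [name, tool.run])
  );
  return new AsyncFunction("skipper", code)(skipper);
}
```

A model would call `search("query")` to discover `start_query`, then pass a snippet such as `return skipper.start_query({ sql: "SELECT 1" })` to `execute`. The model-facing schema stays at two tools regardless of how many entries the registry holds.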

Sandboxed Dynamic Worker isolate via WorkerLoader

The runtime substrate is Dynamic Workers — Cloudflare's V8-isolate-based sandbox primitive. WorkerLoader is the API that loads the agent-generated JavaScript into a fresh isolate per execution.
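The "fresh isolate per execution" idea can be approximated locally with Node.js's built-in `vm` module. This is only an analogy: it is not the WorkerLoader API, `node:vm` is not a security boundary, and the `runInFreshContext` helper and its one-second cap are assumptions of this sketch.

```javascript
const vm = require("node:vm");

// Run one agent-generated snippet in a brand-new context.
// node:vm is used for illustration only; it is not a security boundary
// and it is not the WorkerLoader API.
async function runInFreshContext(code, skipper) {
  // A new global object per execution: nothing survives from earlier snippets.
  const context = vm.createContext({ skipper });
  // Wrap the snippet so it can use `await` and `return`.
  const wrapped = `(async () => { ${code} })()`;
  // `timeout` caps synchronous run time in milliseconds.
  return vm.runInContext(wrapped, context, { timeout: 1000 });
}
```

Because each call builds its own context, a global assigned by one snippet is not visible to the next, which mirrors the per-execution isolation described above.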

Why isolates over containers:

  • Boot time: milliseconds vs seconds.
  • Memory: a few MB vs hundreds of MB.
  • Scale envelope: tens of thousands of concurrent agents.

The trade-off (canonicalised at concepts/isolate-vs-microvm-for-agent-sandbox): isolates lose some Linux-tooling fidelity but gain orders-of-magnitude better agent-density economics. For data-agent workloads — JavaScript glue between API calls — isolates are the right substrate.

Composes with "tool overlap is poison"

Code Mode is the architectural extreme of "every tool has a single reason to exist" — applied at the meta-level. Instead of 30 specialised tools, expose two meta-tools that compose all 30. Each underlying tool can still have its own focused contract; the agent doesn't pick from 30 at runtime, it composes from 30 at code-gen time.

This is the same lesson at two scales:

  • Per-domain: collapse three fetch_results_* tools into one with a mode parameter.
  • Across-domain: collapse 30 tools into two meta-tools plus a JavaScript runtime.
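The per-domain collapse can be sketched as a single function with a `mode` switch. Only the `"inject"` mode name appears in the source; the `"preview"` and `"count"` modes, the result shapes, and the in-memory `store` are invented here purely for illustration.

```javascript
// Hypothetical result store; in practice rows would come from the query engine.
const store = {
  "q-1": [
    { product: "A", revenue: 120 },
    { product: "B", revenue: 80 },
  ],
};

// One tool with a mode parameter instead of three overlapping fetch_results_* tools.
// "inject" appears in the source; "preview" and "count" are invented for this sketch.
function fetch_results({ queryId, mode = "inject" }) {
  const rows = store[queryId];
  if (!rows) throw new Error(`unknown queryId: ${queryId}`);
  switch (mode) {
    case "inject":
      return { rows }; // all rows handed back to the caller
    case "preview":
      return { rows: rows.slice(0, 1) }; // first row only
    case "count":
      return { rowCount: rows.length };
    default:
      throw new Error(`unsupported mode: ${mode}`);
  }
}
```

The caller now chooses a behaviour through one argument rather than choosing among near-duplicate tools, so there is a single contract to document and a single name for the model to reach for.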

When this pattern fits

  • Multi-step workflows — the data agent typically chains 3–10 tool calls per question; each call's input depends on the previous output. Code Mode collapses the chain.
  • Stable underlying APIs — the JavaScript snippet calls a TypeScript-typed underlying API; if the API churns frequently, the agent's code-gen has to keep up.
  • Audit-sensitive workflows — when "what did the agent do?" needs to be answerable as a single readable artefact.

When this pattern doesn't fit

  • Single-tool tasks — for "just look up X", the per-tool round-trip cost is irrelevant; Code Mode's overhead exceeds the saving.
  • Streaming results — Code Mode runs to completion in a sandbox; if the user wants to see partial results as the workflow progresses, per-tool calls with streaming output are better.
  • Long-running synchronous calls — sandbox runtime caps and the requirement to hold the round-trip open make this awkward.
