PATTERN Cited by 4 sources
Tool-surface minimization¶
Tool-surface minimization is the discipline of keeping the number of tools an agent sees small, because (a) tool-calling accuracy degrades as the tool inventory grows (arXiv 2411.15399 cited by Datadog) and (b) every tool's description consumes context window budget. "Just turn every API endpoint into a tool" does not scale (Source: sources/2026-03-04-datadog-mcp-server-agent-tools).
Datadog uses three tactics together, each with distinct trade-offs:
1. Flexible tools¶
Rather than one tool per API endpoint, design tools whose schema covers multiple use cases. Requires careful schema design; the payoff is one well-designed tool replacing several narrow ones.
- Pair with patterns/query-language-as-agent-tool — a single SQL tool subsumes many get_X_by_Y endpoints.
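A minimal sketch of the idea, with hypothetical tool and metric names (not Datadog's actual API): one flexible tool whose schema spans the use cases that "one tool per endpoint" would spread across four narrow tools.

```python
# Hypothetical flexible-tool schema. Names are illustrative, not Datadog's API.

NARROW_TOOLS = [  # what "one tool per endpoint" would produce
    "get_latency_by_service",
    "get_errors_by_service",
    "get_latency_by_host",
    "get_errors_by_host",
]

FLEXIBLE_TOOL = {
    "name": "query_metrics",
    "description": "Query any metric, grouped by any dimension.",
    "input_schema": {
        "type": "object",
        "properties": {
            "metric": {"type": "string", "enum": ["latency", "errors"]},
            "group_by": {"type": "string", "enum": ["service", "host"]},
            "window": {"type": "string", "default": "1h"},
        },
        "required": ["metric", "group_by"],
    },
}

def covers(tool, narrow_name):
    """Check that a narrow endpoint's use case fits the flexible schema."""
    _, metric, _, dim = narrow_name.split("_")
    props = tool["input_schema"]["properties"]
    return metric in props["metric"]["enum"] and dim in props["group_by"]["enum"]

# One name + one description in the budget, instead of four.
assert all(covers(FLEXIBLE_TOOL, n) for n in NARROW_TOOLS)
```

The agent now resolves one tool name instead of choosing among four, which is where both the accuracy and the context savings come from.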
2. Toolsets¶
A core (default) toolset covers common workflows and loads automatically on agent connection. Opt-in toolsets expose specialized workflows when the user enables them.
- Cost: users must anticipate what the agent will need ahead of time. Wrong forecast → agent lacks a capability at the moment it needs it.
- Datadog's documented toolsets API is the product surface.
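The mechanism can be sketched as a registry where only enabled sets consume description budget — toolset and tool names here are hypothetical, not Datadog's documented API:

```python
# Hypothetical toolset registry: the core set loads on connect; specialized
# sets are opt-in. Illustrative names only.

TOOLSETS = {
    "core": ["search_logs", "query_metrics", "list_monitors"],  # default
    "incidents": ["create_incident", "update_incident"],        # opt-in
    "synthetics": ["run_synthetic_test"],                       # opt-in
}

def loaded_tools(enabled=()):
    """Only the core set plus explicitly enabled sets cost budget."""
    names = list(TOOLSETS["core"])
    for toolset in enabled:
        names.extend(TOOLSETS[toolset])
    return names

# Default connection: 3 tool descriptions, not 6.
assert loaded_tools() == ["search_logs", "query_metrics", "list_monitors"]
# A user who forecast incident work enables that set up front:
assert "create_incident" in loaded_tools(enabled=("incidents",))
```

The forecast cost is visible in the code: if the user never enabled "incidents", `create_incident` simply isn't there when the agent reaches for it.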
3. Layering (tool chaining)¶
A discovery tool ("how do I accomplish X?") answers capability questions and returns the name/identifier of a second-tier tool that actually does the work. The second-tier tool is not front-loaded into the description budget until the discovery tool points at it.
- Named precedent: Block's "Build MCP tools like ogres with layers".
- Cost: +1 tool call of latency per task — a previously 1-call task becomes 2. Meaningful on interactive agent sessions.
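A two-tier sketch of the layering flow, with hypothetical tool names (the second-tier tool is stubbed; only the discovery tool would be described up front):

```python
# Hypothetical two-tier layering. Only discover() sits in the front-loaded
# description budget; second-tier tools are named on demand.

SECOND_TIER = {
    "rotate application keys": "rotate_app_keys",
    "mute a monitor": "mute_monitor",
}

def discover(task):
    """Tier 1: answers 'how do I accomplish X?' with a tool identifier."""
    return SECOND_TIER.get(task, "no specialized tool; use core toolset")

def call_tool(name, **kwargs):
    """Tier 2: the tool that does the work (stubbed here)."""
    return f"called {name} with {kwargs}"

# The latency cost is visible: a previously 1-call task is now 2 calls.
tool = discover("mute a monitor")
result = call_tool(tool, monitor_id=42)
assert tool == "mute_monitor"
```

This is the "ogres with layers" shape: the description budget pays for one discovery tool instead of the whole specialized inventory.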
How the tactics compose¶
| Tactic | Savings target | Cost |
|---|---|---|
| Flexible tools | Both accuracy (fewer names) + context (fewer descriptions) | Schema-design complexity |
| Toolsets | Context (only loaded tools cost budget) | Requires user forecast |
| Layering | Context (specialized tools hidden behind discovery) | +1 call latency |
Decaying constraint?¶
Client-side features soften this pressure without removing it: Claude Code's tool search avoids loading every tool up front; Claude skills and Kiro Powers let agents load specialized knowledge on demand. How skills and MCP compose is "still an open question" per the Datadog post. Structural minimization remains the default; client features widen the budget, they don't eliminate the discipline.
Seen in¶
- sources/2026-03-04-datadog-mcp-server-agent-tools — all three tactics are used together on Datadog's MCP server.
- sources/2025-11-17-dropbox-how-dash-uses-context-engineering-for-smarter-ai — Dropbox Dash confirms the pattern with a different tactic: index-side consolidation (patterns/unified-retrieval-tool). Many app-specific retrieval tools (Confluence / Google Docs / Jira / …) collapsed into one retrieval tool backed by the unified Dash Search Index; the same discipline ships outward as systems/dash-mcp-server for Claude / Cursor / Goose. Named failure mode before minimization: "analysis paralysis" (concepts/tool-selection-accuracy) — the model spent compute on picking a retrieval tool instead of acting. Reinforces the Datadog observation that tool-inventory growth degrades tool-selection accuracy; adds unified-retrieval-tool as a fourth tactic alongside flexible tools / toolsets / layering when the underlying domain is cross-source retrieval.
- sources/2026-01-28-dropbox-knowledge-graphs-mcp-dspy-dash — Josh Clemm's companion talk gives the pattern its Dash-native name — "super tool" — and enumerates Dash's four-part MCP-scale minimization bundle: (1) one super-tool over the index (patterns/unified-retrieval-tool); (2) knowledge-graph-derived digests to cut result tokens (patterns/precomputed-relevance-graph); (3) store tool results locally rather than inline them into the context window — a new lever not in the Datadog 5-pattern or 2025-11-17 3-principle lists; (4) classifier-routed sub-agents with narrow tool sets (patterns/specialized-agent-decomposition). Quantitative framing: simple MCP queries "up to 45 seconds" vs "within seconds" against the raw index; Dash caps its context at ~100k tokens.
- sources/2025-11-06-flyio-you-should-write-an-agent — Fly.io (Thomas Ptacek) gives the first-principles argument for the pattern at the minimal-agent layer: "You're allotted a fixed number of tokens in any context window. Each input you feed in, each output you save, each tool you describe, and each tool output eats tokens." Frames the degradation as "nondeterministically stupider" past a token threshold. Canonical statement of concepts/context-window-as-token-budget. Points at context-segregated sub-agents (patterns/context-segregated-sub-agents) as the natural composition move when one context's tool surface can no longer be minimized further — spawn a child context with a different tool subset. Complements the Datadog/Dash mitigations: Datadog and Dash minimize within one agent; Fly.io adds the across-agents lever (split the tool surface across sub-agents so no single context carries the full inventory).
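The across-agents lever can be sketched as follows — role and tool names are hypothetical, and a real agent would spawn a fresh context rather than build a dict:

```python
# Minimal sketch of the Fly.io across-agents lever: split the tool surface
# so no single context carries the full inventory. Illustrative names only.

FULL_INVENTORY = {
    "retrieval": ["search_docs", "search_code"],
    "ops": ["restart_app", "scale_app"],
}

def spawn_sub_agent(role):
    """Child context sees only its role's tool subset (and descriptions)."""
    return {"role": role, "tools": list(FULL_INVENTORY[role])}

# A monolithic agent pays for 4 descriptions; each child pays for 2.
full_budget = sum(len(tools) for tools in FULL_INVENTORY.values())
child = spawn_sub_agent("retrieval")
assert len(child["tools"]) < full_budget
assert "restart_app" not in child["tools"]
```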