CONCEPT Cited by 1 source

Self-authored extension

Definition

A self-authored extension is a tool that an AI agent writes for itself at runtime, as source code that is bundled and loaded into a sandbox with an explicit permission declaration, and then calls on subsequent turns as if it were an ordinary tool. Canonical wiki instance: Project Think's ExtensionManager + Dynamic-Worker-hosted extensions, introduced in the 2026-04-15 Project Think launch (Source: Project Think post).

The extension artifact

An extension is a bundle containing TypeScript code + a manifest declaring what capabilities it needs. Project Think's shape:

{
  "name": "github",
  "description": "GitHub integration: PRs, issues, repos",
  "tools": ["create_pr", "list_issues", "review_pr"],
  "permissions": {
    "network": ["api.github.com"],
    "workspace": "read-write"
  }
}

The tools array lists the tool names the extension exposes to the agent loop. These are analogous to MCP tools, but authored by the agent rather than for it.
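The manifest shape can be written down as a TypeScript type with a small structural check. This is a minimal sketch inferred from the single example above, not an official schema: the field names come from the JSON, while `ExtensionManifest`, `isStringArray`, `validateManifest`, and the specific validation rules are hypothetical.

```typescript
// Shape inferred from the single manifest example above; not an official schema.
interface ExtensionManifest {
  name: string;
  description: string;
  tools: string[];
  permissions: {
    network: string[]; // hostnames the extension may reach
    workspace: string; // "read-write" in the example; other values are not documented here
  };
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Hypothetical helper: returns a list of problems, empty when the manifest looks well-formed.
function validateManifest(input: unknown): string[] {
  if (typeof input !== "object" || input === null) {
    return ["manifest must be an object"];
  }
  const candidate = input as Record<string, unknown>;
  const problems: string[] = [];

  const name = candidate["name"];
  if (typeof name !== "string" || name.length === 0) {
    problems.push("name must be a non-empty string");
  }
  if (typeof candidate["description"] !== "string") {
    problems.push("description must be a string");
  }

  const tools = candidate["tools"];
  if (!isStringArray(tools) || tools.length === 0) {
    problems.push("tools must be a non-empty array of strings");
  } else if (tools.some((tool, index) => tools.indexOf(tool) !== index)) {
    problems.push("tools must not contain duplicate names");
  }

  const permissions = candidate["permissions"];
  if (typeof permissions !== "object" || permissions === null) {
    problems.push("permissions must be an object");
  } else {
    const grant = permissions as Record<string, unknown>;
    if (!isStringArray(grant["network"])) {
      problems.push("permissions.network must be an array of hostnames");
    }
    if (typeof grant["workspace"] !== "string") {
      problems.push("permissions.workspace must be a string");
    }
  }
  return problems;
}

const githubManifest: ExtensionManifest = {
  name: "github",
  description: "GitHub integration: PRs, issues, repos",
  tools: ["create_pr", "list_issues", "review_pr"],
  permissions: { network: ["api.github.com"], workspace: "read-write" },
};

const manifestProblems = validateManifest(githubManifest); // empty array for the example manifest
```

Returning a list of problems rather than throwing on the first one is a design choice for this sketch: an agent that authored a bad manifest can see every issue in a single turn.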

Lifecycle

  1. Author. The agent writes the TypeScript implementation (typically in response to a user request) and the manifest.
  2. Bundle. ExtensionManager bundles the code — optionally fetching npm dependencies via @cloudflare/worker-bundler + esbuild.
  3. Load. The bundle is loaded into a Dynamic Worker with only the permissions declared in the manifest.
  4. Register. The manager publishes the new tools to the agent loop. "The next time the user asks about pull requests, the agent has a github_create_pr tool that didn't exist 30 seconds ago."
  5. Persist. The extension lives in the agent's Durable Object storage and survives hibernation — available on the next wake-up without re-authoring.
  6. Revoke (implicit). Removing the extension from DO storage removes the tools.
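The register, persist, and revoke steps above can be sketched with an in-memory model. Everything here is illustrative: `ToyExtensionManager` is a hypothetical stand-in rather than Project Think's ExtensionManager, a plain `Map` stands in for Durable Object storage, bundling and loading are skipped, and the `<extension>_<tool>` naming is an inference from the quoted `github_create_pr` example.

```typescript
interface Manifest {
  name: string;
  tools: string[];
  permissions: { network: string[]; workspace: string };
}

interface StoredExtension {
  manifest: Manifest;
  source: string; // agent-authored TypeScript, kept as text so it can be read back
}

// Hypothetical stand-in for the real manager; models steps 4 to 6 only.
class ToyExtensionManager {
  // Stand-in for Durable Object storage; it outlives any one manager instance.
  private storage: Map<string, StoredExtension>;
  // Registered tool name -> owning extension name (the agent loop's view).
  private registry = new Map<string, string>();

  constructor(storage: Map<string, StoredExtension>) {
    this.storage = storage;
    // Wake-up: re-register everything persisted earlier, with no re-authoring.
    storage.forEach((ext) => this.register(ext.manifest));
  }

  install(manifest: Manifest, source: string): void {
    this.unregister(manifest.name); // drop stale tools if this replaces an older version
    this.storage.set(manifest.name, { manifest, source }); // step 5: persist
    this.register(manifest); // step 4: register
  }

  revoke(name: string): void {
    this.storage.delete(name); // step 6: removing the stored extension...
    this.unregister(name); // ...removes its tools
  }

  toolNames(): string[] {
    const names: string[] = [];
    this.registry.forEach((_owner, tool) => names.push(tool));
    return names.sort();
  }

  private register(manifest: Manifest): void {
    for (const tool of manifest.tools) {
      this.registry.set(manifest.name + "_" + tool, manifest.name);
    }
  }

  private unregister(name: string): void {
    this.registry.forEach((owner, tool) => {
      if (owner === name) this.registry.delete(tool);
    });
  }
}

const doStorage = new Map<string, StoredExtension>();
const beforeHibernation = new ToyExtensionManager(doStorage);
beforeHibernation.install(
  {
    name: "github",
    tools: ["create_pr", "list_issues", "review_pr"],
    permissions: { network: ["api.github.com"], workspace: "read-write" },
  },
  "/* agent-authored TypeScript would go here */",
);

// A new manager over the same storage models the next wake-up.
const afterWakeUp = new ToyExtensionManager(doStorage);
const toolsAfterWakeUp = afterWakeUp.toolNames();
// toolsAfterWakeUp is ["github_create_pr", "github_list_issues", "github_review_pr"]
```

Because the tool registry is rebuilt from storage on construction, storage is the single source of truth in this model, which is why deleting the stored extension is enough to revoke it.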

Why this is different from "agent writes and runs code in a sandbox"

Per-turn code generation (Code Mode) writes a fresh script each turn and throws it away after execution. A self-authored extension persists instead: the agent's capability surface grows across turns and sessions as the model ships new TypeScript tools for itself.

That persistence is what makes the pattern a self-improvement loop, in Cloudflare's framing:

"This is the kind of self-improvement loop that makes agents genuinely more useful over time. Not through fine-tuning or RLHF, but through code. The agent is able to write new capabilities for itself, all in sandboxed, auditable, and revocable TypeScript."

Structural invariants that make it safe (claimed)

  • Capability-model sandbox: the extension runs in a Dynamic Worker with globalOutbound: null, so the permissions block in its manifest is the entire capability grant. See concepts/capability-based-sandbox.
  • Auditable: the extension is stored as source code in DO storage. A list_extensions + read_extension pair of operations makes what the agent can do legible.
  • Revocable: removing the extension removes the capability. There is no running process to kill.
  • Sandboxed execution: the extension runs in the same capability-enforced substrate that the agent's per-turn code uses, with no elevation.
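The "manifest is the entire capability grant" invariant amounts to a default-deny check against the permissions block. The sketch below models that check in isolation. It is an assumption about how such a gate could look, not the Dynamic Worker enforcement mechanism itself: `canReach` and `canWriteWorkspace` are hypothetical names, and treating "read-write" as the only value that permits writes is a guess based on the one documented value.

```typescript
interface Permissions {
  network: string[];
  workspace: string;
}

// Default-deny: a host is reachable only if the manifest lists it exactly.
// Exact matching (rather than substring or suffix matching) keeps lookalike
// hosts such as "api.github.com.evil.example" out.
function canReach(permissions: Permissions, hostname: string): boolean {
  const wanted = hostname.toLowerCase();
  return permissions.network.some((allowed) => allowed.toLowerCase() === wanted);
}

// Hypothetical rule: only the documented "read-write" value permits writes.
function canWriteWorkspace(permissions: Permissions): boolean {
  return permissions.workspace === "read-write";
}

const githubPermissions: Permissions = {
  network: ["api.github.com"],
  workspace: "read-write",
};

const allowedCall = canReach(githubPermissions, "api.github.com"); // true
const blockedCall = canReach(githubPermissions, "api.github.com.evil.example"); // false
```

An empty `network` array denies every host in this model, which mirrors the idea that nothing is reachable unless the manifest grants it.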

Open questions (not resolved in the post)

  • Cross-session accumulation policy. Do extensions the agent wrote in one user's session persist across users, across accounts, or across upgrades of the agent's base prompt? Project Think doesn't say. Because extensions persist in the agent DO's storage, the implied scope is per agent instance (which is per-user-per-task under the one-to-one agent-instance premise).
  • Approval flow. The post describes the capability grant as a manifest entry, not as a user approval. Whether the user explicitly confirms "agent wants network: ['api.github.com'], OK?" is not specified; the credentialed-proxy pattern or an elicitation gate seems the natural composition for high-stakes permissions.
  • Integrity / supply-chain. Extensions can pull npm packages at bundle time — the resulting extension carries transitive dependency code the user never saw. Typosquatting and dependency-confusion attacks are live concerns the post doesn't address.
  • Versioning + conflict. Two extensions might register tools with colliding names; upgrades might change tool semantics under the model. Governance unspecified.
  • Distillation to base model. The loop works "not through fine-tuning or RLHF, but through code", but an extension written once has to be re-written or re-loaded for every new agent instance unless it is shared. Accumulating capabilities beyond a single instance would require a capability store that isn't just per-DO.
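As one possible answer to the approval-flow question above, a host could diff the permissions a manifest requests against what the user has already approved and ask only about the difference. This is purely a sketch of that idea. Nothing in the post describes such a gate, and `unapprovedGrants` with its `network:` / `workspace:` labels is hypothetical.

```typescript
interface Permissions {
  network: string[];
  workspace: string;
}

// Hypothetical gate: lists the grants a manifest requests that the user has
// not approved yet. An empty result means no new confirmation is needed.
function unapprovedGrants(requested: Permissions, approved?: Permissions): string[] {
  const approvedHosts = approved ? approved.network : [];
  const pending: string[] = [];
  for (const host of requested.network) {
    if (approvedHosts.indexOf(host) === -1) {
      pending.push("network:" + host);
    }
  }
  // The ordering of workspace levels is not documented, so any change is
  // treated as needing approval rather than guessing which way is "less".
  if (!approved || approved.workspace !== requested.workspace) {
    pending.push("workspace:" + requested.workspace);
  }
  return pending;
}

const requestedGrant: Permissions = {
  network: ["api.github.com"],
  workspace: "read-write",
};

// First install: nothing has been approved, so every grant is pending.
const firstInstall = unapprovedGrants(requestedGrant);
// firstInstall is ["network:api.github.com", "workspace:read-write"]

// Re-install with identical permissions: nothing new to confirm.
const reinstall = unapprovedGrants(requestedGrant, requestedGrant);
// reinstall is []
```

The returned strings are what an elicitation step would show the user; an upgraded extension that adds a host would surface only that host.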

When the pattern fits

  • Agent products with a long-lived per-user session surface (coding assistants, enterprise dashboard agents, personal research agents) — the marginal cost of writing an extension amortises over many re-uses.
  • Integrations where a wrapping layer around a third-party API is stable enough to re-use — the extension is effectively a cached MCP server with agent-visible source.

When it doesn't

  • Short-lived one-shot agents, where writing an extension costs an extra turn for work that a per-turn code-gen call would have done in-line.
  • Capability-sensitive products where agent-authored code must pass human review before getting permissions — the autonomy of the loop is the thing that's disqualified.

Seen in
