PATTERN Cited by 1 source

Additive capability ladder¶

Pattern¶

Structure an agent's (or any untrusted-code consumer's) execution environment as a ladder of capability tiers, where:

The bottom tier is genuinely useful on its own for a non-trivial fraction of tasks.
Each higher tier is additive — it adds a specific new capability (network access, npm, headless browser, full OS sandbox) without removing anything from below.
State at the bottom is shared upward — files written to the Tier-0 workspace are visible from any higher tier.
Escalation is task-driven, not configured up-front. An agent session may touch multiple tiers as needed.
Each capability grant is explicit and auditable — escalating is a discrete decision, reviewable after the fact.

The pattern treats capability as a ladder rather than a menu: an agent doesn't pick "which environment" up front; it starts at the bottom and escalates as the task demands.
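
The properties above can be sketched as a small data structure. This is a minimal illustration, not Project Think's implementation: `Tier`, `Session`, `EscalationRecord`, `escalate`, and `can` are all hypothetical names chosen for this example.

```typescript
// Hypothetical sketch: none of these names come from a Cloudflare API.
type Tier = 0 | 1 | 2 | 3 | 4;

interface EscalationRecord {
  from: Tier;
  to: Tier;
  reason: string;
  turn: number;
}

class Session {
  // State written at the bottom tier; every higher tier reads the same map.
  readonly workspace = new Map<string, string>();
  // Every capability grant leaves a record that can be reviewed later.
  readonly audit: EscalationRecord[] = [];
  private current: Tier = 0; // every session starts at the bottom
  private turn = 0;

  get tier(): Tier {
    return this.current;
  }

  nextTurn(): void {
    this.turn += 1;
  }

  // Escalation is a discrete, logged decision driven by the task at hand.
  escalate(to: Tier, reason: string): void {
    if (to <= this.current) return; // nothing new to grant
    this.audit.push({ from: this.current, to, reason, turn: this.turn });
    this.current = to;
  }

  // Additive: a capability introduced at `required` stays available at any tier above it.
  can(required: Tier): boolean {
    return required <= this.current;
  }
}

const session = new Session();
session.workspace.set("notes.md", "draft written at Tier 0");
session.nextTurn();
session.escalate(3, "task needs to render a page");
```

After the escalation, `session.can(0)` and `session.can(1)` are still true, the Tier-0 file is still readable, and `session.audit` holds exactly one record explaining why the browser tier was reached.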

Canonical instance: Cloudflare Project Think¶

"This capability model leads naturally to a spectrum of compute environments, an execution ladder that the agent escalates through as needed… The key design principle: the agent should be useful at Tier 0 alone, where each tier is additive." (Source: Project Think launch post.)

Tier	Capability added	Substrate
0	Durable virtual filesystem (read, write, edit, search, grep, diff)	DO SQLite + R2, `@cloudflare/shell`
1	LLM-generated JavaScript in sandboxed isolate, no network	Dynamic Workers + `@cloudflare/codemode`
2	+ `npm` at runtime	`@cloudflare/worker-bundler` + esbuild
3	+ headless browser	Browser Rendering
4	+ full OS sandbox (`git`, `cargo`, test runners)	Cloudflare Sandbox

Tier 4's filesystem is bidirectionally synced with the Tier-0 workspace — progress made at the bottom tier doesn't have to be re-made at the top.

Why additive matters¶

A non-additive "choose an environment" design forces a single up-front decision. If the agent might need a browser on turn 5, the developer provisions Tier 3 for every turn, paying the higher tier's cost even though turns 1-4 only need Tier 0. Worse, it normalises the higher-tier capability as the default, and over time deployments drift toward "all agents run at max tier".

Additive means:

Cost is proportional to need. A turn at Tier 0 pays Tier-0 cost, not Tier-4 cost.
Audit is sharper. Seeing an agent at Tier 3 is a discrete event — the developer or the runtime logged a deliberate escalation.
Capabilities are auditable by their absence. An agent session that never hit Tier 3 demonstrably never opened a browser.
Minimum-useful-tier is the baseline for user trust. "The agent must be useful at Tier 0" is a user-visible property — users who only audit the bottom tier know what the minimum agent can do.
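
The cost and audit arguments can be made concrete with a little arithmetic. The sketch below reuses the five-turn scenario from this section (turns 1-4 need Tier 0, turn 5 needs the browser). The per-turn prices are invented units for illustration only, not published pricing, and all variable names are hypothetical.

```typescript
type Tier = 0 | 1 | 2 | 3 | 4;

// Assumed, made-up cost units per turn at each tier.
const costPerTurn: Record<Tier, number> = { 0: 1, 1: 2, 2: 4, 3: 20, 4: 50 };

// Tier each turn actually needed: only turn 5 needs a browser (Tier 3).
const tiersNeeded: Tier[] = [0, 0, 0, 0, 3];

// Ladder: each turn pays for the tier it used.
const ladderCost = tiersNeeded.reduce((sum, t) => sum + costPerTurn[t], 0);

// Up-front choice: every turn is provisioned at the highest tier any turn needs.
const maxTier = tiersNeeded.reduce<Tier>((m, t) => (t > m ? t : m), 0);
const fixedCost = tiersNeeded.length * costPerTurn[maxTier];

// Audit by absence: no turn in this session reached the full OS sandbox.
const neverUsedFullOs = !tiersNeeded.some((t) => t >= 4);

console.log({ ladderCost, fixedCost, neverUsedFullOs });
// ladderCost = 1 + 1 + 1 + 1 + 20 = 24; fixedCost = 5 * 20 = 100
```

Under these assumed prices the ladder session costs roughly a quarter of the fixed Tier-3 session, and the list of tiers used doubles as the evidence that Tier 4 was never touched.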

Prerequisites¶

A capability-based substrate at each tier (no ambient authority; capabilities granted by binding). See concepts/capability-based-sandbox.
Shared state at the bottom that higher tiers can reach without tier-switching. Project Think's DO-SQLite-plus-R2 workspace at Tier 0 is the canonical instance.
Explicit escalation APIs: the agent (or developer) writes `createBrowserTools(env.BROWSER)` to add Tier 3; without that call there is no Tier 3.
A genuinely useful Tier 0. If the bottom tier is a nothing-environment, agents will default to the top and the ladder collapses.
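
A minimal sketch of the "no call, no capability" prerequisite follows. It mirrors the shape of the `createBrowserTools(env.BROWSER)` call mentioned above but does not use the real API: `Tool`, `BrowserBinding`, `workspaceTools`, `browserTools`, and `highestTier` are hypothetical stand-ins defined locally so the example is self-contained.

```typescript
// Hypothetical stand-ins; the real Project Think signatures may differ.
interface BrowserBinding {
  kind: "browser-binding";
}

interface Tool {
  name: string;
  tier: number;
  binding?: BrowserBinding; // the capability handle the tool was built from, if any
}

// Tier 0: filesystem-style tools that need no extra binding.
function workspaceTools(): Tool[] {
  return ["read", "write", "edit", "grep"].map((name) => ({ name, tier: 0 }));
}

// Tier 3: can only be constructed by someone who holds the browser binding.
function browserTools(binding: BrowserBinding): Tool[] {
  return [{ name: "browse", tier: 3, binding }];
}

// The highest tier an agent can reach is determined entirely by what was registered.
function highestTier(tools: Tool[]): number {
  return tools.reduce((max, t) => Math.max(max, t.tier), 0);
}

const browserBinding: BrowserBinding = { kind: "browser-binding" };

// Developer A never calls browserTools, so the agent has no path to Tier 3.
const tier0Only: Tool[] = [...workspaceTools()];

// Developer B opts in explicitly; the call site is the auditable grant.
const withBrowser: Tool[] = [...workspaceTools(), ...browserTools(browserBinding)];

console.log(highestTier(tier0Only), highestTier(withBrowser)); // 0 3
```

Because the browser tools can only be built from the binding, there is no ambient way for the first agent to reach a browser: the grant lives in code that a reviewer can find.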

Design axes¶

Tier count and spacing. Too few tiers and each jump is too large; too many and the ladder becomes ceremony. Project Think's five tiers map to five qualitatively different capability classes (durable filesystem → sandboxed code without network → npm at runtime → headless browser → full OS).
State-sharing policy. Fully-shared (Tier 4 sees the Tier-0 workspace) vs tier-local. Project Think chooses fully-shared; the sync has operational cost but enables multi-tier tasks.
Which tier is the default. Project Think's `Think` base class wires in Tier 0 + Tier 1 by default; Tiers 2-4 are opt-in via explicit tool-set registration.
Escalation authority. Who decides to move up: the model (autonomous), the developer at deploy time (static), or a gated approval flow (governance)? Project Think is currently developer-static at the binding layer.
Billing + quota per tier. Higher tiers are more expensive; the ladder framing makes per-tier budgets a natural metric.
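
Two of these axes, escalation authority and per-tier budgets, combine naturally into a single policy check. The sketch below is one possible shape under stated assumptions: the three authority modes come from the list above, while `EscalationPolicy`, `mayUseTier`, `spendTurn`, and the budget numbers are hypothetical.

```typescript
type Authority = "autonomous" | "static" | "gated";

interface EscalationPolicy {
  authority: Authority;
  staticMaxTier: number;               // consulted when authority is "static"
  approvedTiers: ReadonlySet<number>;  // consulted when authority is "gated"
  remainingTurns: number[];            // per-tier budget, index = tier
}

// Decide whether a turn may run at `tier` under the given policy.
function mayUseTier(policy: EscalationPolicy, tier: number): boolean {
  const remaining = policy.remainingTurns[tier] ?? 0;
  if (remaining <= 0) return false; // budget exhausted, or tier not configured
  switch (policy.authority) {
    case "autonomous":
      return true; // the model decides
    case "static":
      return tier <= policy.staticMaxTier; // fixed by the developer at deploy time
    case "gated":
      return tier === 0 || policy.approvedTiers.has(tier); // needs an approval on file
  }
}

// Charge one turn against the tier's budget if the policy allows it.
function spendTurn(policy: EscalationPolicy, tier: number): boolean {
  if (!mayUseTier(policy, tier)) return false;
  const remaining = policy.remainingTurns[tier] ?? 0;
  policy.remainingTurns[tier] = remaining - 1;
  return true;
}

// Developer-static policy capped at Tier 1, with illustrative budgets.
const staticPolicy: EscalationPolicy = {
  authority: "static",
  staticMaxTier: 1,
  approvedTiers: new Set<number>(),
  remainingTurns: [100, 20, 5, 2, 1],
};

// Gated policy where only the browser tier has been approved.
const gatedPolicy: EscalationPolicy = {
  authority: "gated",
  staticMaxTier: 0,
  approvedTiers: new Set<number>([3]),
  remainingTurns: [100, 20, 5, 2, 1],
};
```

Here `mayUseTier(staticPolicy, 3)` is false regardless of budget, while `spendTurn(gatedPolicy, 3)` succeeds twice and then fails once the Tier-3 budget of 2 is used up.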

Examples beyond Project Think¶

The pattern generalises — any product that exposes untrusted or semi-trusted code execution can benefit:

Notebook products (Jupyter, Colab): sandboxed kernel at Tier 0, package install at Tier 1, internet access at Tier 2, GPU at Tier 3, shell at Tier 4.
CI systems: ephemeral filesystem at Tier 0, language runtime at Tier 1, dependency cache at Tier 2, container access at Tier 3, privileged Docker at Tier 4.
Developer sandboxes: read-only code at Tier 0, language REPL at Tier 1, network at Tier 2, package manager at Tier 3, shell at Tier 4.

Most existing products pick one "tier" and call it the environment. The ladder framing is a re-organisation, not a new substrate.

When it fits¶

Agent / automated-code-execution products where tasks vary widely in required capability.
Multi-tenant compute where different users / plans warrant different tiers.
Environments where audit-trail of capability use is security-relevant.

When it doesn't¶

Tasks that all fall into one tier — ladder overhead isn't justified.
Environments where tier switches are too expensive (state not shared, restart required) — the escalation cost collapses the gradualism premise.

Seen in¶

sources/2026-04-15-cloudflare-project-think-building-the-next-generation-of-ai-agents — canonical wiki instance. Names the ladder concept, gives the five-tier realisation, states the "useful at Tier 0" principle.

concepts/execution-ladder — the concept-page articulation of the same shape.
concepts/capability-based-sandbox — the per-tier enforcement substrate.
concepts/least-privileged-access — the security principle this pattern realises at multi-tier granularity.
systems/project-think / systems/dynamic-workers / systems/cloudflare-browser-rendering / systems/cloudflare-sandbox-sdk / systems/cloudflare-r2 — the per-tier implementations.
patterns/credentialed-proxy-sandbox — orthogonal pattern (credential-space), composes with the ladder (capability-space).
companies/cloudflare — operator.