Project Think

Project Think (published 2026-04-15 as @cloudflare/think) is Cloudflare's "next generation of the Agents SDK" — a set of primitives for building long-running AI agents on top of Cloudflare's existing developer platform, plus an opinionated base class Think that wires them together. It ships in preview (Source: Project Think launch post).

Thesis

"Agents are one-to-one." A traditional app serves many users from one instance; an agent doesn't — "a restaurant has a menu and a kitchen optimized to churn out dishes at volume. An agent is more like a personal chef." 100 M knowledge workers × modest concurrency = tens of millions of simultaneous sessions at per-container cost — "unsustainable." Project Think's substrate — Durable Objects + Dynamic Workers + hibernation + SQLite + capability-model sandbox — is the architectural answer to that scaling premise. See concepts/one-to-one-agent-instance.

Six primitives

Each is usable directly with the plain Agent base class; Think just wires them together.

1. Durable execution with fibers

runFiber("name", async (ctx) => { … }) registers a durable function invocation in SQLite before execution begins; it is checkpointable at any point via ctx.stash({ … }) and recoverable on restart via onFiberRecovered. The SDK keeps the agent alive automatically while a fiber runs. keepAlive() / keepAliveWhile() prevent eviction during minute-scale work; for hour-to-day operations the agent persists a job ID, hibernates, and wakes on callback.

Worked example from the post:

export class ResearchAgent extends Agent {
  async startResearch(topic: string) {
    void this.runFiber("research", async (ctx) => {
      const findings = [];
      for (let i = 0; i < 10; i++) {
        const result = await this.callLLM(`Research step ${i}: ${topic}`);
        findings.push(result);
        ctx.stash({ findings, step: i, topic });   // checkpoint
        this.broadcast({ type: "progress", step: i });
      }
      return { findings };
    });
  }
  async onFiberRecovered(ctx) {
    if (ctx.name === "research" && ctx.snapshot) {
      const { topic } = ctx.snapshot;
      await this.startResearch(topic);
    }
  }
}

See concepts/durable-execution, patterns/checkpoint-resumable-fiber.

2. Sub-agents via Facets

this.subAgent(ChildClass, "name") returns a child DO colocated with the parent via Facets. Each sub-agent gets its own isolated SQLite and execution context; there's no implicit sharing of data between them. TypeScript catches misuse at compile time. "Sub-agent RPC latency is a function call."

const researcher = await this.subAgent(ResearchAgent, "research");
const reviewer = await this.subAgent(ReviewAgent, "review");
const [research, review] = await Promise.all([
  researcher.search(task),
  reviewer.analyze(task)
]);

See patterns/colocated-child-actor-rpc.

3. Persistent Sessions — tree-structured messages

SessionManager.create(this) returns a session whose messages are stored as a tree; each message has a parent_id. Supports:

  • Forking (explore an alternative without losing the original path): this.sessions.fork(session.id, messageId, "alternative").
  • Non-destructive compaction — summarise older messages; full history remains in SQLite.
  • Full-text search across conversation history via FTS5. The agent itself can query past sessions via a built-in search_context tool.

See patterns/tree-structured-conversation-memory.
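The parent_id mechanics can be sketched in plain TypeScript; MessageTree and its methods are illustrative stand-ins, not @cloudflare/think APIs:

```typescript
// Minimal sketch of a parent_id message tree: forking reuses the shared
// prefix, so the original path is never copied or destroyed.
interface Message {
  id: number;
  parentId: number | null;
  content: string;
}

class MessageTree {
  private messages = new Map<number, Message>();
  private nextId = 1;

  append(parentId: number | null, content: string): number {
    const id = this.nextId++;
    this.messages.set(id, { id, parentId, content });
    return id;
  }

  // Forking is just appending under an earlier message;
  // sibling branches share all ancestor messages.
  fork(fromMessageId: number, content: string): number {
    return this.append(fromMessageId, content);
  }

  // Walk parent pointers to reconstruct one linear conversation path.
  path(leafId: number): string[] {
    const out: string[] = [];
    let m = this.messages.get(leafId);
    while (m) {
      out.unshift(m.content);
      m = m.parentId === null ? undefined : this.messages.get(m.parentId);
    }
    return out;
  }
}
```

With this shape, forking at a message yields two leaves whose path() calls share a prefix, which is the non-destructive property the SDK's fork() provides.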

4. Sandboxed code execution

Dynamic Workers provide the sandbox: LLM-generated JavaScript runs in a fresh V8 isolate with globalOutbound: null (no ambient authority). @cloudflare/codemode is the layer built on top of Dynamic Workers; the Cloudflare API MCP server demonstrates it at scale — two tools (search(), execute()) consume ~1,000 tokens vs ~1.17M tokens for the naive tool-per-endpoint equivalent, a 99.9% reduction. @cloudflare/worker-bundler fetches packages from npm at runtime, bundles them with esbuild, and loads the result into the Dynamic Worker — the agent writes import { z } from "zod" and it just works. See systems/code-mode, patterns/code-generation-over-tool-calls.
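A minimal sketch of the two-tool shape, independent of the SDK. The endpoint list, search, and execute below are illustrative stand-ins; a real sandbox would run the code in a fresh isolate with globalOutbound: null, not via new Function:

```typescript
// Sketch of the search()/execute() pattern: instead of one tool schema per
// API endpoint, the model searches docs and writes code against a client.
const apiDocs = [
  { name: "zones.list", doc: "List zones for the account." },
  { name: "dns.createRecord", doc: "Create a DNS record in a zone." },
];

// Tool 1: return only the docs relevant to the query (small token cost,
// regardless of how many endpoints exist).
function search(query: string) {
  const q = query.toLowerCase();
  return apiDocs.filter((d) => (d.name + " " + d.doc).toLowerCase().includes(q));
}

// Tool 2: run model-written code against an injected client object.
// Illustration only: a real implementation isolates the code instead.
function execute(code: string, client: Record<string, (...args: any[]) => any>): any {
  return new Function("client", `"use strict"; return (${code})(client);`)(client);
}
```

One execute() call can chain many endpoint invocations, which is why the token cost stays flat while the naive tool-per-endpoint surface grows with the API.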

5. The execution ladder

Tiered capability escalation — "the agent should be useful at Tier 0 alone, where each tier is additive."

Tier | Capability | Powered by
0 | Durable virtual filesystem (read, write, edit, search, grep, diff) | DO SQLite + R2, @cloudflare/shell
1 | LLM-generated JavaScript, no network | Dynamic Workers + @cloudflare/codemode
2 | + npm at runtime (import { z } from "zod" works) | @cloudflare/worker-bundler + esbuild
3 | + headless browser | Browser Rendering
4 | + full OS sandbox: git clone, npm test, cargo build | Cloudflare Sandbox

See concepts/execution-ladder, patterns/additive-capability-ladder.
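The additive property can be sketched as a simple superset relation; the tier list and capability strings below are illustrative, not SDK names:

```typescript
// Sketch of the additive ladder: each tier is the previous tier's
// capabilities plus one addition, so Tier 0 is always a usable floor.
const tierAdditions: string[][] = [
  ["filesystem"],        // Tier 0: durable virtual filesystem
  ["run-js"],            // Tier 1: LLM-written JS, no network
  ["npm-at-runtime"],    // Tier 2: runtime npm resolution
  ["headless-browser"],  // Tier 3: Browser Rendering
  ["os-sandbox"],        // Tier 4: full OS sandbox
];

// Capabilities at tier N = union of all additions up to and including N.
function capabilitiesAt(tier: number): string[] {
  return tierAdditions.slice(0, tier + 1).flat();
}
```

Escalating a tier never removes anything the agent could already do, which is what makes the ladder safe to climb incrementally.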

6. Self-authored extensions

The agent writes its own TypeScript tool in a Dynamic Worker and declares permissions ({network: ["api.github.com"], workspace: "read-write"}); ExtensionManager bundles it (optionally with npm dependencies), loads it into a Dynamic Worker, and registers the new tools. Extensions persist in DO storage and survive hibernation. "The next time the user asks about pull requests, the agent has a github_create_pr tool that didn't exist 30 seconds ago." See concepts/self-authored-extension.
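A sketch of the permission manifest and the kind of check a loader could apply before granting network access; only the manifest shape ({network, workspace}) comes from the post, and allowsFetch is hypothetical:

```typescript
// Permission manifest an extension declares up front; the loader enforces
// it structurally rather than trusting the extension's behavior.
interface ExtensionPermissions {
  network: string[];                 // allowed hostnames
  workspace: "read" | "read-write"; // filesystem access level
}

// Hypothetical check: a fetch is permitted only if the target hostname
// was declared in the manifest. Everything else is denied by default.
function allowsFetch(perms: ExtensionPermissions, url: string): boolean {
  return perms.network.includes(new URL(url).hostname);
}
```

The point is deny-by-default: an extension that declared only api.github.com cannot reach any other host, no matter what code the model wrote.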

The Think base class

A minimal subclass is enough to get a working durable chat agent:

import { Think } from "@cloudflare/think";
import { createWorkersAI } from "workers-ai-provider";

export class MyAgent extends Think<Env> {
  getModel() {
    return createWorkersAI({ binding: this.env.AI })(
      "@cf/moonshotai/kimi-k2.5"
    );
  }
}

That gives streaming, persistence, abort/cancel, error handling, resumable streams, and a built-in workspace filesystem. Deploy with npx wrangler deploy.

Overridable hooks

Hook | Purpose
getModel() | Return the LanguageModel to use
getSystemPrompt() | System prompt
getTools() | AI-SDK-compatible ToolSet for the agentic loop
maxSteps | Max tool-call rounds per turn
configureSession() | Context blocks, compaction, search, skills
beforeTurn() | Turn-start hook
beforeToolCall() / afterToolCall() | Per-tool wrappers
onStepFinish() | Per-step summary
onChatResponse() | Turn-end hook

Per-turn agentic loop (from the post):

beforeTurn()
  → streamText()
    → beforeToolCall()
    → afterToolCall()
  → onStepFinish()
→ onChatResponse()
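The ordering can be mimicked in plain TypeScript; the hook names match the table above, but the loop body itself is illustrative, not the SDK's implementation:

```typescript
// Sketch of the per-turn loop: records the order in which the overridable
// hooks fire for a turn whose streamText() call takes `steps` tool-calling
// steps. Hooks are optional, mirroring the "override only what you need" API.
interface TurnHooks {
  beforeTurn?: () => void;
  beforeToolCall?: () => void;
  afterToolCall?: () => void;
  onStepFinish?: () => void;
  onChatResponse?: () => void;
}

function runTurn(hooks: TurnHooks, steps: number): string[] {
  const order: string[] = [];
  const fire = (name: keyof TurnHooks) => {
    hooks[name]?.();
    order.push(name);
  };
  fire("beforeTurn");
  // streamText() runs here; each tool-calling step it yields fires:
  for (let step = 0; step < steps; step++) {
    fire("beforeToolCall");
    fire("afterToolCall");
    fire("onStepFinish");
  }
  fire("onChatResponse");
  return order;
}
```

A two-step turn thus fires eight hooks: one turn-start, three per step, one turn-end.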

Context blocks

Structured system-prompt sections the model can read + update over time, persisted across hibernation. The model sees token accounting inline — "MEMORY (Important facts, use set_context to update) [42%, 462/1100 tokens]" — and can proactively remember things.

configureSession(session: Session) {
  return session
    .withContext("soul", {
      provider: { get: async () => "You are a helpful coding assistant." }
    })
    .withContext("memory", {
      description: "Important facts learned during conversation.",
      maxTokens: 2000
    })
    .withCachedPrompt();
}
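The inline accounting line can be sketched as a simple formatter; the bracket format mirrors the example quoted above, but the function itself is illustrative (a real implementation would use the model's tokenizer, not a count passed in):

```typescript
// Sketch of the token-accounting header a context block shows the model,
// e.g. "MEMORY (Important facts, use set_context to update) [42%, 462/1100 tokens]".
function renderContextHeader(
  name: string,
  description: string,
  usedTokens: number,
  maxTokens: number
): string {
  const pct = Math.round((usedTokens / maxTokens) * 100);
  return `${name.toUpperCase()} (${description}) [${pct}%, ${usedTokens}/${maxTokens} tokens]`;
}
```

Surfacing the budget inline is what lets the model decide on its own when a block is filling up and needs a set_context update or compaction.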

Tool wiring

import { createWorkspaceTools } from "@cloudflare/think/tools/workspace";
import { createExecuteTool } from "@cloudflare/think/tools/execute";
import { createBrowserTools } from "@cloudflare/think/tools/browser";
import { createSandboxTools } from "@cloudflare/think/tools/sandbox";
import { createExtensionTools } from "@cloudflare/think/tools/extensions";

A single getTools() returns the whole ladder wired in.

Sub-agent via RPC

Think works as a sub-agent too — called via chat() over RPC from a parent, with streaming events delivered via callback. Each child gets its own conversation tree, memory, tools, and model.

Relationship to the existing Agents SDK

  • Think adds to the existing Agents SDK; nothing is deprecated.
  • The plain Agent base class remains usable on its own — all six primitives are consumable from Agent directly. Think is opinionated wiring.
  • Think speaks the same WebSocket protocol as @cloudflare/ai-chat. Existing AIChatAgent clients don't change.

Relationship to Agent Lee

Agent Lee (sources/2026-04-15-cloudflare-introducing-agent-lee) is the first-party customer-facing product Cloudflare runs on today's Agents SDK, at 18K DAU / 250K tool calls / day. Project Think is the next-generation platform Cloudflare is already using internally to build background-agent infrastructure. Treat the two posts as bracketing the agent posture: here's an agent we ran + here's the platform for you to run yours.

Three waves framing

"The first wave was chatbots." Stateless, reactive, fragile. "The second wave was coding agents." Stateful + tool-using but local + single-user + no durability. "Now we are entering the third wave: agents as infrastructure." Durable, distributed, structurally safe, serverless — enforce security through architecture, not behavior.

Project Think's substrate is the explicit bet on that third wave.

Caveats

  • Preview. "API surface is stable but will continue to evolve in the coming days and weeks." Names likely to change.
  • No production-scale numbers. This is a platform-primitives article, not a retrospective; it gives none of the DAU / throughput / reliability figures the same-day Agent Lee launch post does.
  • 99.9% token reduction is measured against the naive alternative (every endpoint as its own tool schema), not against a hand-crafted minimised tool surface. See patterns/tool-surface-minimization for the realistic baseline.
  • Fiber-vs-Workflow relationship implicit. Workflows remains the top-level orchestration tier; runFiber() appears to be the agent-loop-scoped durable-execution primitive co-located in the agent DO.
  • Runtime-npm resolution security posture undiscussed. LLM-written import statements + live registry fetch expose typosquatting / supply-chain surface the post doesn't cover.
