CONCEPT Cited by 4 sources
One-to-one agent instance¶
Definition¶
The one-to-one agent instance is the structural observation that an AI agent, unlike a traditional request handler or web service, cannot sensibly multi-tenant: each session is its own task, with its own memory, tool history, credentials, and intermediate artifacts. "Agents are one-to-one." The deployment unit is one agent instance per (user, task, conversation), not one shared process serving many users.
"A restaurant has a menu and a kitchen optimized to churn out dishes at volume. An agent is more like a personal chef: different ingredients, different techniques, different tools every time." (Source: Cloudflare Project Think.)
Why it breaks traditional scaling math¶
Conventional web applications serve N users per instance where N is hundreds to millions. Horizontal scaling adds instances as peak concurrency grows but total instance count stays far below user count.
Agents invert the ratio: one instance per user, often per open task. At 100 M knowledge workers × modest concurrency that's tens of millions of simultaneous sessions — "at current per- container costs, that's unsustainable. We need a different foundation."
The post's worked example:
"10,000 agents, each active 1% of the time: VMs / Containers — 10,000 always-on instances. Durable Objects — ~100 active at any moment."
The active count stays small; the addressable count scales with users.
Required substrate properties¶
For one-to-one agents to be economic, the runtime must give:
- Per-agent isolation — memory, state, credentials don't leak across sessions. Agent-to-agent mischief is structurally impossible.
- Zero idle cost — hibernated agents don't bill. "$0 until the agent wakes up." Requires runtime-managed hibernation + wake-on-event semantics rather than keep-the-process-alive.
- Automatic scaling per-agent — no provisioning, no capacity pre-allocation, no sticky-session logic. Spawning agent N+1 is a function call, not a capacity plan.
- Built-in state — each agent has its own addressable identity + persistent store reachable without an external DB. Otherwise the 10,000-agents-per-user shape implodes into external-connection fan-out.
- Platform-managed recovery — when the host crashes, the platform restarts the agent with state intact. The application doesn't run its own process manager.
- Identity / routing —
name → agentis the API; no load balancers, no session affinity, no routing protocol.
The concepts/actor-model is the canonical fit. Project Think's substrate — Durable Objects — delivers all six properties:
| Property | Durable Objects |
|---|---|
| Idle cost | Zero (hibernated) |
| Scaling | Automatic, per-agent |
| State | Built-in SQLite |
| Recovery | Platform-restart, state survives |
| Identity / routing | name → agent, built-in |
| Concurrency | Single-writer per key, serialised |
Distinguishing from serverless functions¶
Serverless functions (AWS Lambda,
Workers) also
scale to zero and start per-request.
But a Lambda function has no durable identity — request N+1
may hit a different process with no memory of request N. An
actor-based agent runtime routes every message addressed to
"agent:alice-session-1234" to the same instance, wherever the
platform put it, with full state intact.
A one-to-one agent is Lambda + addressable identity + embedded state — that composition is an actor, not a serverless function.
Contrast with multi-user AI APIs¶
An LLM-inference endpoint (POST /v1/chat/completions) is a
classical high-concurrency web service — stateless, many users per
instance. The agent wrapping that endpoint is the one-to-one
unit; the inference call inside is a shared backend. The agent
holds memory, workspace, conversation tree; the inference endpoint
is stateless throughput.
Cost implications¶
- Serving cost scales with active concurrency, not user count — assuming true hibernation with per-wakeup billing.
- Storage cost scales with persistent state per agent — memory blocks, session trees, workspace files. SQLite per-agent means per-agent cost, not fleet cost.
- Developer cost: writing for the one-to-one shape requires getting comfortable with "talking to a specific agent instance" as a first-class primitive. Teams coming from stateless-RPC- thinking often under-use the addressability and over-use a shared DB.
Implications for architecture¶
- Credential scope — one agent holds the credentials for its one user. Cross-user confusion is impossible by construction.
- Personalization — persistent memory is the default, not the bolted-on feature; the agent's SQLite is the "user profile".
- Observability — per-agent traces replace per-request sampling. An agent has a history, not an event stream.
- Debugging — you open the agent, not the log. Deterministic single-writer execution reduces heisen-bugs.
- Failure isolation — one agent's crash doesn't take down others, unlike shared-process architectures.
Seen in¶
- sources/2026-04-15-cloudflare-project-think-building-the-next-generation-of-ai-agents — canonical wiki instance. Framed as the load-bearing scaling premise behind the entire Project Think substrate. "That fundamentally changes the scaling math." The 10K-agents-at-1% VM-vs-DO comparison table in the post quantifies the gap.
- sources/2026-04-16-cloudflare-artifacts-versioned-storage-that-speaks-git — extended to the storage tier: one DO per Git repo per agent session. Same "millions of instances, most idle, few active at once" economics applied to versioned storage. Cloudflare states this directly: "we want our pricing to work at agent-scale: it needs to be cost effective to have millions of repos, unused (or rarely used) repos shouldn't be a drag, and our pricing should match the massively-single-tenant nature of agents." See concepts/repo-per-agent-session, systems/cloudflare-artifacts.
- sources/2026-04-16-cloudflare-email-service-public-beta-ready-for-agents
— extended to the email-channel tier: one DO per
email-address-resolved agent instance. The address is the
DO-ID selector —
support@domain/support+ticket-123@domaineach route to a distinct DO. Same one-to-one shape applied to the email channel. See concepts/address-based-agent-routing - patterns/sub-addressed-agent-instance.
Related¶
- systems/project-think — Cloudflare's SDK built on this scaling premise.
- systems/cloudflare-durable-objects — the substrate that delivers the six required properties.
- concepts/actor-model — the programming-model primitive one- to-one agents specialise.
- concepts/scale-to-zero — the idle-cost property.
- concepts/serverless-compute — the sibling multi-user primitive; one-to-one agents are actors on top of it, not a replacement.
- concepts/task-and-actor-model — Ray's actor / task formulation; same distinction at the distributed-compute layer.
- concepts/stateless-compute — the complementary counter-property; one-to-one agents are explicitly not this.