SYSTEM Cited by 13 sources
Cloudflare Workers¶
Cloudflare Workers is Cloudflare's serverless JavaScript / WebAssembly compute tier, deployed across 330+ Cloudflare POPs. Workers runs user code inside V8 isolates — lightweight in-process sandboxes that start in milliseconds — rather than in per-request containers or VMs, which is what drives the platform's fleet density and economics.
Isolate model¶
- One V8 isolate per Worker (per script, per account). Code stays loaded in the isolate; requests are dispatched to a warm isolate when one exists, and a new isolate is minted on first use if not.
- Default isolate memory limit: 128 MB. (Compare to Lambda's 128 MB–10 GB range per function; Workers' density assumption is a lot of small isolates, not a few big ones.)
- CPU-time billing, not wall-clock. Time spent waiting on external I/O or on another request's CPU in the same isolate is not billed as CPU. Pricing: $0.072/hr globally (as of 2025-10-14).
Warm-isolate routing (2025-10 update)¶
Cloudflare's router prefers sending traffic to warm isolates of the requested Worker to minimize cold-start latency (see concepts/warm-isolate-routing). Before the 2025-10 revision, the heuristic was optimized for I/O-heavy workloads at fleet scale. When a burst of CPU-bound requests arrived at a single isolate (the shape Theo Browne's cf-vs-vercel-bench benchmark generated), later requests would queue behind the in-flight CPU-heavy one, inflating client-observed latency. The heuristic then spun up new isolates to compensate, but not as quickly as an all-CPU-bound workload needed.
Post-fix (rolled out globally, 2025-10):
- Detects sustained CPU-heavy work earlier.
- Biases routing so new isolates spin up faster when CPU load is the bottleneck.
- I/O-bound workloads still coalesce onto warm isolates (the original optimization target).
- CPU-bound workloads are now spread so they don't block each other.
"I/O-bound workloads coalesce into individual already warm isolates while CPU-bound are directed so that they do not block each other." (Source: sources/2025-10-14-cloudflare-unpacking-cloudflare-workers-cpu-performance-benchmarks)
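The pre- vs post-fix routing behaviour can be sketched as a toy decision function. Everything here (the fleet model, field names, the spawn policy) is a hypothetical illustration, not Cloudflare's internal router:

```js
// Toy model of warm-isolate routing (hypothetical names; the real
// router is internal to Cloudflare). Each isolate tracks whether it
// is currently executing CPU-heavy work.
function routeRequest(isolates, kind) {
  // I/O-bound requests coalesce onto any warm isolate: while one
  // request awaits I/O, others can use the CPU in the same isolate.
  if (kind === "io") {
    const warm = isolates.find((i) => i.warm);
    if (warm) return warm;
  }
  // CPU-bound requests avoid isolates already doing CPU work, since
  // they would queue behind the in-flight request.
  const idle = isolates.find((i) => i.warm && !i.cpuBusy);
  if (idle) return idle;
  // Otherwise spin up a fresh isolate (the post-fix bias: spawn
  // faster when CPU load is the bottleneck).
  const fresh = { id: isolates.length, warm: true, cpuBusy: false };
  isolates.push(fresh);
  return fresh;
}

const fleet = [{ id: 0, warm: true, cpuBusy: true }];
const a = routeRequest(fleet, "io");  // coalesces onto isolate 0
const b = routeRequest(fleet, "cpu"); // spawns isolate 1 instead of queueing
```

The point of the sketch is the asymmetry: I/O work shares a warm isolate, CPU work triggers growth rather than queueing.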
V8 GC tuning (2025-10 update)¶
Workers carried a manually tuned V8 young-generation cap dating from June 2017 (when the project was two months old), based on V8's then-current guidance for environments with ≤ 512 MB of memory. V8's GC has changed dramatically since, and the cap was making GC run harder and more often than necessary on 2025-era V8.
Fix: removed the manual tuning and let V8 pick young-space size via its own internal heuristics.
- ~25 % improvement on the benchmark workload.
- Small memory-usage uplift.
- All Workers benefit, not just the benchmarked one — though for most Workers the improvement is much smaller.
See concepts/v8-young-generation for the underlying knob.
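For a concrete look at the knob in question: Node.js, another V8 embedder, exposes V8's generational heap spaces, including the young generation ("new_space") whose size V8 now picks for Workers via its own heuristics. A minimal inspection sketch:

```js
// List V8's heap spaces from Node.js. "new_space" is the young
// generation (the space Workers once capped manually); "old_space"
// holds objects that survive young-generation collections.
import v8 from "node:v8";

const spaces = v8.getHeapSpaceStatistics().map((s) => s.space_name);
// Typically includes "new_space", "old_space", "code_space", etc.
```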
Framework / adapter integration¶
Workers is a plausible target for many JS frameworks via OpenNext (Next.js), Remix adapters, and direct Workers-API code. The 2025-10-14 benchmark retrospective surfaced a long tail of allocation / buffering inefficiencies in the OpenNext Cloudflare adapter and in the upstream Next.js / React render pipeline — most of which Cloudflare began chipping away at via upstream PRs (see sources/2025-10-14-cloudflare-unpacking-cloudflare-workers-cpu-performance-benchmarks).
Node.js compatibility¶
Workers has evolved from a partial, shim-heavy Node.js environment to broad native support for `node:*` built-ins. Two consequences on the wiki:
- Playwright on Workers: the Browser Rendering Playwright adapter historically had to mock the filesystem using memfs, drifting from upstream. Native `node:fs` replaced the mock, letting the adapter track upstream with fewer patches.
- NPM compat at scale: Cloudflare ran an experiment against the top 1,000 NPM packages; after filtering out build-, CLI-, and browser-only packages, only 15 (1.5%) genuinely fail to run natively on Workers. Results page: worksonworkers.southpolesteve.workers.dev. This compatibility is what enables Workers as a first-class middleware / integration runtime for stacks built against Node.
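A minimal sketch of what the migration unlocks, runnable under plain Node; under the `nodejs_compat` flag the same `node:*` imports are intended to resolve natively on Workers rather than through a memfs mock:

```js
// Code written against node:fs runs unchanged on Node, and the same
// imports are the ones nodejs_compat now resolves natively on Workers.
import { writeFileSync, readFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

const path = join(tmpdir(), "workers-compat-demo.txt");
writeFileSync(path, "hello from node:fs");
const text = readFileSync(path, "utf8");
```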
Workers as middleware adapter¶
An increasingly common Cloudflare-architected shape is the Worker as middleware adapter: a thin entrypoint Worker owns auth, routing, secrets injection, and platform-API shape-translation for an unmodified application running behind it (in a container, a third-party service, or a separate Workers project). Node.js compatibility + Sandbox SDK + AI Gateway + R2 + Zero Trust Access compose into a standard stack for this shape — see Moltworker and internal AI engineering stack.
Streams: Workers-specific escape hatches¶
Workers implements Web Streams as its primary streaming API (Node streams work via the `nodejs_compat` layer). Making Web streams fast at Workers scale has required runtime-specific, spec-adjacent optimizations:
- `IdentityTransformStream` — a Workers-specific fast path for pass-through transforms that bypasses most of the TransformStream controller/queue machinery.
- `tee()` shared-buffer implementation — Workers diverges from the naive spec implementation (two branches → linked-list per-branch buffers with O(n) memory growth). Workers uses a shared-buffer model where backpressure is signaled by the slowest consumer rather than the fastest, avoiding the memory cliff the spec default would produce.
- Internal pipeline promise elision — Snell's internal fix to one Workers data pipeline reduced the number of JS promises created by up to 200×, yielding "several orders of magnitude improvement in performance".
These optimizations land in the "non-observable" parts of the spec — implementations are free to differ as long as observable behaviour matches. The 2026-02-27 post argues this is unsustainable across runtimes: "When you find yourself needing to relax or bypass spec semantics just to achieve reasonable performance, that's a sign something is wrong with the spec itself." See sources/2026-02-27-cloudflare-a-better-streams-api-is-possible-for-javascript for the full critique and systems/new-streams for Snell's proposed alternative.
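The difference between the two `tee()` backpressure policies can be sketched with a toy retained-memory model (illustrative only, not Workers' actual implementation):

```js
// Toy model of how many chunks a tee()'d stream retains in memory.
// "fastest" backpressure (the naive spec shape): the producer keeps
// pulling as long as the fast branch wants data, so the slow
// branch's private queue grows with the stream. "slowest" (the
// shared-buffer model): production stalls once the buffer is
// highWaterMark chunks ahead of the slowest reader.
function retainedChunks(totalChunks, fastRead, slowRead, policy, highWaterMark) {
  if (policy === "fastest") {
    const produced = Math.min(totalChunks, fastRead + highWaterMark);
    return produced - slowRead; // grows O(n) with stream length
  }
  const produced = Math.min(totalChunks, slowRead + highWaterMark);
  return produced - slowRead;   // bounded by highWaterMark
}

// 100,000 chunks, fast branch fully caught up, slow branch at 10:
const naive  = retainedChunks(100000, 100000, 10, "fastest", 16); // 99990 buffered
const shared = retainedChunks(100000, 100000, 10, "slowest", 16); // capped at 16
```

Under the spec default the retained buffer scales with stream length; under slowest-consumer backpressure it is bounded by the high-water mark, which is the memory cliff the shared-buffer model avoids.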
Scale / positioning¶
- 330+ cities worldwide (vs centralised compute platforms that concentrate compute in a few US-east / US-west regions).
- Shared runtime with Cloudflare's broader edge stack (Pingora proxy fabric, WAF, AI Gateway).
- Sister tier: Dynamic Workers for per-request isolate-spawning (agent-generated code, Sandbox SDK) — distinct from the general Workers tier described here.
Seen in¶
- sources/2025-10-14-cloudflare-unpacking-cloudflare-workers-cpu-performance-benchmarks — canonical wiki instance of Workers' isolate-routing + V8-GC tuning surfaces (the two knobs Cloudflare re-tuned post-benchmark) plus the OpenNext / Next.js / React adapter-layer inefficiencies the benchmark exposed.
- sources/2026-01-29-cloudflare-moltworker-self-hosted-ai-agent — Workers as middleware-adapter runtime. The Moltworker entrypoint Worker fronts a Sandbox SDK-hosted Moltbot container and proxies LLM calls through AI Gateway, browser calls through Browser Rendering, filesystem through R2, and auth through Zero Trust Access. The NPM-compat 1.5%-fail claim and the memfs → `node:fs` Playwright migration are also from this post.
- sources/2026-04-20-cloudflare-internal-ai-engineering-stack — Workers hosting Cloudflare's internal AI platform-layer proxy at enterprise scale (20.18M AI Gateway req/month).
- sources/2026-02-27-cloudflare-a-better-streams-api-is-possible-for-javascript — Workers' Web-streams optimization posture (`IdentityTransformStream`, shared-buffer `tee()`, internal 200× promise reduction) cited as the canonical "every runtime invents its own escape hatches" illustration of why Web streams' current design is unsustainable; Workers is one of four benchmark targets for the new-streams POC.
- sources/2026-04-16-cloudflare-ai-search-the-search-primitive-for-your-agents — Workers as the host for the `ai_search_namespaces` binding, the primary consumption surface for AI Search. The support-agent example runs entirely in a Worker (with a DO-backed `AIChatAgent` plus bindings for AI Search, Workers AI, and Durable Objects), and the new binding completes the `wrangler.jsonc` bindings family (`ai_search_namespaces` alongside `ai`, `durable_objects`, KV, R2, D1, etc.).
- sources/2026-04-16-cloudflare-email-service-public-beta-ready-for-agents — Workers as the host for the new `EMAIL` binding (`env.EMAIL.send({ to, from, subject, text })`) plus the `email(message, env)` default-export handler for inbound mail. The binding slots into the same `wrangler.jsonc` bindings family as `ai`, `durable_objects`, and `ai_search_namespaces`. Sibling CLI surface: `wrangler email send`. Workers also host the Agentic Inbox reference app that stitches together Email Routing + Email Sending + Workers AI + R2 + Agents SDK.
- sources/2026-04-16-cloudflare-deploy-postgres-and-mysql-databases-with-planetscale-workers — Workers as the consumer of the Hyperdrive binding against PlanetScale Postgres / MySQL. Two new configuration knobs land on `wrangler.jsonc` here: `hyperdrive: [{ binding, id }]` (connectivity / pooling / caching against the partner DB) and `placement: { region: "aws:us-east-1" }` (the explicit placement hint that pins Worker execution to a POP co-located with the DB region, resolving the edge-to-origin-DB latency hazard). Canonical wiki instance of patterns/partner-managed-service-as-native-binding on the Workers platform.
- sources/2026-02-24-cloudflare-how-we-rebuilt-nextjs-with-ai-in-one-week — Workers as the primary deployment target for vinext, the clean reimplementation of the Next.js API surface on Vite. `vinext deploy` auto-generates Worker config. Both App Router and Pages Router work on Workers with full client-side hydration, interactive components, and client-side navigation. Since vinext runs inside workerd during both dev and deploy, platform-specific APIs (Durable Objects, AI bindings, KV) are usable without the `getPlatformProxy` workarounds `next dev` historically required. vinext also introduces Traffic-aware Pre-Rendering, which queries zone analytics at deploy time and pre-renders the top-traffic URLs to KV via `KVCacheHandler` — a CDN-native capability unavailable to offline build tools.
- sources/2026-04-01-cloudflare-emdash-wordpress-spiritual-successor — Workers as the serverless runtime for EmDash, Cloudflare's new open-source CMS. EmDash is designed to make the most of workerd's V8-isolate architecture: "instantly spins up an isolate to execute code and serve a response. It scales back down to zero if there are no requests. And it only bills for CPU time." Canonical wiki instance of Workers as a "CMS host" — a WordPress-scale workload reframed as serverless. EmDash plugins run as Dynamic Workers with capability manifests, extending Workers' role from "your serverless function runtime" to "your sandboxed-plugin runtime". Cloudflare for Platforms supports millions of EmDash instances on shared Workers infra.
- sources/2026-04-17-cloudflare-introducing-flagship-feature-flags-built-for-the-age-of-ai — Workers isolate as the feature-flag evaluation runtime (Flagship, private beta 2026-04-17). Binding config `flagship: [{ binding, app_id }]` on `wrangler.jsonc` exposes `env.FLAGS.getBooleanValue` / `getStringValue` / `getNumberValue` / `getObjectValue(key, default, context)` plus `*Details()` variants; "no HTTP round-trip, no SDK overhead". Canonical wiki motivation for why patterns/in-isolate-rule-evaluation is the right shape on Workers, not long-lived SDK rules caches: "On Workers, none of these assumptions hold. There is no long-lived process: a Worker isolate can be created, serve a request, and be evicted between one request and the next. A new invocation could mean re-initializing the SDK from scratch." The rule-evaluation engine runs inside the same V8 isolate serving the user request, reading flag config from edge-local KV, matching context against rules (AND/OR nested up to 5 levels, priority-ordered, first match wins), resolving any percentage rollout via consistent hashing, and returning a variation. Failure model: evaluation errors return the caller-supplied default; type mismatches throw ("that's a bug in your code, not a transient failure") — load-bearing for JSON/object flag variations, which would silently corrupt downstream if coerced. Sixth 2026-04 Cloudflare launch whose load-bearing architectural primitive is one Durable Object per caller-identified unit (alongside Agent Lee, Project Think, AI Search, Artifacts, and Email Service) — Flagship is specifically the first to pair the per-app DO with globally replicated KV for edge-local evaluation reads (patterns/do-plus-kv-edge-config-distribution).
Related¶
- systems/v8-javascript-engine — the JS engine Workers embeds.
- systems/nodejs — the sibling V8 embedder; Workers exposes Node.js API compatibility via `nodejs_compat`.
- systems/web-streams-api — Workers' primary streaming API, optimized with runtime-specific escape hatches.
- systems/new-streams — the POC alternative by a core Workers runtime engineer (Snell).
- systems/opennext — the Next.js adapter that runs on Workers.
- systems/cloudflare-ai-gateway — LLM-proxy tier commonly fronted by Workers.
- systems/cloudflare-sandbox-sdk — container-lifecycle tier Workers orchestrate.
- systems/cloudflare-r2 — object storage commonly written from Workers.
- systems/cloudflare-zero-trust-access — auth tier in front of Workers.
- concepts/warm-isolate-routing — the routing heuristic tuning Cloudflare described.
- concepts/v8-young-generation — the GC knob Cloudflare un-tuned.
- concepts/promise-allocation-overhead — the dominant Web-streams cost Workers fights via runtime-specific fast paths.
- concepts/cold-start — the latency class warm-isolate routing is designed to avoid.
- concepts/serverless-compute — the category Workers realizes with an isolate-per-Worker twist.
- patterns/middleware-worker-adapter — the architectural shape Workers now commonly realises for composite Cloudflare-stack applications.
- systems/cloudflare-waf, systems/pingora-origin — adjacent Cloudflare edge services.
CLI + local-dev surface (2026-04-13)¶
Cloudflare Workers' developer experience is being unified around cf — the next-generation Wrangler CLI — plus Local Explorer for introspecting Miniflare's local state. Every Worker binding (KV, R2, D1, Durable Objects, Workflows) is now exposed through a local mirror of the Cloudflare API at /cdn-cgi/explorer/api, giving agents and developers local-remote parity on every Worker-bound resource (Source: sources/2026-04-13-cloudflare-building-a-cli-for-all-of-cloudflare).
As the Git-server front-end (Artifacts, 2026-04-16)¶
In Artifacts, the Worker is the stateless front-end of a Git server: it handles authentication and authorization (auth token looked up in KV), key metrics (error rate, latency), and DO routing — mapping the Git-remote URL path to the correct per-repo Durable Object instance, which hosts the ~100 KB pure-Zig Wasm Git server and its embedded SQLite storage. The Worker returns a `ReadableStream<Uint8Array>` built directly from the raw WASM output chunks, keeping the ~128 MB DO memory envelope tight. Canonical wiki instance of patterns/do-backed-git-server — and new `env.AGENT_REPOS.create()` / `env.ARTIFACTS.import()` / `repo.fork()` Worker bindings for programmatic repo creation from any Worker (Source: sources/2026-04-16-cloudflare-artifacts-versioned-storage-that-speaks-git).
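The stream-construction step can be sketched as follows (an assumed shape for illustration, not Artifacts' actual code):

```js
// Wrap pre-computed Uint8Array chunks -- e.g. raw WASM output -- in
// a ReadableStream without copying them into an intermediate buffer,
// so the DO's memory footprint stays bounded by one chunk at a time.
function chunksToStream(chunks) {
  let i = 0;
  return new ReadableStream({
    pull(controller) {
      if (i < chunks.length) {
        controller.enqueue(chunks[i++]); // hand off each chunk as-is
      } else {
        controller.close();
      }
    },
  });
}

const body = chunksToStream([
  new Uint8Array([0x50, 0x41, 0x43, 0x4b]), // "PACK" -- Git packfile magic
  new Uint8Array([0x00, 0x01]),
]);
// In a Worker handler: return new Response(body, { headers: {...} });
```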
As the AI inference call-site (AI Platform, 2026-04-16)¶
The 2026-04-16 AI Platform post repositions the `env.AI.run()` binding from "the Workers AI binding, scoped to @cf/… models" to "the unified inference binding — same call shape for any model from any provider, plus customer-pushed Cog containers" (Source: sources/2026-04-16-cloudflare-ai-platform-an-inference-layer-designed-for-agents).
```js
// One line to swap provider:
await env.AI.run("@cf/moonshotai/kimi-k2.5", { prompt }, {
  gateway: { id: "default" },
  metadata: { teamId: "AI", userId: 12345 },
});
await env.AI.run("anthropic/claude-opus-4-6", { input },
  { gateway: { id: "default" } });
```
The provider selector lives in the model-string prefix (`@cf/...` / `anthropic/...` / `openai/...`), not in the binding. Canonical Worker instance of patterns/unified-inference-binding and concepts/unified-model-catalog. The same `metadata: {...}` field on every call feeds per-request cost attribution in AI Gateway. For agents, it pairs with AI Gateway's buffered-resumable-stream guarantee (concepts/resilient-inference-stream) so a mid-turn Worker crash doesn't lose the in-flight inference response.