
Cloudflare Workers

Cloudflare Workers is Cloudflare's serverless JavaScript / WebAssembly compute tier, deployed across 330+ Cloudflare POPs. Workers runs user code inside V8 isolates (lightweight sandboxes, many per OS process, that start in milliseconds) rather than per-request containers or VMs; that isolate model is what drives the fleet density and the Workers economics.

Isolate model

  • One V8 isolate per Worker (per script, per account). Code stays loaded in the isolate; requests are dispatched to a warm isolate when one exists, and a new one is minted on first use when none does.
  • Default isolate memory limit: 128 MB. (Compare to Lambda's 128 MB–10 GB range per function; Workers' density assumption is a lot of small isolates, not a few big ones.)
  • CPU-time billing, not wall-clock. Time spent waiting on external I/O or on another request's CPU in the same isolate is not billed as CPU. Pricing: $0.072/hr globally (as of 2025-10-14).
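
The CPU-vs-wall-clock distinction is easy to see in plain Node (a sketch, not Workers code; the 200 ms timer stands in for a subrequest or other external I/O):

```typescript
// Node sketch (not Workers code): awaiting I/O burns wall-clock time but
// almost no CPU time, and CPU time is the quantity Workers bills.
import { setTimeout as sleep } from "node:timers/promises";

async function measure(): Promise<{ wallMs: number; cpuMs: number }> {
  const wallStart = performance.now();
  const cpuStart = process.cpuUsage();
  await sleep(200); // stand-in for a subrequest / external I/O wait
  const wallMs = performance.now() - wallStart;
  const cpu = process.cpuUsage(cpuStart); // delta since cpuStart, in µs
  const cpuMs = (cpu.user + cpu.system) / 1000;
  return { wallMs, cpuMs };
}
```

Run it and the wall-clock figure dominates the CPU figure by a wide margin; on Workers only the latter would count against the CPU-time limit.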

Warm-isolate routing (2025-10 update)

Cloudflare's router prefers sending traffic to warm isolates of the requested Worker to minimize cold-start latency (the general shape of concepts/warm-isolate-routing). Prior to the 2025-10 revision the heuristic was optimized for I/O-heavy workloads at fleet scale.

When a burst of CPU-bound requests arrived at a single isolate (the shape Theo Browne's cf-vs-vercel-bench benchmark generated), later requests would queue behind the in-flight CPU-heavy one, inflating client-observed latency. The heuristic then spun up new isolates to compensate, but not as quickly as an all-CPU-bound workload needed.

Post-fix (rolled out globally, 2025-10):

  • Detects sustained CPU-heavy work earlier.
  • Biases routing so new isolates spin up faster when CPU load is the bottleneck.
  • I/O-bound workloads still coalesce onto warm isolates (the original optimization target).
  • CPU-bound workloads are now spread so they don't block each other.
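
The post-fix bias can be modeled as a toy scheduler (illustrative only; Cloudflare has not published the actual algorithm, and the class and field names here are invented):

```typescript
// Toy model of the post-fix routing bias: I/O-bound requests coalesce
// onto the warm isolate, while CPU-bound requests avoid isolates with
// CPU-heavy work in flight, spawning a new isolate when none is free.
type Kind = "io" | "cpu";
interface Isolate { id: number; cpuInFlight: number }

class Router {
  private isolates: Isolate[] = [];
  private nextId = 0;

  route(kind: Kind): Isolate {
    if (this.isolates.length === 0) return this.spawn();
    if (kind === "io") return this.isolates[0]; // coalesce onto warm isolate
    const idle = this.isolates.find((i) => i.cpuInFlight === 0);
    return idle ?? this.spawn(); // spread CPU-bound work
  }

  // Caller marks the start/end of CPU-heavy work on an isolate.
  beginCpu(i: Isolate) { i.cpuInFlight++; }
  endCpu(i: Isolate) { i.cpuInFlight--; }

  private spawn(): Isolate {
    const iso = { id: this.nextId++, cpuInFlight: 0 };
    this.isolates.push(iso);
    return iso;
  }
}
```

Under this model a burst of CPU-bound requests fans out across isolates instead of queuing behind one busy isolate, while I/O-bound traffic keeps hitting warm code.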

"I/O-bound workloads coalesce into individual already warm isolates while CPU-bound are directed so that they do not block each other." (Source: sources/2025-10-14-cloudflare-unpacking-cloudflare-workers-cpu-performance-benchmarks)

V8 GC tuning (2025-10 update)

Workers had a manually tuned V8 young-generation cap dating from June 2017 (when the project was two months old), based on V8's then-current guidance for ≤ 512 MB environments. V8's GC has changed dramatically since then, and the stale cap was making GC run harder and more often than necessary on 2025-era V8.

Fix: removed the manual tuning and let V8 pick young-space size via its own internal heuristics.

  • ~25 % improvement on the benchmark workload.
  • Small memory-usage uplift.
  • All Workers benefit, not just the benchmarked one — though for most Workers the improvement is much smaller.

See concepts/v8-young-generation for the underlying knob.
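
The knob in question is V8's semi-space (young generation) cap, which Node exposes directly; the values below are illustrative, not the numbers Workers used:

```shell
# Capping the young generation forces more frequent scavenges -- the
# behaviour a 2017-era manual cap freezes in place.
node --max-semi-space-size=2 -e 'console.log("tiny young generation")'
# Omitting the flag lets V8 size the young generation via its own
# heuristics -- the shape of the post-fix Workers behaviour.
node -e 'console.log("V8 default heuristics")'
```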

Framework / adapter integration

Workers is a plausible target for many JS frameworks via OpenNext (Next.js), Remix adapters, and direct Workers-API code. The 2025-10-14 benchmark retrospective surfaced a long tail of allocation / buffering inefficiencies in the OpenNext Cloudflare adapter and in the upstream Next.js / React render pipeline — most of which Cloudflare began chipping at via upstream PRs (see sources/2025-10-14-cloudflare-unpacking-cloudflare-workers-cpu-performance-benchmarks).

Node.js compatibility

Workers has evolved from a partial / shim-heavy Node.js environment to broad native support for node:* built-ins. Two consequences on the wiki:

  • Playwright on Workers: the Browser Rendering Playwright adapter historically had to mock the filesystem using memfs, drifting from upstream. Native node:fs replaced the mock, letting the adapter track upstream with fewer patches.
  • NPM compat at scale: Cloudflare ran an experiment against the top 1,000 NPM packages; after filtering out build/CLI/browser-only packages, only 15 (1.5%) genuinely fail to run on Workers natively. Results page: worksonworkers.southpolesteve.workers.dev. This compatibility is what enables Workers as a first-class middleware / integration runtime for stacks that were built against Node.

Workers as middleware adapter

An increasingly common Cloudflare-architected shape is the Worker as middleware adapter: a thin entrypoint Worker owns auth, routing, secrets injection, and platform-API shape-translation for an unmodified application running behind it (in a container, a third-party service, or a separate Workers project). Node.js compatibility + Sandbox SDK + AI Gateway + R2 + Zero Trust Access compose into a standard stack for this shape — see Moltworker and internal AI engineering stack.
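
A minimal sketch of the shape, runnable on standard fetch types (Node 18+ globals). The upstream URL, header names, and token check are invented for illustration, not a Cloudflare API; a real Worker would pass the rewritten request to fetch():

```typescript
// Middleware-adapter sketch: a thin handler that owns auth, URL
// shape-translation, and secret injection for an unmodified upstream app.
const UPSTREAM = "https://app.internal.example"; // hypothetical backend

async function handle(
  req: Request,
  secrets: { apiKey: string },
): Promise<Request | Response> {
  // 1. Auth at the edge: the app behind us never sees unauthenticated traffic.
  if (req.headers.get("Authorization") !== "Bearer let-me-in") {
    return new Response("unauthorized", { status: 401 });
  }
  // 2. Shape-translation: re-root the public path onto the upstream app.
  const url = new URL(req.url);
  const fwd = new Request(new URL(url.pathname + url.search, UPSTREAM).toString(), {
    method: req.method,
    headers: req.headers,
  });
  // 3. Secret injection: credentials the client never holds.
  fwd.headers.set("X-Api-Key", secrets.apiKey);
  return fwd; // a real Worker would `return fetch(fwd)` here
}
```

The application behind the adapter stays platform-agnostic: it sees ordinary authenticated HTTP and never handles edge concerns itself.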

Streams: Workers-specific escape hatches

Workers implements Web Streams as its primary streaming API (Node streams work via the nodejs_compat layer). Making Web streams fast at Workers scale has required runtime-specific, spec-adjacent optimizations:

  • IdentityTransformStream — Workers-specific fast path for pass-through transforms that bypasses most of the TransformStream controller/queue machinery.
  • tee() shared-buffer implementation — Workers diverges from the naive spec implementation, which gives each of the two branches its own linked-list buffer and lets memory grow O(n) when one consumer lags. Workers instead uses a shared-buffer model where backpressure is signaled by the slowest consumer rather than the fastest, avoiding the memory cliff the spec default would produce.
  • Internal pipeline promise elision — Snell's internal fix to one Workers data pipeline reduced the number of JS promises created by up to 200×, yielding "several orders of magnitude improvement in performance".
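
The pass-through and tee() shapes above can be exercised with standard Web Streams (Node 18+ globals). IdentityTransformStream itself is Workers-only, so a default TransformStream stands in for the pass-through here:

```typescript
// Standard Web Streams sketch (Node 18+ globals). On Workers you would
// write `new IdentityTransformStream()` for the pass-through; the default
// TransformStream below is the spec-machinery version it bypasses.
async function demo(): Promise<[string, string]> {
  const source = new ReadableStream({
    start(controller) {
      controller.enqueue("hello ");
      controller.enqueue("workers");
      controller.close();
    },
  });
  // Pass-through transform: chunks flow out exactly as they came in.
  const piped = source.pipeThrough(new TransformStream());
  // tee(): both branches observe every chunk. Workers backs the two
  // branches with one shared buffer paced by the slowest consumer.
  const [b1, b2] = piped.tee();
  const drain = async (s: ReadableStream): Promise<string> => {
    let out = "";
    for await (const chunk of s as any) out += chunk;
    return out;
  };
  return Promise.all([drain(b1), drain(b2)]);
}
```

Observable behaviour (both branches see identical chunks) is all the spec pins down; the buffering strategy underneath is where Workers diverges.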

These optimizations land in the "non-observable" parts of the spec — implementations are free to differ as long as observable behaviour matches. The 2026-02-27 post argues this is unsustainable across runtimes: "When you find yourself needing to relax or bypass spec semantics just to achieve reasonable performance, that's a sign something is wrong with the spec itself." See sources/2026-02-27-cloudflare-a-better-streams-api-is-possible-for-javascript for the full critique and systems/new-streams for Snell's proposed alternative.

Scale / positioning

  • 330+ cities worldwide (vs centralised compute platforms that place in a few US-east / US-west regions).
  • Shared runtime with Cloudflare's broader edge stack (Pingora proxy fabric, WAF, AI Gateway).
  • Sister tier: Dynamic Workers for per-request isolate-spawning (agent-generated code, Sandbox SDK) — distinct from the general Workers tier described here.

Seen in

CLI + local-dev surface (2026-04-13)

Cloudflare Workers' developer experience is being unified around cf — the next-generation Wrangler CLI — plus Local Explorer for introspecting Miniflare's local state. Every Worker binding (KV, R2, D1, Durable Objects, Workflows) is now exposed through a local mirror of the Cloudflare API at /cdn-cgi/explorer/api, giving agents and developers local-remote parity on every Worker-bound resource (Source: sources/2026-04-13-cloudflare-building-a-cli-for-all-of-cloudflare).

As the Git-server front-end (Artifacts, 2026-04-16)

In Artifacts, the Worker is the stateless front-end of a Git server: it handles authentication and authorization (auth token looked up in KV), emits key metrics (error rate, latency), and performs DO routing — mapping the Git-remote URL path to the correct per-repo Durable Object instance, which hosts the ~100 KB pure-Zig Wasm Git server and embedded SQLite storage. The Worker returns a ReadableStream<Uint8Array> built directly from the raw Wasm output chunks, keeping the ~128 MB DO memory envelope tight. Canonical wiki instance of patterns/do-backed-git-server — plus new env.AGENT_REPOS.create() / env.ARTIFACTS.import() / repo.fork() Worker bindings for programmatic repo creation from any Worker (Source: sources/2026-04-16-cloudflare-artifacts-versioned-storage-that-speaks-git).
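
The DO-routing step reduces to URL parsing. A hypothetical sketch (the path scheme and helper are assumptions for illustration, not Artifacts' published API):

```typescript
// Hypothetical sketch of the routing step: extract a stable per-repo name
// from a Git smart-HTTP request path, suitable for idFromName() on a
// Durable Object namespace. The path scheme is illustrative.
function repoFromGitUrl(
  rawUrl: string,
): { repo: string; service: string } | null {
  const url = new URL(rawUrl);
  // Smart-HTTP paths: /<owner>/<repo>/info/refs?service=git-upload-pack,
  // /<owner>/<repo>/git-upload-pack, /<owner>/<repo>/git-receive-pack
  const m = url.pathname.match(
    /^\/([^/]+)\/([^/]+)\/(info\/refs|git-upload-pack|git-receive-pack)$/,
  );
  if (!m) return null;
  return {
    repo: `${m[1]}/${m[2]}`, // stable name for the per-repo DO
    service: url.searchParams.get("service") ?? m[3],
  };
}
```

In a real Worker the repo name would feed something like env.REPOS.idFromName(repo) (binding name hypothetical) and the request would be forwarded to that DO stub.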

As the AI inference call-site (AI Platform, 2026-04-16)

The 2026-04-16 AI Platform post repositions the env.AI.run() binding from "the Workers AI binding, scoped to @cf/… models" to "the unified inference binding — same call shape for any model from any provider, plus customer-pushed Cog containers" (Source: sources/2026-04-16-cloudflare-ai-platform-an-inference-layer-designed-for-agents).

// One line to swap provider:
await env.AI.run("@cf/moonshotai/kimi-k2.5", { prompt }, {
  gateway: { id: "default" },
  metadata: { teamId: "AI", userId: 12345 },
});

await env.AI.run("anthropic/claude-opus-4-6", { input },
                 { gateway: { id: "default" } });

Provider selector lives in the model-string prefix (@cf/... / anthropic/... / openai/...), not in the binding. Canonical Worker instance of patterns/unified-inference-binding and concepts/unified-model-catalog. Same metadata: {...} field on every call feeds per-request cost attribution on AI Gateway. For agents, pairs with AI Gateway's buffered-resumable-stream guarantee (concepts/resilient-inference-stream) so a mid-turn Worker crash doesn't lose the in-flight inference response.
