CLOUDFLARE 2026-02-27

We deserve a better streams API for JavaScript

Summary

James Snell — Cloudflare Workers runtime engineer, Node.js TSC member, multi-runtime implementer of the WHATWG Streams Standard ("Web streams") — argues that Web streams' usability and performance problems cannot be fixed with incremental improvements; they are consequences of design decisions made 2014-2016, before for await…of landed in ES2018. The post enumerates fundamental issues (excessive ceremony, locks-as-footguns, BYOB complexity without payoff, advisory-only backpressure, per-operation promise overhead, unconsumed-body connection leaks, unbounded tee() buffering, transform-pipeline back-pressure gaps, GC thrashing in streaming SSR) and presents a proof-of-concept alternative (new-streams) built around async iterables, pull-based evaluation, explicit backpressure policies (strict / block / drop-oldest / drop-newest), batched Uint8Array[] chunks, and complete parallel synchronous fast paths. Benchmarks show 2×–120× faster than Web streams across every runtime tested (Workers, Node.js, Deno, Bun, browsers). The post is explicitly framed as a conversation-starter, not a ship-it standard.

Key takeaways

  1. The Streams Standard predates async iteration. Designed 2014-2016; for await…of landed ES2018 (two years after the spec was finalized). Every design choice — explicit reader/writer acquisition, { value, done } protocol, locking — "rippled through every aspect of the API" because the idiomatic JS way to consume asynchronous sequences did not yet exist. (Source: article §"The Streams Standard was developed between 2014 and 2016".)

  2. Locks are a footgun. getReader() locks the stream; forgetting releaseLock() permanently breaks it. The locked property tells you that a stream is locked, not why, by whom, or whether the lock is even still usable. Async iteration hides this — until something goes wrong, at which point developers land back in "readers, locks, and controllers" they didn't want to know about. (Source: article §"Web streams use a locking model".)

  3. BYOB reads ship complexity without measurable payoff. They require a separate reader type, a separate controller, ArrayBuffer detachment semantics, and a full branch of the Web Platform Tests. Most userland ReadableStream implementations don't bother; most consumers take the default-read path; and BYOB can't be used with for await…of or TransformStream.

  4. Web streams backpressure is advisory-only. controller.desiredSize goes negative but controller.enqueue() always succeeds. tee() branches buffer without limit. writer.ready exists but producers routinely ignore it. "Stream implementations can and do ignore backpressure; and some spec-defined features explicitly break backpressure." Contrast with the alternative API's four explicit policies (strict / block / drop-oldest / drop-newest) where the choice is required, not hoped-for. (Source: article §"Backpressure: good in theory, broken in practice".)

  5. Per-operation promise allocation is the 10-25× performance cliff. Every read() returns a promise; internally, the spec mandates additional promises for queue management, pull() coordination, and backpressure signaling. Vercel's independent benchmarking of Web streams measured ReadableStream.pipeThrough() at 630 MB/s vs Node pipeline() at ~7,900 MB/s — a 12× gap attributed almost entirely to promise and object allocation overhead. Snell's own Workers fix to an internal data pipeline reduced JS promises created by up to 200×, yielding "several orders of magnitude improvement". (Source: article §"The hidden cost of promises".)

  6. Unconsumed fetch() bodies leak connections. The body is a ReadableStream; if you only check response.ok and don't consume or cancel, the stream holds a reference to the underlying connection until GC runs. Under load this exhausts connection pools — a real production bug fixed in Node's undici. Request.clone() / Response.clone() compound this with implicit tee() operations. (Source: article §"Exhausting resources with unconsumed bodies" + quoted Matteo Collina, Node.js TSC Chair.)

  7. tee() has a memory cliff. If one branch reads faster than the other, the spec-described implementation buffers unboundedly until the slow branch catches up. Firefox initially used a linked-list approach (O(n) memory growth); Cloudflare Workers opted for a shared buffer model where backpressure is signaled by the slowest consumer rather than the fastest — a runtime-specific divergence from the spec's default implementation shape. (Source: article §"Falling headlong off the tee() memory cliff".)

  8. TransformStream is push-based and backpressure-leaky. transform() runs eagerly on write, regardless of whether any consumer is pulling. Transforms that synchronously enqueue never apply backpressure upstream even when the downstream reader is slow, so a 3-stage pipeline can fill six internal buffers before the consumer starts reading. "Under load this creates GC pressure that can devastate throughput. […] up to and beyond 50% of total CPU time per request" in streaming SSR. (Source: article §"Transform backpressure gaps" + §"GC thrashing in server-side rendering".)

  9. The optimization treadmill is unsustainable. Every major runtime (Node.js, Deno, Bun, Workers) has invented non-standard internal escape hatches to make Web streams fast: Bun's "Direct Streams", Workers' IdentityTransformStream, Deno native paths, Vercel's proposed "fast-webstreams". These "work in some scenarios but not in others, in some runtimes but not others […] creates friction for developers trying to write cross-runtime code." A well-designed streaming API should be efficient by default, not require each runtime to invent its own bypasses. (Source: article §"The optimization treadmill".)

  10. The proposed alternative — a new-streams POC (github.com/jasnell/new-streams) — rests on six foundations:

    • Readable is just AsyncIterable<Uint8Array[]>. No custom class, no getReader(), no locks.
    • Pull-based, lazy evaluation. Transforms don't execute until the consumer iterates.
    • Explicit backpressure policies at creation time, default strict.
    • Batched chunks (Uint8Array[] per yield) amortize async overhead.
    • Writers are structural — any { write, end, abort } object, no class hierarchy.
    • Parallel synchronous APIs (Stream.pullSync, Stream.bytesSync, Stream.textSync) skip promises entirely when source + transforms are all sync.
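
The locking footgun in takeaway 2 can be reproduced in a few lines. The sketch below uses only the standard Web streams API (global in Node.js 18+, Deno, Bun, and browsers):

```javascript
// Acquiring a reader locks the stream; the lock is exclusive and silent.
const rs = new ReadableStream({
  start(controller) {
    controller.enqueue(new Uint8Array([1, 2, 3]));
    controller.close();
  },
});

const reader = rs.getReader(); // acquires an exclusive lock
console.log(rs.locked);        // true -- but not why, by whom, or for how long

let secondAcquireFailed = false;
try {
  rs.getReader();              // a second reader cannot be acquired
} catch (err) {
  secondAcquireFailed = err instanceof TypeError; // spec-mandated TypeError
}

reader.releaseLock();          // forget this and the stream stays unusable
console.log(rs.locked);        // false -- consumable again
```

Note that `locked` only reports the boolean state: nothing in the API says which reader holds the lock or whether it is still live.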
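
Takeaway 4's advisory-only backpressure is equally easy to demonstrate: desiredSize goes negative, yet enqueue() keeps succeeding. A minimal sketch with the standard API (the highWaterMark of 2 and the chunk sizes are arbitrary choices for illustration):

```javascript
// desiredSize is a suggestion, not a limit: enqueue() never refuses a chunk.
let finalDesiredSize;
const rs = new ReadableStream(
  {
    start(controller) {
      // The queue "wants" at most 2 chunks, but nothing stops a producer
      // from pushing 5; every enqueue() below succeeds without error.
      for (let i = 0; i < 5; i++) {
        controller.enqueue(new Uint8Array(1024));
      }
      finalDesiredSize = controller.desiredSize; // 2 - 5 = -3
      controller.close();
    },
  },
  { highWaterMark: 2 } // count-based queuing strategy by default
);
console.log(finalDesiredSize); // -3: over capacity, yet nothing pushed back
```

A producer that never checks desiredSize (or writer.ready, on the writable side) simply buffers without bound, which is the behavior the article contrasts with new-streams' required backpressure policy.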

Benchmark numbers

Scenario (Node.js v24, Apple M1 Pro) | new-streams | Web streams | Ratio
Small chunks (1 KB × 5000) | ~13 GB/s | ~4 GB/s | ~3×
Tiny chunks (100 B × 10 000) | ~4 GB/s | ~450 MB/s | ~8×
Async iteration (8 KB × 1000) | ~530 GB/s | ~35 GB/s | ~15×
Chained 3× transforms (8 KB × 500) | ~275 GB/s | ~3 GB/s | ~80-90×
High-frequency (64 B × 20 000) | ~7.5 GB/s | ~280 MB/s | ~25×

Chrome/Blink (3-run average) | new-streams | Web streams | Ratio
Push 3 KB chunks | ~135k ops/s | ~24k ops/s | ~5-6×
Push 100 KB chunks | ~24k ops/s | ~3k ops/s | ~7-8×
3-transform chain | ~4.6k ops/s | ~880 ops/s | ~5×
5-transform chain | ~2.4k ops/s | ~550 ops/s | ~4×
bytes() consumption | ~73k ops/s | ~11k ops/s | ~6-7×
Async iteration | ~1.1M ops/s | ~10k ops/s | ~40-100×

Third-party corroboration: Vercel's pipeThrough() measurement 630 MB/s (Web streams) vs 7,900 MB/s (Node.js pipeline()) = 12× gap, attributed "almost entirely [to] Promise and object allocation overhead". Snell's internal Workers fix: up to 200× fewer promises created.

Testimonial from Robert Nagy (Node.js TSC, Node streams contributor): "there's something uniquely powerful about starting from scratch. New streams' approach embraces modern runtime realities without legacy baggage."

Systems / concepts extracted

Systems

  • Web Streams API — the WHATWG Streams Standard (ReadableStream, WritableStream, TransformStream, ReadableStreamBYOBReader). This article is the canonical wiki critique.
  • new-streams — jasnell's proof-of-concept alternative (github.com/jasnell/new-streams) built on async iterables, pull semantics, and explicit backpressure policies.
  • Cloudflare Workers — one of four runtimes tested; context for where Snell's IdentityTransformStream and other Workers-specific optimizations live.
  • Node.js — tested runtime + target of Vercel's 10× fast-webstreams proposal; also the runtime where undici's unconsumed-body connection leak was observed and fixed.
  • V8 — shared engine across Workers / Node / Chrome / Deno; its promise / microtask machinery is the cost substrate.
  • OpenNext — the Next.js portability adapter whose 2025-10 profiling (see sibling source 2025-10-14) surfaced the Node-⇆-Web-stream double-buffer pathology this article generalizes.

Concepts

  • Backpressure — the slow-consumer-signals-fast-producer control primitive; this article's most load-bearing critique is that Web streams' backpressure is advisory-only.
  • Async iteration — the for await…of protocol that landed ES2018; the alternative API's foundation.
  • Pull vs push streams — consumer-demand-driven vs producer-driven evaluation. Web streams are push-eager; new-streams is pull-lazy.
  • Promise allocation overhead — the hidden per-call GC / microtask / object cost of a promise-heavy API; the core performance thesis.
  • BYOB (bring-your-own-buffer) reads — the zero-copy-via-transferred-buffer read path; the article argues its complexity exceeds its payoff.
  • Stream adapter overhead — existing concept; this article adds the async-iteration-bridge direction as a sibling adapter surface.
  • Hot path — existing concept; this article adds streaming-SSR rendering as the canonical JS-runtime instance (50%+ GC CPU per request).
  • Garbage collection (as pressure source) — short-lived object allocation in hot paths; this article pins 50%+ CPU to GC in badly-streamed SSR workloads.
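
The promise-allocation concept above is directly observable: every read() call allocates a fresh promise, even when a chunk is already sitting in the queue and could be returned synchronously. A small sketch with the standard API:

```javascript
// Every read() is a new promise allocation, even for already-buffered data.
const rs = new ReadableStream({
  start(controller) {
    for (let i = 0; i < 3; i++) controller.enqueue(new Uint8Array(64));
    controller.close();
  },
});

const reader = rs.getReader();
const p1 = reader.read(); // promise #1, though a chunk is already queued
const p2 = reader.read(); // promise #2 -- a distinct allocation
console.log(p1 instanceof Promise, p1 !== p2); // true true
```

Reading N chunks therefore costs at least N promise allocations before counting the spec-internal ones; batching many chunks into one Uint8Array[] yield, as new-streams does, amortizes that cost across the batch.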

Patterns

  • Explicit backpressure policy — pick strict / block / drop-oldest / drop-newest at stream-creation time, required, no silent default. Contrast with Web streams' advisory-desiredSize / hope-the-producer-checks model.
  • Lazy pull pipeline — pipeline stages execute only when the consumer iterates. Stopping iteration stops processing; no hidden background pumping; no intermediate-buffer cascade.
  • Upstream the fix — existing Cloudflare pattern; this article adds a new instance (Snell collaborating with Vercel's Malte Ubl on landing fast-webstreams improvements into Node.js — "as one of the core maintainers of Node.js, I am looking forward to helping Malte and the folks at Vercel get their proposed improvements landed").
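
The lazy pull pipeline pattern can be sketched with plain async generators (an illustration of the pattern only, not the actual new-streams API; the stage names are made up):

```javascript
// Stages are async generators; composing them performs no work at all.
let produced = 0;

async function* source() {
  for (let i = 0; i < 5; i++) {
    produced++;
    yield new Uint8Array(1024);
  }
}

// A hypothetical transform stage: doubles each chunk's size.
async function* double(upstream) {
  for await (const chunk of upstream) {
    yield new Uint8Array(chunk.length * 2);
  }
}

// Building a 2-stage pipeline executes nothing -- generator bodies are lazy.
const pipeline = double(double(source()));
console.log(produced); // 0

// Work happens only as the consumer pulls, one chunk at a time; breaking out
// of the loop stops the producer, with no hidden background pumping and no
// intermediate-buffer cascade:
//   for await (const chunk of pipeline) { /* ... */ }
```

Contrast with TransformStream, where transform() runs eagerly on write regardless of whether any consumer is pulling.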

Caveats

  • This is a conversation-starter, not a ship-it proposal. Snell is explicit: "I'm not here to disparage the work that came before; I'm here to start a conversation about what can potentially come next." new-streams is a POC, not a finished standard, not production-ready, not even necessarily the right concrete design.

  • Benchmarks compare a pure-TS/JS POC against native (JS/C++/Rust) Web streams implementations in each runtime. The new-streams numbers come entirely from design choices; a native implementation would likely go further. Conversely, Node.js has not yet invested significantly in Web streams performance — "there's likely significant room for improvement" once Vercel's proposed optimizations land. The gap will narrow; it probably will not close.

  • Web streams have legitimate uses — cross-security-boundary composition in browsers, airtight cancellation semantics, piping across untrusted boundaries. Malte Ubl's quote: "These guarantees matter in the browser where streams cross security boundaries […] But on the server, when you are piping React Server Components through three transforms at 1KB chunks, the cost adds up." The critique is sharpest on the server side.

  • The tee() memory cliff is partly an implementation choice. Cloudflare Workers' shared-buffer approach (backpressure signaled by slowest consumer) is compliant; the spec "allows implementations to implement […] in any way they see fit so long as the observable normative requirements are met". The pathology is "if an implementation chooses to implement tee() in the specific way described by the streams specification".

  • Author perspective: Snell is both (a) a core maintainer of Node.js, (b) the Workers runtime engineer who implemented Web streams in Workers, and (c) the author of new-streams. He has strong implementer-side priors. The post contains one quoted third-party (Matteo Collina, Node.js TSC Chair) corroborating the clone-implies-tee footgun, but the performance comparison is self-conducted.
