
PATTERN

Lazy pull pipeline

Intent

Compose streaming pipeline stages so that no stage executes until the consumer pulls — every transform, every source read, every allocation happens on-demand in consumer-driven order. Stopping the consumer stops the pipeline; no stage keeps running in the background.

This is the evaluation model of Unix pipes, of reactive streams' "cold observables", and of async iteration over generator chains. It is the alternative to the push-based, eagerly pumping model of Web Streams' pipeThrough().

The shape

// new-streams: Stream.pull() composes transforms lazily
const output = Stream.pull(source, compress, encrypt);

// Nothing runs yet — output is just an async iterable

for await (const chunks of output) {
  // Each iteration pulls one batch through the pipeline:
  // - source produces next chunks
  // - compress transforms them
  // - encrypt transforms them
  // - consumer receives
  process(chunks);
}

// If we break here, compress and encrypt both stop.
// No background pumping. No half-filled buffers.

Equivalent async-generator form:

async function* pipeline(source) {
  for await (const chunk of source) {
    yield await encrypt(await compress(chunk));
  }
}
// Consumer drives:
for await (const c of pipeline(readFile())) { /* consume */ }
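The composition itself needs no library support. A minimal stand-in for Stream.pull(), assuming every stage is a per-chunk (possibly async) function, is just a nested async generator; this is a sketch, not the library's documented contract, which also covers stateful object transforms:

```javascript
// Minimal stand-in for Stream.pull(), under the assumption that each
// stage is a per-chunk (possibly async) function. A sketch only.
async function* pull(source, ...stages) {
  for await (let chunk of source) {
    for (const stage of stages) {
      chunk = await stage(chunk); // each stage runs only when pulled
    }
    yield chunk; // suspend here until the consumer asks for more
  }
}
```

Because the body is an async generator, nothing executes until the first next() call, and a break in the consumer suspends every stage at once.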

Why it matters

1. Intermediate buffers don't cascade

Push-based pipelines cascade data forward through intermediate buffers before the consumer starts reading. A 3-transform chain with default high-water marks can accumulate six internal buffers simultaneously — three writable sides and three readable sides all filling in parallel.

Lazy pull pipelines have at most one pending value per stage — the one being pulled right now. Memory grows linearly with pipeline depth, not with pipeline depth × buffer size.

2. Cancellation is free

If the consumer stops iterating (breaks, throws, or the network disconnects), the producer's yield never resumes. All upstream stages suspend. No explicit reader.cancel() call; no lock to release; no orphaned promises.
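This falls out of the async-iteration protocol itself: breaking out of a for await loop calls return() on the iterator, which resumes the generator as if a return statement replaced the pending yield, so finally blocks run. A small illustration (the infinite source here is hypothetical):

```javascript
// Breaking out of for await calls return() on the generator, so its
// finally block runs and upstream resources are released.
let cleanedUp = false;

async function* infiniteSource() {
  try {
    let i = 0;
    while (true) yield i++;
  } finally {
    cleanedUp = true; // e.g. close a file handle or connection here
  }
}

async function consumeThree() {
  for await (const n of infiniteSource()) {
    if (n >= 2) break; // no cancel() call needed
  }
}
```

After awaiting consumeThree(), cleanedUp is true: the break alone tore down the source.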

3. Resource lifetime matches consumption

fetch() response bodies held by the pipeline are released when iteration stops. Connection-pool leaks like the Node undici bug — "the stream holds a reference to the underlying connection until garbage collection runs" — don't happen in lazy-pull designs because the pipeline stops holding the reference the moment the consumer stops pulling.

4. Backpressure is implicit

Pull pipelines get backpressure as a consequence of the evaluation model, not as a separate mechanism. If the consumer is slow, next() isn't called; the producer's yield blocks; upstream stages block. There is no desiredSize for a producer to forget to check.
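A producer's body is inert between next() calls, and that inertness is the whole mechanism; a sketch:

```javascript
// The producer only advances when next() is called: a consumer that
// stops calling next() stops production, with no explicit signal.
let produced = 0;

async function* producer() {
  while (true) {
    produced += 1;
    yield produced; // suspended here until the next pull
  }
}

const it = producer();
(async () => {
  await it.next(); // runs the body once: produced === 1
  await it.next(); // runs it again: produced === 2
  // No further next(): the generator stays suspended, so produced
  // stays at 2 however slow the consumer is.
})();
```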

5. Performance compounds

The 2026-02-27 Cloudflare benchmark of a chained 3× transform pipeline measured a lazy-pull design at roughly 80-90× faster than Web Streams' pipeThrough():

"Pull-through semantics eliminate the intermediate buffering that plagues Web streams pipelines. Instead of each TransformStream eagerly filling its internal buffers, data flows on-demand from consumer to source."

Stateless vs stateful stages

Stateless transforms are plain functions:

const toUpperCase = (chunks) => {
  if (chunks === null) return null;   // flush sentinel
  return chunks.map(c => upperCase(c));
};

const output = Stream.pull(source, toUpperCase);

Stateful transforms wrap the source as an async generator:

function createLineParser() {
  return {
    async *transform(source) {
      let pending = new Uint8Array(0);
      for await (const chunks of source) {
        if (chunks === null) {
          if (pending.length > 0) yield [pending];
          continue;
        }
        // …split on newlines, preserve trailing partial…
        yield completedLines;
      }
    },
  };
}

Both compose into the same Stream.pull() pipeline. The transform contract is minimal — async *transform(source) and optional abort(reason) for cleanup — not a class with start() / transform() / flush() + a controller.
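One way a pull() driver could accept both shapes is to wrap plain functions and delegate to transform() for objects. This dispatch is an assumption for illustration, not the documented Stream.pull() behavior:

```javascript
// Hypothetical dispatch for a pull() driver: plain functions become
// stateless per-batch stages; objects delegate to their transform()
// generator, which wraps the whole source and can keep state.
function applyStage(source, stage) {
  if (typeof stage === "function") {
    return (async function* () {
      for await (const chunks of source) {
        yield await stage(chunks);
      }
    })();
  }
  return stage.transform(source);
}

function pullAll(source, ...stages) {
  // Fold the stages left to right; nothing runs until iteration.
  return stages.reduce(applyStage, source);
}
```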

Contrast with eager push

The same pipeline, Web streams push-style:

source
  .pipeThrough(parseTransform)       // starts pumping immediately
  .pipeThrough(transformTransform)   // starts pumping immediately
  .pipeThrough(serializeTransform)   // starts pumping immediately
  .pipeTo(destination);              // consumer; kicks off the chain

Each pipeThrough() eagerly pumps bytes from its upstream source into its own internal buffer. By the time pipeTo() starts consuming, all three transforms may already be fully engaged. The consumer cannot pause the pipeline meaningfully — it can only fail to pull from the final buffer, while the upstream transforms continue cascading.

When not to use lazy pull

  • Hot-observable-style event fanout — when multiple consumers must each see every event as it happens, the stream needs to produce whether or not anyone is pulling. Stream.broadcast() is the push-based multi-consumer alternative.
  • Sources with intrinsic timing — a 60 Hz video frame source must produce at 60 Hz regardless of consumer; pull semantics would back-pressure the camera, which is nonsense.
  • Bridging a push-based producer — when the upstream API pushes (WebSocket, EventSource, Web streams), an adapter is needed to convert push into pull. Often Stream.push() (with explicit backpressure policy) is the right inner surface, with lazy pull on the outer consumer.
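A minimal version of such an adapter, with an unbounded queue and none of the backpressure policy a real Stream.push() would need (drop, cap, or error), can look like this:

```javascript
// Push-to-pull adapter sketch: pushed values are queued (unbounded
// here, which a real adapter must not allow) and handed out through
// the async-iteration protocol.
function pushToPull() {
  const queue = [];
  const waiters = []; // pending next() calls from a fast consumer
  let closed = false;
  return {
    push(value) {
      const waiter = waiters.shift();
      if (waiter) waiter({ value, done: false });
      else queue.push(value);
    },
    close() {
      closed = true;
      for (const w of waiters.splice(0)) w({ value: undefined, done: true });
    },
    [Symbol.asyncIterator]() {
      return {
        next: () => {
          if (queue.length > 0) return Promise.resolve({ value: queue.shift(), done: false });
          if (closed) return Promise.resolve({ value: undefined, done: true });
          return new Promise((resolve) => waiters.push(resolve));
        },
      };
    },
  };
}
```

On the push side, something like ws.onmessage = (e) => adapter.push(e.data); on the pull side, an ordinary for await loop.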
