

Head-of-Line Buffering (in streaming pipelines)

Intermediate layers — reverse proxies, compression middleware, CDN edges, buffer-mode stream transforms — typically default to waiting until a size or time threshold is reached before forwarding data downstream. For non-streaming responses this is a pure win: fewer, larger packets, better compression ratios. For streaming responses, it is a correctness-equivalent bug: early chunks are held back until later ones (or an idle timeout) arrive, collapsing the stream back into a monolithic response from the client's perspective.

The phenomenon is sometimes called head-of-line buffering: the first ready chunk is blocked behind the buffer-fill condition, even though it is already produced and would be useful to the client immediately.

Where it bites

  • Nginx proxy_buffering on (default): upstream response chunks are held until the proxy buffer is full or the upstream closes.
  • HTTP compression middleware (e.g. Express compression): the gzip/brotli stream holds emitted bytes to improve ratio; without an explicit flush, the first compressed chunk can be held arbitrarily long.
  • Node.js buffer-mode transforms: each _transform call may accumulate data before pushing downstream, especially when doing string/Buffer round-trips for regex work.
  • CDN / edge caches: many will not stream a chunked response at all; they buffer it in full and forward it as one.

Mitigations (from Confluence)

  • Response header X-Accel-Buffering: no — nginx-specific signal to disable proxy_buffering for this response only, leaving the global default (buffering, which is right for non-streaming responses) intact.
  • Force-flush compression on setImmediate after each upstream chunk — tells the middleware to emit whatever it has, trading a bit of compression ratio for chunk-boundary preservation.
  • Run stream transforms in objectMode so each chunk is one _transform call and there's no implicit buffer-fill threshold.
  • Detect chunk boundaries with a signal you control (setImmediate tick when React reports chunk complete) rather than relying on byte-count heuristics.

(Source: sources/2026-04-16-atlassian-streaming-ssr-confluence)

Generalization

The same anti-pattern appears in any streaming system with intermediate layers: log/trace pipelines (fluentd/vector buffer by default), Server-Sent Events behind load balancers, gRPC server streaming through L7 proxies. Audit every hop between producer and consumer; buffering defaults are almost always wrong for streams.
