
TTFB vs TTLB (SSR measurement)

TTFB (time-to-first-byte) and TTLB (time-to-last-byte) are two ways to measure the latency of server-rendered HTTP responses, and they can produce substantially different rankings when comparing runtimes or platforms on streaming SSR workloads.

Definitions

  • TTFB — elapsed time from request dispatch until the first byte of the response arrives. Captures server-side start latency: request routing, handler spin-up, and the time until the first response write is flushed.
  • TTLB — elapsed time from request dispatch until the last byte of the response arrives. Captures the full cost of generating and transmitting the response, including every streamed chunk and any per-chunk transform work.
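The two definitions can be sketched against a streamed response body. This is a hypothetical illustration, not the benchmark harness from the post: `makeStreamedBody` stands in for a real `fetch(url).body`, and the per-chunk delay simulates render time.

```typescript
// Hypothetical sketch: measuring TTFB and TTLB on a streamed body.
// `makeStreamedBody` is a stand-in for fetch(url).body (illustrative only).
function makeStreamedBody(chunks: string[], delayMs: number): ReadableStream<Uint8Array> {
  const enc = new TextEncoder();
  let i = 0;
  return new ReadableStream({
    async pull(controller) {
      if (i >= chunks.length) { controller.close(); return; }
      await new Promise((r) => setTimeout(r, delayMs)); // simulated render work per chunk
      controller.enqueue(enc.encode(chunks[i++]));
    },
  });
}

async function measure(body: ReadableStream<Uint8Array>): Promise<{ ttfb: number; ttlb: number }> {
  const t0 = performance.now();
  let ttfb = -1;
  const reader = body.getReader();
  for (;;) {
    const { done } = await reader.read();
    if (done) break;
    if (ttfb < 0) ttfb = performance.now() - t0; // first chunk arrived
  }
  return { ttfb, ttlb: performance.now() - t0 }; // last chunk arrived
}

measure(makeStreamedBody(["<html>shell", "streamed region", "</html>"], 20)).then(
  ({ ttfb, ttlb }) => console.log({ ttfb, ttlb }), // ttfb ≈ one chunk delay; ttlb spans all three
);
```

Note that TTFB is a prefix of TTLB by construction: the same clock starts at dispatch, and the two metrics differ only in which byte stops it.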

Why the distinction matters for streaming SSR

A streaming SSR framework emits HTML incrementally: a fast shell flushes early, then suspense-gated regions stream in behind. Such a framework can have excellent TTFB and poor TTLB if per-chunk transforms (Web Streams pipe-through, JSON revivers, encoding conversions) are CPU-heavy. TTFB rewards the early shell flush; TTLB surfaces the transform + GC cost that accumulates over the full response.
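The divergence can be demonstrated with a CPU-heavy pipe-through transform. This is a hypothetical sketch, not Vercel's harness: `heavyTransform` stands in for the JSON revivers and encoding conversions the text describes. TTFB stays near the cost of one transformed chunk, while TTLB absorbs the transform cost of every chunk.

```typescript
// Hypothetical sketch: per-chunk transform cost is invisible to TTFB
// but accumulates into TTLB across the whole streamed response.
function shellThenStream(chunkCount: number): ReadableStream<Uint8Array> {
  const enc = new TextEncoder();
  let i = 0;
  return new ReadableStream({
    pull(controller) {
      if (i++ >= chunkCount) { controller.close(); return; }
      controller.enqueue(enc.encode(i === 1 ? "<html>shell" : `chunk ${i}`));
    },
  });
}

function heavyTransform(iterations: number): TransformStream<Uint8Array, Uint8Array> {
  return new TransformStream({
    transform(chunk, controller) {
      // stand-in for JSON revivers / encoding conversions run on every chunk
      let x = 0;
      for (let k = 0; k < iterations; k++) x += k;
      controller.enqueue(chunk);
    },
  });
}

async function timeFirstAndLast(stream: ReadableStream<Uint8Array>) {
  const t0 = performance.now();
  let first = -1;
  const reader = stream.getReader();
  for (;;) {
    const { done } = await reader.read();
    if (done) break;
    if (first < 0) first = performance.now() - t0;
  }
  return { first, last: performance.now() - t0 };
}

// With many chunks, `last` (TTLB) grows with chunk count while
// `first` (TTFB) stays near the cost of a single transformed chunk.
timeFirstAndLast(shellThenStream(100).pipeThrough(heavyTransform(2e6)))
  .then((r) => console.log(r));
```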

The 2026-04-21 Vercel post names the methodology change explicitly:

"The original benchmarks measured time-to-first-byte (TTFB), which captures when the server begins sending a response but not the full cost of generating and transmitting it. The benchmarks were updated to measure total request duration (time-to-last-byte). For server rendering workloads, this more accurately represents what users experience, as it includes the complete work of rendering and streaming the response."

Load-bearing consequence in the 2026-04-21 benchmark

Vercel's 28% latency reduction for Bun versus Node.js on CPU-bound Next.js rendering is a TTLB number. Under TTFB the gap would be substantially smaller: both runtimes flush a shell quickly, and the compounding per-chunk transform cost that Bun's Web Streams implementation avoids is invisible to TTFB.

Structural rule: when comparing runtimes / platforms on streaming workloads, TTFB measures how fast the shell flushes; TTLB measures the runtime's per-chunk cost. For benchmarks whose purpose is runtime choice, TTLB is the more informative metric.

Orthogonal concerns

TTLB inherits all the usual benchmark-methodology caveats:

  • Client geography — a client in the same region (Vercel used iad1 plus a client VM in us-east-1) isolates runtime cost from network variance; cross-region TTLB is dominated by the network.
  • Payload size — transform cost scales with bytes; tiny responses can obscure per-chunk work.
  • Warmup — cold-start latency distorts TTLB on small samples, especially when runtimes differ in cold-start cost (Bun is slower than Node.js to initialise).
  • Concurrency profile — GC pressure only shows up under load.

Adjacent metrics, for orientation rather than as caveats:

  • FCP (first contentful paint) — browser-side; approximates TTFB plus parse time for the initial shell.
  • TTI (time-to-interactive) — the fully hydrated, interactive moment; downstream of streaming-SSR completion.
  • Throughput per dollar — the economic lens; under Active CPU pricing, TTLB gains translate roughly 1:1 into billing gains.
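The warmup and concurrency caveats suggest a minimal aggregation discipline: discard cold-start iterations and report percentiles rather than a mean, so a few slow outliers don't masquerade as steady-state runtime cost. A sketch with illustrative numbers (not from the benchmark):

```typescript
// Hypothetical sketch: aggregate TTLB samples with the caveats applied —
// drop warmup iterations, then report percentiles of the steady-state tail.
function percentile(sortedAscending: number[], p: number): number {
  const idx = Math.min(sortedAscending.length - 1, Math.floor((p / 100) * sortedAscending.length));
  return sortedAscending[idx];
}

const ttlbSamplesMs = [92, 61, 14, 13, 15, 14, 13, 41, 14, 13]; // illustrative numbers
const warmupIterations = 2; // discard cold-start-distorted samples
const steady = ttlbSamplesMs.slice(warmupIterations).sort((a, b) => a - b);

console.log({ p50: percentile(steady, 50), p99: percentile(steady, 99) });
// → { p50: 14, p99: 41 }
```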
