
TTFB vs TTLB (SSR measurement)

TTFB (time-to-first-byte) and TTLB (time-to-last-byte) are two ways to measure the latency of server-rendered HTTP responses, and they can produce substantially different rankings when comparing runtimes or platforms on streaming SSR workloads.

Definitions

  • TTFB — elapsed time from request dispatch until the first byte of the response arrives. Captures server-side start latency: request routing, handler spin-up, and the time until the first response write is flushed.
  • TTLB — elapsed time from request dispatch until the last byte of the response arrives. Captures the full cost of generating and transmitting the response, including every streamed chunk and any per-chunk transform work.
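The two definitions can be sketched against a streamed response body. This is a hypothetical illustration, not the benchmark harness from the post: `makeStreamedBody` stands in for a real `fetch(url).body`, and the per-chunk delay simulates render time.

```typescript
// Hypothetical sketch: measuring TTFB and TTLB on a streamed body.
// `makeStreamedBody` is a stand-in for fetch(url).body (illustrative only).
function makeStreamedBody(chunks: string[], delayMs: number): ReadableStream<Uint8Array> {
  const enc = new TextEncoder();
  let i = 0;
  return new ReadableStream({
    async pull(controller) {
      if (i >= chunks.length) { controller.close(); return; }
      await new Promise((r) => setTimeout(r, delayMs)); // simulated render work per chunk
      controller.enqueue(enc.encode(chunks[i++]));
    },
  });
}

async function measure(body: ReadableStream<Uint8Array>): Promise<{ ttfb: number; ttlb: number }> {
  const t0 = performance.now();
  let ttfb = -1;
  const reader = body.getReader();
  for (;;) {
    const { done } = await reader.read();
    if (done) break;
    if (ttfb < 0) ttfb = performance.now() - t0; // first chunk arrived
  }
  return { ttfb, ttlb: performance.now() - t0 }; // last chunk arrived
}

measure(makeStreamedBody(["<html>shell", "streamed region", "</html>"], 20)).then(
  ({ ttfb, ttlb }) => console.log({ ttfb, ttlb }), // ttfb ≈ one chunk delay; ttlb spans all three
);
```

Note that TTFB is a prefix of TTLB by construction: the same clock starts at dispatch, and the two metrics differ only in which byte stops it.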

Why the distinction matters for streaming SSR

A streaming SSR framework emits HTML incrementally: a fast shell flushes early, then suspense-gated regions stream in behind. Such a framework can have excellent TTFB and poor TTLB if per-chunk transforms (Web Streams pipe-through, JSON revivers, encoding conversions) are CPU-heavy. TTFB rewards the early shell flush; TTLB surfaces the transform + GC cost that accumulates over the full response.
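The divergence can be demonstrated with a CPU-heavy pipe-through transform. This is a hypothetical sketch, not Vercel's harness: `heavyTransform` stands in for the JSON revivers and encoding conversions the text describes. TTFB stays near the cost of one transformed chunk, while TTLB absorbs the transform cost of every chunk.

```typescript
// Hypothetical sketch: per-chunk transform cost is invisible to TTFB
// but accumulates into TTLB across the whole streamed response.
function shellThenStream(chunkCount: number): ReadableStream<Uint8Array> {
  const enc = new TextEncoder();
  let i = 0;
  return new ReadableStream({
    pull(controller) {
      if (i++ >= chunkCount) { controller.close(); return; }
      controller.enqueue(enc.encode(i === 1 ? "<html>shell" : `chunk ${i}`));
    },
  });
}

function heavyTransform(iterations: number): TransformStream<Uint8Array, Uint8Array> {
  return new TransformStream({
    transform(chunk, controller) {
      // stand-in for JSON revivers / encoding conversions run on every chunk
      let x = 0;
      for (let k = 0; k < iterations; k++) x += k;
      controller.enqueue(chunk);
    },
  });
}

async function timeFirstAndLast(stream: ReadableStream<Uint8Array>) {
  const t0 = performance.now();
  let first = -1;
  const reader = stream.getReader();
  for (;;) {
    const { done } = await reader.read();
    if (done) break;
    if (first < 0) first = performance.now() - t0;
  }
  return { first, last: performance.now() - t0 };
}

// With many chunks, `last` (TTLB) grows with chunk count while
// `first` (TTFB) stays near the cost of a single transformed chunk.
timeFirstAndLast(shellThenStream(100).pipeThrough(heavyTransform(2e6)))
  .then((r) => console.log(r));
```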

The 2026-04-21 Vercel post names the methodology change explicitly:

"The original benchmarks measured time-to-first-byte (TTFB), which captures when the server begins sending a response but not the full cost of generating and transmitting it. The benchmarks were updated to measure total request duration (time-to-last-byte). For server rendering workloads, this more accurately represents what users experience, as it includes the complete work of rendering and streaming the response."

Load-bearing consequence in the 2026-04-21 benchmark

Vercel's 28% latency reduction for Bun versus Node.js on CPU-bound Next.js rendering is a TTLB number. Under TTFB the gap would be substantially smaller: both runtimes flush a shell quickly, and the compounding per-chunk transform cost that Bun's Web Streams implementation avoids is invisible to TTFB.

Structural rule: when comparing runtimes / platforms on streaming workloads, TTFB measures how fast the shell flushes; TTLB measures the runtime's per-chunk cost. For benchmarks whose purpose is runtime choice, TTLB is the more informative metric.

Orthogonal concerns

TTLB inherits all the usual benchmark-methodology caveats:

  • Client geography — a client in the same region (Vercel used iad1 plus a client VM in us-east-1) isolates runtime cost from network variance; cross-region TTLB is dominated by the network.
  • Payload size — transform cost scales with bytes; tiny responses can obscure per-chunk work.
  • Warmup — cold-start latency distorts TTLB on small samples, especially when runtimes differ in cold-start cost (Bun is slower than Node.js to initialise).
  • Concurrency profile — GC pressure only shows up under load.

Adjacent metrics, for orientation rather than as caveats:

  • FCP (first contentful paint) — browser-side; approximates TTFB plus parse time for the initial shell.
  • TTI (time-to-interactive) — the fully hydrated, interactive moment; downstream of streaming-SSR completion.
  • Throughput per dollar — the economic lens; under Active CPU pricing, TTLB gains translate roughly 1:1 into billing gains.
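The warmup and concurrency caveats suggest a minimal aggregation discipline: discard cold-start iterations and report percentiles rather than a mean, so a few slow outliers don't masquerade as steady-state runtime cost. A sketch with illustrative numbers (not from the benchmark):

```typescript
// Hypothetical sketch: aggregate TTLB samples with the caveats applied —
// drop warmup iterations, then report percentiles of the steady-state tail.
function percentile(sortedAscending: number[], p: number): number {
  const idx = Math.min(sortedAscending.length - 1, Math.floor((p / 100) * sortedAscending.length));
  return sortedAscending[idx];
}

const ttlbSamplesMs = [92, 61, 14, 13, 15, 14, 13, 41, 14, 13]; // illustrative numbers
const warmupIterations = 2; // discard cold-start-distorted samples
const steady = ttlbSamplesMs.slice(warmupIterations).sort((a, b) => a - b);

console.log({ p50: percentile(steady, 50), p99: percentile(steady, 99) });
// → { p50: 14, p99: 41 }
```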
