CONCEPT
TTFB vs TTLB (SSR measurement)¶
TTFB (time-to-first-byte) and TTLB (time-to-last-byte) are two latency-measurement choices for server-rendered HTTP responses, and they can produce substantially different rankings when comparing runtimes or platforms on streaming SSR workloads.
Definitions¶
- TTFB — elapsed time from when the request is dispatched until the first byte of the response is received. Captures server-side start latency: request routing, handler spin-up, and the time until the first response write happens.
- TTLB — elapsed time from when the request is dispatched until the last byte of the response is received. Captures the full cost of generating and transmitting the response, including all streamed chunks and per-chunk transform cost.
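The two definitions can be made concrete with a small sketch that consumes a streamed response body and timestamps the first and last chunks. This is illustrative only, not Vercel's benchmark harness; the function name `measureStreaming` is an assumption, and in practice you would pass it `response.body` from a real `fetch()`.

```javascript
// Illustrative sketch: measure TTFB and TTLB over a Web Streams ReadableStream.
// In a real benchmark, `stream` would be `(await fetch(url)).body`.
async function measureStreaming(stream) {
  const t0 = performance.now();
  const reader = stream.getReader();
  let ttfb = null;
  let bytes = 0;
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    if (ttfb === null) ttfb = performance.now() - t0; // first chunk arrived
    bytes += value.length;
  }
  const ttlb = performance.now() - t0; // stream fully consumed
  return { ttfb, ttlb, bytes };
}
```

By construction TTFB ≤ TTLB; the gap between them is exactly the streamed tail this note is about.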
Why the distinction matters for streaming SSR¶
A streaming SSR framework emits HTML incrementally: a fast shell flushes early, then suspense-gated regions stream in behind. Such a framework can have excellent TTFB and poor TTLB if per-chunk transforms (Web Streams pipe-through, JSON revivers, encoding conversions) are CPU-heavy. TTFB rewards the early shell flush; TTLB surfaces the transform + GC cost that accumulates over the full response.
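The divergence can be captured in a toy model (an assumption for illustration, not measured data): TTFB sees only the shell flush, while TTLB accumulates per-chunk transform cost across the whole stream.

```javascript
// Toy latency model: identical shells, divergent totals.
// All parameters are hypothetical; per-chunk cost stands in for
// pipe-through transforms, revivers, and encoding conversions.
function modelLatency({ shellMs, chunkCount, perChunkTransformMs }) {
  const ttfb = shellMs; // shell flush is all TTFB ever sees
  const ttlb = shellMs + chunkCount * perChunkTransformMs; // tail cost compounds
  return { ttfb, ttlb };
}

// A 10 ms shell followed by 200 chunks at 0.5 ms of transform each:
// TTFB = 10 ms for any runtime, TTLB = 110 ms — the per-chunk cost is
// invisible to one metric and dominant in the other.
modelLatency({ shellMs: 10, chunkCount: 200, perChunkTransformMs: 0.5 });
```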
The 2026-04-21 Vercel post names the methodology change explicitly:
> The original benchmarks measured time-to-first-byte (TTFB), which captures when the server begins sending a response but not the full cost of generating and transmitting it. The benchmarks were updated to measure total request duration (time-to-last-byte). For server rendering workloads, this more accurately represents what users experience, as it includes the complete work of rendering and streaming the response.
Load-bearing consequence in the 2026-04-21 benchmark¶
Vercel's 28 % latency reduction for Bun vs Node.js on CPU-bound Next.js rendering is a TTLB number. Under TTFB, the gap would be substantially smaller because both runtimes can flush a shell quickly; the compounding per-chunk transform cost that Bun's Web-Streams implementation avoids is invisible to TTFB.
Structural rule: when comparing runtimes / platforms on streaming workloads, TTFB measures how fast the shell flushes; TTLB measures the runtime's per-chunk cost. For benchmarks whose purpose is runtime choice, TTLB is the more informative metric.
Orthogonal concerns¶
TTLB inherits all the usual benchmark-methodology caveats:
- Client geography — a client in the same region (Vercel used iad1 + a client VM in us-east-1) isolates runtime cost from network variance. Cross-region TTLB is dominated by network.
- Payload size — transform cost scales with bytes; tiny responses can obscure per-chunk work.
- Warmup — cold-start latency distorts TTLB on small samples, especially when runtimes differ on cold-start cost (Bun is slower than Node.js to initialise).
- Concurrency profile — GC pressure only shows up under load.
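The warmup caveat suggests a reporting discipline: drop the first N samples before summarising TTLB. A minimal sketch, with `warmupCount` and the percentile choices as assumptions:

```javascript
// Discard warmup samples, then report p50/p99 TTLB from the warmed tail.
// Cold-start samples would otherwise skew small-sample comparisons,
// especially when runtimes differ on cold-start cost.
function summarizeTtlb(ttlbSamplesMs, warmupCount = 10) {
  const warmed = ttlbSamplesMs.slice(warmupCount).sort((a, b) => a - b);
  const pct = (p) =>
    warmed[Math.min(warmed.length - 1, Math.floor(p * warmed.length))];
  return { p50: pct(0.5), p99: pct(0.99) };
}
```

Reporting percentiles rather than means also keeps a handful of GC pauses under load from being averaged away.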
Related measurement axes¶
- FCP (first contentful paint) — browser-side; approximates TTFB + parsing time for the initial shell.
- TTI (time-to-interactive) — fully-hydrated interactive moment; downstream of streaming SSR completion.
- Throughput per dollar — the economic lens; under Active CPU pricing, TTLB gains translate ~1:1 into billing gains.
Seen in¶
- sources/2026-04-21-vercel-bun-runtime-on-vercel-functions — canonical wiki introduction; Vercel's pivot from TTFB to TTLB mid-benchmark-response was the methodology choice that surfaced Bun's 28 % CPU-bound Next.js advantage.
Related¶
- concepts/streaming-ssr — the rendering shape that makes TTFB and TTLB diverge.
- concepts/web-streams-as-ssr-bottleneck — the specific cost surface TTLB captures and TTFB misses.
- systems/bun — the runtime whose Web-Streams implementation exposes the TTFB/TTLB gap.
- systems/nodejs — the runtime whose current Web-Streams cost makes TTLB the discriminating metric.
- systems/nextjs — the framework surfacing the gap.