CONCEPT Cited by 2 sources

Fixed vs variable request cost

Definition

Every request to a server has two cost components:

  • Fixed cost — work done regardless of payload size. Network handshake, request parsing, authentication/authorization, logging, internal queueing, response framing.
  • Variable cost — work proportional to payload size. Byte-level processing: deserialisation, validation, bytes persisted to disk, bytes sent back on the wire.

Redpanda's 2024-11-19 batch-tuning explainer states this explicitly as the foundational argument for why batching exists at all:

"Although smaller requests may use fewer resources, they can never scale to zero. At some point, the cost of servicing the smallest possible request becomes more significant than the cost of the request itself. One way to think of this is that each request incurs a fixed cost (for work that happens regardless of the request size) and a variable cost (for work determined by what was requested)." (Source: sources/2024-11-19-redpanda-batch-tuning-in-redpanda-for-optimized-performance-part-1)

Why it matters

The asymmetry between fixed and variable cost is the reason batching exists as a first-class primitive across distributed systems:

  • If fixed cost were zero, batching would be unnecessary — N × (1-record requests) would cost the same as 1 × (N-record request).
  • If variable cost were zero, request size would be irrelevant — always batch maximally.
  • In practice, fixed cost is non-trivial (TCP segment boundaries, handler dispatch, broker bookkeeping, log append syscalls) and variable cost is also non-trivial (bytes on the wire, bytes written to disk, bytes parsed). The optimal batch size is the point where the ratio fixed_cost / records_in_batch has diminished to a negligible fraction of the per-record variable cost.
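The asymmetry reduces to one line of arithmetic: per-record cost is fixed/N + variable, which decays toward the pure variable cost as N grows. A minimal sketch, with assumed (not measured) per-request constants:

```python
# Per-request cost model. FIXED_US and VARIABLE_US_PER_RECORD are
# illustrative assumptions, not numbers from any real broker.

FIXED_US = 50.0                # fixed cost per request, microseconds (assumed)
VARIABLE_US_PER_RECORD = 2.0   # variable cost per record, microseconds (assumed)

def request_cost_us(records: int) -> float:
    """Total cost of one request carrying `records` records."""
    return FIXED_US + records * VARIABLE_US_PER_RECORD

def per_record_cost_us(records: int) -> float:
    """Amortised cost per record: fixed/N + variable."""
    return request_cost_us(records) / records

# 1,000 records as 1,000 one-record requests vs. one 1,000-record request:
unbatched = 1000 * request_cost_us(1)   # pays the fixed cost 1,000 times
batched = request_cost_us(1000)         # pays the fixed cost once

print(per_record_cost_us(1))     # 52.0 — dominated by fixed cost
print(per_record_cost_us(1000))  # 2.05 — approaches the pure variable cost
```

With these assumed constants the batched form is roughly 25× cheaper in total; the exact ratio depends entirely on the real fixed/variable split of the system at hand.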

Consequences

  1. Small requests cannot scale to zero. Even if the record payload is 1 byte, the request carrying it has non-zero cost. At very small record sizes, nearly 100% of CPU goes to fixed-cost servicing.
  2. Batching is amortisation. Grouping N records into one request pays the fixed cost once for N records instead of N times. The per-record servicing cost approaches the pure variable cost asymptotically as batch size grows.
  3. Compression compounds the savings. Batching 100 records lets a compressor see 100× the context for finding repetition, so the compressed-bytes-per-record drops faster than linearly with batch size. "The compression ratio improves as you compress more messages at once since it can take advantage of the similarities between messages."
  4. Latency is the tax you pay for batching. Accumulating N records into one request requires waiting for the Nth record before dispatching — which introduces up to N × inter-arrival time of per-record queueing delay. The producer-side linger.ms / batch.size trigger logic is the primitive for making this trade-off explicit.
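Consequence 3 can be demonstrated directly. A sketch using Python's zlib as a stand-in for a producer-side codec, on assumed repetitive record contents: compressing the batch as one buffer lets the compressor exploit cross-record repetition that per-record compression never sees.

```python
import zlib

# 100 similar records (assumed shape; any repetitive payload shows the effect).
records = [
    f'{{"user_id": {i}, "event": "page_view", "path": "/home"}}'.encode()
    for i in range(100)
]

# Compress each record alone: no cross-record repetition is visible,
# and each output pays its own header/checksum overhead.
individually = sum(len(zlib.compress(r)) for r in records)

# Compress the batch as one buffer: repeated keys and values are
# encoded once, so compressed bytes per record drop sharply.
batched = len(zlib.compress(b"".join(records)))

print(individually, batched)
```

The batched output is several times smaller than the sum of the per-record outputs, which is the "compression ratio improves as you compress more messages at once" effect in miniature.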

Canonical instances

  • Network-to-broker batching — Kafka / Redpanda producer batching. Fixed cost = TCP segment + broker request-handling + append syscall + ISR acknowledgment. Variable cost = bytes in the batch. See concepts/effective-batch-size for the seven-factor framework.
  • Database batch writes. INSERT ... VALUES (...), (...), (...) vs N separate INSERT statements. Fixed cost = parse, plan, transaction coordination. Variable cost = row bytes.
  • GPU inference batching. Fixed cost = kernel launch overhead. Variable cost = matmul FLOPs. The fixed/variable asymmetry here is extreme (launch ~10 µs is constant; matmul time ranges from microseconds to milliseconds with size), which is why transformer serving throughput scales better than linearly with batch size up to memory limits.
  • S3 / object-store uploads. Fixed cost = HTTPS handshake, request signing, bucket-index lookup. Variable cost = body bytes. Small-file uploads saturate quickly on fixed cost.
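The database instance above can be sketched with the standard-library sqlite3 module, standing in for any database client (the absolute timings are machine-dependent; only the relative ordering matters):

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, payload TEXT)")
rows = [(i, "x" * 16) for i in range(10_000)]

# N separate statements: statement handling + transaction commit
# (the fixed cost) is paid once per row.
t0 = time.perf_counter()
for row in rows:
    conn.execute("INSERT INTO events VALUES (?, ?)", row)
    conn.commit()                      # each row is its own transaction
t_single = time.perf_counter() - t0

# One batched call in one transaction: the fixed cost is paid once
# for all rows; only the per-row variable cost repeats.
t0 = time.perf_counter()
conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
conn.commit()
t_batch = time.perf_counter() - t0

print(f"single: {t_single:.3f}s  batched: {t_batch:.3f}s")
```

Even against an in-memory database with no network hop, the batched path wins; against a real client/server database, the per-statement round trip widens the gap further.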
