
CONCEPT

Client-side compression

Client-side compression compresses payloads on the client before they cross the network (and likewise decompresses server responses on the client), rather than having the server compress on receive and decompress on send. The point is to push the CPU cost onto the caller, whose fleet is typically a larger and more elastic CPU pool than the database or DAL tier.

Why client-side, not server-side

Most databases offer server-side compression, so why push the work to the client? Netflix's KV DAL states its reasons:

"While many databases offer server-side compression, handling compression on the client side reduces expensive server CPU usage, network bandwidth, and disk I/O." (Source: sources/2024-09-19-netflix-netflixs-key-value-data-abstraction-layer)

Three distinct wins from one choice:

  1. Server CPU — the DAL / database tier is a scarcer pool than the fleet of calling microservices; pushing compression onto clients frees server cores for request processing.
  2. Network bandwidth — compressed bytes go across the wire, not uncompressed ones.
  3. Disk I/O — if the server stores what it receives (Cassandra commit log, etc.), compressed bytes land on disk, reducing write throughput pressure.
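The division of labor above can be sketched in a few lines. This is a minimal illustration, not Netflix's implementation: the post names no algorithm, so zlib and the function shapes here are assumptions.

```python
import zlib

def client_put(key: bytes, value: bytes) -> tuple[bytes, bytes]:
    # Compress on the client so the server/DAL spends no CPU on it.
    # (zlib is illustrative; the KV DAL post does not name an algorithm.)
    compressed = zlib.compress(value, level=6)
    return key, compressed  # compressed bytes cross the wire and land on disk

def client_get(key: bytes, stored: bytes) -> bytes:
    # The server returns the stored, still-compressed bytes; the client decompresses.
    return zlib.decompress(stored)

payload = b"repetitive search log line\n" * 1000
_, wire = client_put(b"k", payload)
assert client_get(b"k", wire) == payload
assert len(wire) < len(payload)  # bandwidth and disk I/O both shrink
```

The server never calls `compress`/`decompress`: it stores and serves opaque bytes, which is where the three wins come from.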

The Netflix data point

Enabling client-side compression in one Netflix deployment (which "helps power Netflix's search") reduced payload sizes by 75%. No algorithm is named in the post; Netflix highlights the outcome as a material cost-efficiency win.

Client capability must be negotiated

Client-side compression requires that the server know to expect compressed bytes. KV DAL uses in-band signaling: each request carries a signal indicating whether the client supports compression (and, presumably, which algorithm). Without this, the server would have to make a static, deploy-time assumption, which is exactly the configuration-stiffness problem that signaling was introduced to solve.
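A minimal sketch of that in-band signal, assuming a per-request envelope with a compression field (the field names and the zlib choice are illustrative, not from the post):

```python
import zlib
from dataclasses import dataclass
from typing import Optional

@dataclass
class Request:
    # Hypothetical envelope: each request carries its own compression signal.
    key: bytes
    value: bytes
    compression: Optional[str]  # e.g. "zlib", or None if the client can't compress

def build_request(key: bytes, value: bytes, client_supports: set) -> Request:
    if "zlib" in client_supports:
        return Request(key, zlib.compress(value), compression="zlib")
    return Request(key, value, compression=None)  # server falls back to raw bytes

def server_decode(req: Request) -> bytes:
    # The per-request signal tells the server how to interpret the bytes;
    # no static, fleet-wide configuration is needed.
    if req.compression == "zlib":
        return zlib.decompress(req.value)
    return req.value
```

Because the signal travels with every request, old and new clients can coexist against the same server with no coordinated rollout.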

Trade-offs

  • Client CPU — compression work moves to the client. For large-fleet callers this is typically fine (more cores, more elasticity); for constrained clients (mobile, embedded) it can be wrong.
  • Small payloads compress poorly — for very small writes the CPU cost can outweigh the bandwidth win. Netflix's 75% figure comes from one specific deployment and is not a general guarantee.
  • Algorithm choice is a compat axis — different libraries / languages have different preferred algorithms; signaling has to cover this.
  • Doesn't compose cleanly with server-side compression of the stored bytes — if Cassandra is configured with SSTable compression, writes compress twice and reads decompress twice. KV DAL presumably manages this interaction, but the general pattern should account for it.
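The first two trade-offs suggest a guard on the client: skip compression when it can't pay for itself. A sketch, assuming an arbitrary size threshold (the 512-byte cutoff is an assumption to tune per workload, not a value from the post):

```python
import zlib

MIN_SIZE = 512  # assumed threshold; tune per workload

def maybe_compress(value: bytes) -> tuple:
    """Return (wire_bytes, was_compressed); only compress when it pays off."""
    if len(value) < MIN_SIZE:
        return value, False          # tiny writes: CPU cost > bandwidth win
    compressed = zlib.compress(value)
    if len(compressed) >= len(value):
        return value, False          # incompressible data: don't inflate it
    return compressed, True
```

The `was_compressed` flag would feed the in-band signal, so the server knows per-request whether to expect compressed bytes.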
