CONCEPT Cited by 1 source

CPU-only media processing

Definition

CPU-only media processing is the deliberate choice to run image / video / audio processing workloads on CPU instances in the cloud, rather than on GPU instances — even when the underlying engine would run faster per-frame on a GPU. The optimisation target is fleet throughput and cost/performance, not single-instance speed.

The shape: tool supports both CPU and GPU rendering; the operator picks CPU; rendering is parallelised across many small CPU workers rather than concentrated on a few big GPU workers.
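The fan-out shape described above can be sketched in a few lines. This is an illustrative sketch only, not Netflix's implementation: `split_into_segments`, `render_segment`, and `render_clip` are hypothetical names, the render step is a stub, and a local thread pool stands in for a dispatcher waiting on many remote CPU workers.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_segments(total_frames, segment_frames):
    """Return half-open (start, end) frame ranges that cover the whole clip."""
    if total_frames < 0 or segment_frames <= 0:
        raise ValueError("total_frames must be >= 0 and segment_frames > 0")
    return [
        (start, min(start + segment_frames, total_frames))
        for start in range(0, total_frames, segment_frames)
    ]


def render_segment(segment):
    """Hypothetical stand-in for one small CPU worker rendering one sub-segment."""
    start, end = segment
    # A real worker would invoke the rendering engine here; this stub only
    # reports how many frames it was asked to handle.
    return end - start


def render_clip(total_frames, segment_frames, max_workers):
    """Fan a clip out over many small workers and return the frames rendered."""
    segments = split_into_segments(total_frames, segment_frames)
    # The thread pool models a dispatcher waiting on remote workers (assumption).
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sum(pool.map(render_segment, segments))
```

Because each segment is independent, adding workers shortens wall-clock time without requiring any single worker to be fast, which is the property the CPU-only choice relies on.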

Canonical instance: Netflix MPS / FLAPI

Netflix's Media Production Suite runs FLAPI inside Cosmos on CPU-only instances for inspection, rendering, and trimming jobs — even though "FLAPI also supports GPU rendering." The explicit rationale (Source: sources/2026-04-24-netflix-scaling-camera-file-processing-at-netflix):

"CPU instances give us access to a much wider segment of Netflix's vast encoding compute pool and free up GPU instances for other workloads."

Two wins in one decision:

Pool size: CPU capacity inside Netflix's AWS footprint is many times larger than GPU capacity. CPU-only workloads can scale faster because the pool is bigger.
Workload isolation: GPU instances are preserved for workloads that truly need them (ML training / inference), avoiding contention.

Why the traditional model went the other way

Traditional on-prem film-processing facilities invested in "beefy computers with large GPUs and high-performance storage arrays to rip through debayering and encoding at breakneck speed." That model's constraint was fixed-hardware investment: once you own the GPU box, you want to saturate it.

Cloud flips the constraint: you don't own the hardware, you pay per instance-hour, and the cheaper, wider CPU pool wins economically when the work can be parallelised across many small instances. Netflix's framing:

"Operating within these constraints lets us focus on increasing throughput via parallel encoding rather than focusing on single-instance processing power. We can then target the sweet spot of the cost/performance efficiency curve while still hitting our target turnaround times."

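The trade-off between fleet throughput and single-instance speed can be made concrete with a little arithmetic. The sketch below is not from the source: the instance names, throughputs, and prices are invented for illustration, and it assumes the work is perfectly parallel and billed only for the compute time actually used.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    frames_per_second: float  # throughput of one instance (hypothetical)
    price_per_hour: float     # hypothetical price, arbitrary currency


def plan(instance, total_frames, deadline_seconds):
    """Return (instances_needed, total_cost) to finish within the deadline."""
    frames_per_instance = instance.frames_per_second * deadline_seconds
    instances_needed = math.ceil(total_frames / frames_per_instance)
    # Total compute time is fixed by throughput; parallelism only spreads it out.
    compute_hours = total_frames / instance.frames_per_second / 3600
    return instances_needed, compute_hours * instance.price_per_hour


# Invented figures: the GPU instance is 5x faster but 7.5x more expensive.
cpu = InstanceType("cpu-small", frames_per_second=2.0, price_per_hour=0.40)
gpu = InstanceType("gpu-large", frames_per_second=10.0, price_per_hour=3.00)

cpu_plan = plan(cpu, total_frames=720_000, deadline_seconds=3600)
gpu_plan = plan(gpu, total_frames=720_000, deadline_seconds=3600)
```

With these made-up inputs, both fleets meet the same one-hour deadline: the CPU plan needs 100 instances and the GPU plan needs 20, but the CPU plan costs less in total because its price per frame is lower. The conclusion depends entirely on the assumed numbers; the point is only that the comparison is per frame across the fleet, not per instance.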
Enabling preconditions

CPU-only media processing works when:

The tool supports CPU rendering at acceptable fidelity (FLAPI does; so do most mainstream encoders).
Work can be parallelised per clip or per sub-segment (Netflix dispatches one Stratum Function per clip or sub-segment — see patterns/serverless-function-for-media-processing).
Turnaround is measured in fleet throughput, not single-job latency, which holds when thousands of parallel renders swarm a VFX turnover and then yield the capacity back.
The CPU pool is larger / cheaper / less contested than the GPU pool — mostly true in large cloud footprints, where GPU capacity is a premium tier.

Contrast with GPU-targeted workloads

CPU-only isn't a universal ideal — it's a placement decision per workload. On the same Netflix infrastructure:

ML training / inference → GPUs (see concepts/heterogeneous-ai-accelerator-fleet).
Per-clip camera-file processing → CPU (this concept).

The discipline is: choose the placement that gives the widest available pool compatible with the workload's fidelity and SLA requirements, not the placement with the highest single-instance peak speed.
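The placement rule above can be expressed as a small selection function. This is a hedged sketch rather than any real scheduler's API: the `Pool` fields and `choose_placement` are hypothetical, and the example capacities and turnaround estimates are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pool:
    name: str
    available_instances: int       # how wide the pool is right now
    meets_fidelity: bool           # can the engine render acceptably here?
    est_turnaround_minutes: float  # estimated fleet turnaround for this job
    peak_speed: float              # single-instance speed; not used for ranking


def choose_placement(pools, sla_minutes):
    """Pick the widest pool that satisfies fidelity and SLA, or None."""
    eligible = [
        p for p in pools
        if p.meets_fidelity and p.est_turnaround_minutes <= sla_minutes
    ]
    if not eligible:
        return None
    # Rank by pool width, deliberately ignoring peak_speed.
    return max(eligible, key=lambda p: p.available_instances)


pools = [
    Pool("cpu", available_instances=5000, meets_fidelity=True,
         est_turnaround_minutes=45.0, peak_speed=1.0),
    Pool("gpu", available_instances=200, meets_fidelity=True,
         est_turnaround_minutes=30.0, peak_speed=8.0),
]
```

With a 60-minute SLA both pools qualify and the wider CPU pool is chosen; tightening the SLA to 40 minutes excludes the CPU pool and the function falls back to GPU, which matches the idea that placement is decided per workload.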

Seen in

sources/2026-04-24-netflix-scaling-camera-file-processing-at-netflix — Netflix explicitly chooses CPU over GPU for FLAPI-driven media processing in Cosmos because the CPU pool is wider and GPUs are reserved for other workloads; the optimisation target is fleet throughput at the cost/performance sweet spot, not single-instance speed.

System: systems/filmlight-flapi — the dual-capable engine
Substrate: systems/netflix-cosmos
Consumer: systems/netflix-media-production-suite
Contrast: concepts/heterogeneous-ai-accelerator-fleet — GPU-tier placement for workloads that need it
Pattern: patterns/serverless-function-for-media-processing
Company: companies/netflix