CONCEPT
CPU-only media processing¶
Definition¶
CPU-only media processing is the deliberate choice to run image, video, and audio processing workloads on CPU instances in the cloud rather than on GPU instances, even when the underlying engine would run faster per frame on a GPU. The optimisation target is fleet throughput and cost/performance, not single-instance speed.
The shape: the tool supports both CPU and GPU rendering; the operator picks CPU; rendering is parallelised across many small CPU workers rather than concentrated on a few large GPU workers.
Canonical instance — Netflix MPS / FLAPI¶
Netflix's Media Production Suite runs FLAPI inside Cosmos on CPU-only instances for inspection, rendering, and trimming jobs — even though "FLAPI also supports GPU rendering." The explicit rationale (Source: sources/2026-04-24-netflix-scaling-camera-file-processing-at-netflix):
"CPU instances give us access to a much wider segment of Netflix's vast encoding compute pool and free up GPU instances for other workloads."
Two wins in one decision:
- Pool size: CPU capacity inside Netflix's AWS footprint is many-times larger than GPU capacity. CPU-only workloads can scale faster because the pool is bigger.
- Workload isolation: GPU instances are preserved for workloads that truly need them (ML training / inference), avoiding contention.
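The pool-size win can be made concrete with a back-of-the-envelope sketch. All figures below are invented for illustration, not Netflix's real capacity or encode rates:

```python
# Hypothetical pools: the GPU pool is faster per instance, the CPU pool
# is far wider. Numbers are made up for the sketch.
gpu_pool = {"instances": 200, "clips_per_hour_each": 10}
cpu_pool = {"instances": 8000, "clips_per_hour_each": 1}

def fleet_throughput(pool):
    """Clips per hour when the whole pool is available to the workload."""
    return pool["instances"] * pool["clips_per_hour_each"]

print(fleet_throughput(gpu_pool))  # 2000 clips/hour
print(fleet_throughput(cpu_pool))  # 8000 clips/hour: the wider pool wins
```

Even with a 10x per-instance speed disadvantage, the wider pool delivers more fleet throughput — which is the quantity the decision optimises.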
Why the traditional model went the other way¶
Traditional on-prem film-processing facilities invested in "beefy computers with large GPUs and high-performance storage arrays to rip through debayering and encoding at breakneck speed." That model's constraint was fixed-hardware investment: once you own the GPU box, you want to saturate it.
Cloud flips the constraint: you don't own the hardware, you're paying per-instance-hour, and the cheaper-and-wider CPU pool dominates economically when the work can be parallelised across many small instances. Netflix's framing:
"Operating within these constraints lets us focus on increasing throughput via parallel encoding rather than focusing on single-instance processing power. We can then target the sweet spot of the cost/performance efficiency curve while still hitting our target turnaround times."
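The sweet-spot framing reduces to simple arithmetic: cost per clip is the instance's hourly price divided by its encode rate, and the instance count needed follows from the turnaround deadline. The instance types, prices, and rates below are hypothetical placeholders, not real cloud pricing:

```python
import math

# (name, dollars_per_hour, clips_per_hour) -- invented for the sketch
candidates = [
    ("small-cpu", 0.10, 1.0),
    ("large-cpu", 0.40, 3.0),
    ("gpu",       2.50, 12.0),
]

clips, deadline_hours = 1000, 2.0

def plan(price_per_hour, clips_per_hour):
    cost_per_clip = price_per_hour / clips_per_hour
    # Instances needed so the whole batch finishes within the deadline.
    instances = math.ceil(clips / (clips_per_hour * deadline_hours))
    return cost_per_clip, instances

for name, price, rate in candidates:
    cpc, n = plan(price, rate)
    print(f"{name}: ${cpc:.3f}/clip, {n} instances to hit the deadline")
```

Under these made-up numbers the cheapest-per-clip option is the small CPU instance, provided the pool is wide enough to supply the required parallelism — exactly the trade the quote describes: hit the turnaround target by fanning out, then sit at the cheap point on the cost/performance curve.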
Enabling preconditions¶
CPU-only media processing works when:
- The tool supports CPU rendering at acceptable fidelity (FLAPI does; so do most mainstream encoders).
- Work can be parallelised per clip or per sub-segment (Netflix dispatches one Stratum Function per clip or sub-segment — see patterns/serverless-function-for-media-processing).
- Turnaround is measured in fleet throughput, not single-job latency, which is the case when thousands of parallel renders swarm a VFX turnover and then release their capacity back to the pool.
- The CPU pool is larger / cheaper / less contested than the GPU pool — mostly true in large cloud footprints, where GPU capacity is a premium tier.
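Under those preconditions, the per-clip fan-out can be sketched in miniature. `render_clip` is a hypothetical placeholder for the real engine call, and a local thread pool stands in for dispatching one serverless function per clip across the fleet:

```python
from concurrent.futures import ThreadPoolExecutor

def render_clip(clip_id: str) -> str:
    """Placeholder for a CPU-only render of one clip. In the real system
    each clip would go to its own function invocation on the fleet."""
    return f"{clip_id}: rendered"

clips = [f"clip-{i:03d}" for i in range(8)]

# Fan out one worker per clip; throughput scales with how many workers
# the pool can supply, not with the speed of any single worker.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(render_clip, clips))

print(results[0])  # clip-000: rendered
```

The key property is that each unit of work is independent, so adding workers adds throughput almost linearly — the precondition that makes the wide CPU pool usable.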
Contrast with GPU-targeted workloads¶
CPU-only isn't a universal ideal — it's a placement decision per workload. On the same Netflix infrastructure:
- ML training / inference → GPUs (see concepts/heterogeneous-ai-accelerator-fleet).
- Per-clip camera-file processing → CPU (this concept).
The discipline is: choose the placement that gives the widest available pool compatible with the workload's fidelity + SLA requirements, not the placement with the highest single-instance peak speed.
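That discipline can be expressed as a toy placement policy. The pool names, widths, and capability sets below are invented for the sketch:

```python
# Hypothetical pools: (relative pool width, capabilities offered).
POOLS = {
    "cpu": (10_000, {"encode", "inspect", "trim"}),
    "gpu": (500, {"encode", "inspect", "trim", "ml-training", "ml-inference"}),
}

def place(required_capability: str) -> str:
    """Pick the widest pool compatible with the workload's requirement."""
    compatible = [
        (width, name)
        for name, (width, caps) in POOLS.items()
        if required_capability in caps
    ]
    return max(compatible)[1]  # widest compatible pool wins

print(place("encode"))       # cpu: both pools qualify, cpu pool is wider
print(place("ml-training"))  # gpu: only the gpu pool qualifies
```

The GPU pool still wins workloads only it can serve; everything else lands on the wider CPU pool, which is the placement pattern the two bullets above describe.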
Seen in¶
- sources/2026-04-24-netflix-scaling-camera-file-processing-at-netflix — Netflix explicitly chooses CPU over GPU for FLAPI-driven media processing in Cosmos because the CPU pool is wider and GPUs are reserved for other workloads; the optimisation target is fleet throughput at the cost/performance sweet spot, not single-instance speed.
Related¶
- System: systems/filmlight-flapi — the dual-capable engine
- Substrate: systems/netflix-cosmos
- Consumer: systems/netflix-media-production-suite
- Contrast: concepts/heterogeneous-ai-accelerator-fleet — GPU-tier placement for workloads that need it
- Pattern: patterns/serverless-function-for-media-processing
- Company: companies/netflix