CONCEPT Cited by 1 source

Denoise → encode → synthesize

Definition

Denoise → encode → synthesize is the three-stage encoding-pipeline shape used by AV1 Film Grain Synthesis. Rather than passing the source video directly into the codec, the pipeline:

Denoises the source video to strip grain.
Encodes the clean low-entropy result with the standard AV1 tools.
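Estimates grain parameters (AR coefficients + piecewise-linear scaling function) from the residual between the source and the denoised video, and transmits them as a side channel with the compressed video. The decoder synthesizes grain from these parameters and re-applies it to the decoded frames.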

(Source: sources/2025-07-03-netflix-av1scale-film-grain-synthesis-the-awakening)
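The three stages can be sketched end to end in a few lines. This is a toy illustration only, not the AV1 tool chain: the moving-average `denoise`, the rounding-only `encode`, and the single-parameter grain model (`sigma`) are all hypothetical stand-ins chosen to keep the data flow visible.

```python
import random

def denoise(frame, radius=2):
    """Toy denoiser: horizontal moving average (a stand-in for any real denoiser)."""
    out = []
    for row in frame:
        n = len(row)
        out.append([
            sum(row[max(0, x - radius):min(n, x + radius + 1)])
            / (min(n, x + radius + 1) - max(0, x - radius))
            for x in range(n)
        ])
    return out

def encode(denoised):
    """Placeholder for the AV1 encode of the clean signal: it only rounds samples."""
    return [[round(v) for v in row] for row in denoised]

def estimate_grain_params(source, denoised):
    """Toy estimator: one grain standard deviation instead of AR + scaling parameters."""
    diff = [s - d for srow, drow in zip(source, denoised) for s, d in zip(srow, drow)]
    mean = sum(diff) / len(diff)
    return {"sigma": (sum((v - mean) ** 2 for v in diff) / len(diff)) ** 0.5}

def decode_and_synthesize(coded, params, seed=0):
    """Decoder side: add freshly generated grain to the decoded clean frame."""
    rng = random.Random(seed)
    return [[v + rng.gauss(0.0, params["sigma"]) for v in row] for row in coded]

# Demo: a flat grey frame carrying synthetic "grain" with standard deviation 4.
_rng = random.Random(1)
source = [[100 + _rng.gauss(0.0, 4.0) for _ in range(64)] for _ in range(16)]
denoised = denoise(source)                        # stage 1
coded = encode(denoised)                          # stage 2 (main bitstream, placeholder)
params = estimate_grain_params(source, denoised)  # stage 3 (side channel)
output = decode_and_synthesize(coded, params)     # decoder re-applies grain
print(f"estimated grain sigma: {params['sigma']:.2f}")
```

Note that only `coded` and `params` cross the channel; the original grain samples never do. In this toy the estimated `sigma` tends to come out below the true value because the weak denoiser leaves part of the grain in the "clean" signal, which hints at why denoiser quality matters.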

This decomposition is architecturally significant because two of the three stages are not specified by the AV1 standard:

The denoiser is left fully to the encoder vendor — "the standard does not mandate a specific method for this step, allowing users to choose their preferred denoiser."
Grain parameter estimation is left to the encoder vendor — any method that produces valid AR coefficients + scaling function parameters is acceptable.

The standard only pins down the parameter format and the decoder-side synthesis procedure: generate a 64×64 noise template, tile the frame with 32×32 patches taken from it, scale them by the scaling function, and add them to the decoded frame.
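The decoder-side procedure (template, patches, scale, add) can be sketched as follows. This is a simplified floating-point illustration under stated assumptions, not the normative process: the two-tap AR filter, the pseudo-random patch offsets drawn from Python's `random`, and the example scaling points are assumptions made here, and the real decoder's integer arithmetic, specified random generator, and chroma handling are not reproduced.

```python
import random

TEMPLATE, PATCH = 64, 32  # sizes quoted in the text above

def make_template(ar_coeffs, seed, size=TEMPLATE):
    """White Gaussian noise shaped by a causal AR filter.
    ar_coeffs maps (dy, dx) offsets of already-generated neighbours to weights."""
    rng = random.Random(seed)
    t = [[0.0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            v = rng.gauss(0.0, 1.0)
            for (dy, dx), c in ar_coeffs.items():
                yy, xx = y + dy, x + dx
                if 0 <= yy < size and 0 <= xx < size:
                    v += c * t[yy][xx]
            t[y][x] = v
    return t

def scaling(value, points):
    """Piecewise-linear interpolation through (intensity, strength) points,
    which must be sorted by strictly increasing intensity."""
    if value <= points[0][0]:
        return points[0][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if value <= x1:
            return y0 + (y1 - y0) * (value - x0) / (x1 - x0)
    return points[-1][1]

def add_grain(frame, template, points, seed):
    """Cover the frame with PATCH x PATCH blocks, each filled from a
    pseudo-randomly positioned window of the template, scaled by local intensity."""
    rng = random.Random(seed)
    h, w = len(frame), len(frame[0])
    out = [row[:] for row in frame]
    for by in range(0, h, PATCH):
        for bx in range(0, w, PATCH):
            oy = rng.randrange(TEMPLATE - PATCH + 1)
            ox = rng.randrange(TEMPLATE - PATCH + 1)
            for y in range(by, min(by + PATCH, h)):
                for x in range(bx, min(bx + PATCH, w)):
                    g = template[oy + y - by][ox + x - bx]
                    v = frame[y][x] + scaling(frame[y][x], points) * g
                    out[y][x] = min(255, max(0, round(v)))  # clip to 8-bit range
    return out

# Demo with made-up parameters: grain is strongest in the mid-tones.
template = make_template({(0, -1): 0.3, (-1, 0): 0.3}, seed=7)
points = [(0, 0.0), (128, 4.0), (255, 2.0)]
frame = [[128] * 96 for _ in range(64)]
grainy = add_grain(frame, template, points, seed=42)
```

Because both the template and the patch offsets are derived from seeds, the same parameters reproduce the same grain, which is what lets a few numbers in the bitstream stand in for a full-resolution noise field.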

Why the decomposition matters

Block-transform codecs (AV1, HEVC, VP9, H.264) compress by exploiting spatial and temporal correlation. Grain is adversarial input: it has no exploitable correlation, so it eats bits out of proportion to its perceptual importance. The denoise stage converts an adversarial input into a codec-friendly input. Effect on each stage:

Aspect	Without FGS	With FGS (denoise → encode → synthesize)
Encoder input entropy	high (grain noise)	low (denoised signal)
Codec work	fights grain	compresses a clean signal efficiently
Output size	large (grain texture)	small (clean signal + KBs of grain params)
Decode result	decoded original	decoded clean + reconstructed grain

The denoiser pays for itself many times over because everything downstream is compressing a cleaner signal.
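The cost of grain can be illustrated with a crude proxy. The sketch below compresses a smooth synthetic frame with and without added noise using `zlib`; a general-purpose compressor is not a video codec, so the sizes only indicate the direction of the effect, and the frame content and noise level are arbitrary assumptions.

```python
import random
import zlib

def gradient_frame(w=128, h=128):
    """Smooth, highly correlated content: a diagonal ramp of 8-bit samples."""
    return [[(x + y) // 2 for x in range(w)] for y in range(h)]

def add_noise(frame, sigma, seed=0):
    """Add uncorrelated Gaussian noise as a stand-in for film grain."""
    rng = random.Random(seed)
    return [[min(255, max(0, round(v + rng.gauss(0.0, sigma)))) for v in row]
            for row in frame]

def compressed_size(frame):
    """Bytes needed by zlib at maximum compression level."""
    raw = bytes(v for row in frame for v in row)
    return len(zlib.compress(raw, 9))

clean = gradient_frame()
grainy = add_noise(clean, sigma=4.0)
clean_size = compressed_size(clean)
grainy_size = compressed_size(grainy)
print(f"clean: {clean_size} bytes, grainy: {grainy_size} bytes")
```

The noisy frame compresses far worse even though the noise amplitude is small relative to the signal range, which is the imbalance the denoise stage removes.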

Where vendor investment lands

Because the AV1 standard does not specify the denoiser, the denoiser is where vendor competition happens. Denoiser quality matters most on grain-heavy content, and a bad denoiser undermines the quality promise of FGS: denoise too aggressively and picture detail is lost; denoise too weakly and residual grain is left for the codec to struggle with. Netflix's 2021→2025 AV1-FGS timeline (see patterns/codec-feature-gradual-rollout) is plausibly dominated by denoiser development; the blog post explicitly flags the denoiser choice but does not reveal what Netflix uses.

The parameter-estimation stage is a similar per-vendor investment: AR coefficient fitting, scaling-function breakpoint placement, temporal stability across GOPs, and colour-channel-specific tuning. All of these are invisible from the decoder side, but all of them shape the perceptual result.
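As a rough illustration of what parameter estimation involves, the sketch below fits a two-tap AR model by least squares and derives scaling points from the grain strength per intensity bin. Both estimators are minimal assumptions made for illustration (real encoders use larger AR neighbourhoods and their own breakpoint logic), and the names `fit_ar2` and `fit_scaling` are hypothetical.

```python
import random

def fit_ar2(grain):
    """Least-squares fit of g[y][x] ~ a*g[y][x-1] + b*g[y-1][x]
    by solving the 2x2 normal equations directly."""
    sll = stt = slt = slg = stg = 0.0
    for y in range(1, len(grain)):
        for x in range(1, len(grain[0])):
            left, top, cur = grain[y][x - 1], grain[y - 1][x], grain[y][x]
            sll += left * left
            stt += top * top
            slt += left * top
            slg += left * cur
            stg += top * cur
    det = sll * stt - slt * slt
    return (slg * stt - slt * stg) / det, (sll * stg - slt * slg) / det

def fit_scaling(denoised, grain, n_bins=4, max_val=256):
    """RMS grain strength per intensity bin -> (bin centre, strength) points.
    Assumes zero-mean grain; empty bins are skipped."""
    acc = [[0, 0.0] for _ in range(n_bins)]  # [sample count, sum of squares]
    width = max_val / n_bins
    for drow, grow in zip(denoised, grain):
        for d, g in zip(drow, grow):
            b = min(n_bins - 1, int(d // width))
            acc[b][0] += 1
            acc[b][1] += g * g
    return [((i + 0.5) * width, (sq / n) ** 0.5)
            for i, (n, sq) in enumerate(acc) if n]

# Demo 1: recover known AR weights (0.4 left, 0.3 top) from synthetic grain.
rng = random.Random(3)
size = 96
grain = [[0.0] * size for _ in range(size)]
for y in range(size):
    for x in range(size):
        left = grain[y][x - 1] if x else 0.0
        top = grain[y - 1][x] if y else 0.0
        grain[y][x] = 0.4 * left + 0.3 * top + rng.gauss(0.0, 1.0)
a, b = fit_ar2(grain)

# Demo 2: grain whose strength grows with brightness on a horizontal ramp.
denoised = [[2 * x for x in range(128)] for _ in range(64)]
noise = [[rng.gauss(0.0, 1.0 + d / 64) for d in row] for row in denoised]
points = fit_scaling(denoised, noise)
print(f"a={a:.2f} b={b:.2f} points={[(c, round(s, 2)) for c, s in points]}")
```

The fixed, evenly spaced bins are the simplest possible choice; where the breakpoints go and how estimates are stabilised over time are exactly the degrees of freedom the text describes as vendor-specific.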

Evaluation challenge

Full-reference visual-quality metrics (VMAF, PSNR, SSIM; see concepts/visual-quality-metric) compare each decoded frame against the corresponding source frame. FGS output is not sample-wise identical to the source: the reconstructed grain is a new, statistically similar noise instance, not a copy of the original grain. Such metrics can therefore score FGS output as heavily distorted even when human viewers cannot tell the difference. This is why the Netflix post frames the benefit as "preserving the artistic integrity of film grain" rather than as higher VMAF.
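The effect is easy to reproduce with PSNR on synthetic data. In the sketch below (an idealised setup with a flat frame and independent Gaussian grain, both assumptions), a frame re-grained with a different noise instance scores worse against the source than the frame with no grain at all, even though its grain statistics match the source.

```python
import math
import random

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    n = len(a) * len(a[0])
    mse = sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n
    return float("inf") if mse == 0 else 10 * math.log10(peak * peak / mse)

def with_grain(clean, sigma, seed):
    """Add an independent Gaussian grain instance to a clean frame."""
    rng = random.Random(seed)
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in clean]

def std(frame):
    """Standard deviation over all samples of a frame."""
    vals = [v for row in frame for v in row]
    mean = sum(vals) / len(vals)
    return (sum((v - mean) ** 2 for v in vals) / len(vals)) ** 0.5

clean = [[128.0] * 64 for _ in range(64)]
source = with_grain(clean, sigma=6.0, seed=1)     # original grain
regrained = with_grain(clean, sigma=6.0, seed=2)  # same statistics, new instance

psnr_clean = psnr(source, clean)          # grain simply removed
psnr_regrained = psnr(source, regrained)  # grain re-synthesized
print(f"no grain: {psnr_clean:.1f} dB, re-grained: {psnr_regrained:.1f} dB")
print(f"grain std, source vs re-grained: {std(source):.2f} vs {std(regrained):.2f}")
```

With two independent zero-mean noise instances of equal variance, the mean squared error doubles relative to having no grain, which costs about 3 dB (10·log10 2). A sample-wise metric therefore prefers the grainless frame, the opposite of the perceptual goal.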

Practical evaluation therefore relies on:

Side-by-side perceptual comparisons with human viewers.
Denoised-signal metrics: compare the decoded video before grain synthesis against the denoised source, so that grain is excluded from both sides of the comparison.
No-reference grain-quality models: score the reconstructed grain on its own characteristics (spectral density, intensity vs brightness) without reference to the original grain samples.

The post does not specify which of these Netflix uses internally.
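One way to picture a statistics-based grain check is to compare a few descriptors of two grain fields instead of their samples. The descriptors below (standard deviation and horizontal lag-1 autocorrelation) and the AR(1) grain generator are assumptions chosen for illustration; they are not a published grain-quality model.

```python
import random

def grain_field(a, sigma, seed, size=64):
    """Synthetic grain: each sample is a * left neighbour + Gaussian noise."""
    rng = random.Random(seed)
    g = [[0.0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            prev = g[y][x - 1] if x else 0.0
            g[y][x] = a * prev + rng.gauss(0.0, sigma)
    return g

def descriptors(g):
    """Instance-independent summary: overall strength and horizontal correlation."""
    vals = [v for row in g for v in row]
    mean = sum(vals) / len(vals)
    var = sum((v - mean) ** 2 for v in vals) / len(vals)
    pairs = len(g) * (len(g[0]) - 1)
    cov = sum((row[x] - mean) * (row[x - 1] - mean)
              for row in g for x in range(1, len(row))) / pairs
    return {"std": var ** 0.5, "lag1": cov / var}

original = descriptors(grain_field(0.5, 2.0, seed=1))
resynth = descriptors(grain_field(0.5, 2.0, seed=2))  # same model, new instance
white = descriptors(grain_field(0.0, 2.0, seed=3))    # wrong texture: uncorrelated
print(original, resynth, white)
```

The two instances of the same model agree closely on both descriptors despite sharing no samples, while the uncorrelated field is separated by its lag-1 value, which is the kind of distinction a sample-wise metric cannot make.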

Generalisation

The pattern — strip a statistically-describable high-entropy component, encode the remainder efficiently, re-synthesize the component from parameters — generalises beyond grain:

Texture synthesis in video games / generative media (library of textures + per-region parameters).
Procedural audio (transmit envelopes + noise parameters rather than waveforms).
Neural residual coding (learned generators reconstructing non-essential high-entropy components).

AV1 FGS is the canonical wiki instance of this decomposition in a production streaming codec. See patterns/decoder-side-synthesis-for-compression for the architectural pattern framed independent of codec.

Seen in

sources/2025-07-03-netflix-av1scale-film-grain-synthesis-the-awakening — canonical wiki source.