

Netflix GenPage

GenPage is Netflix's end-to-end generative homepage recommendation system that replaces the traditional multi-stage recommender pipeline (separate candidate generation + row-level ranking + entity-level ranking) with a single decoder-only transformer that autoregressively generates the entire homepage layout.

Architecture

  • Model type: Decoder-only transformer (120M–900M parameters), untied input/output embedding weights
  • Input (prompt): User engagement history, profile attributes, request context — all encoded as domain-specific tokens
  • Output (response): Structured homepage as a sequence of row and entity tokens in layout order (left-to-right, top-to-bottom)
  • Training pipeline: Pretrain (next-token prediction on positive-feedback pages) → Post-train (WBC or RL via Dr. GRPO)
  • Inference: Autoregressive generation + hybrid row decoding + constrained decoding for business rules
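
The constrained decoding step in the inference bullet can be illustrated with a minimal sketch. It assumes a business rule can be expressed as a set of allowed token IDs at each step; disallowed tokens are masked to negative infinity before the greedy pick. The names `constrained_greedy_step`, `generate_row`, and `toy_score_fn`, as well as the "no repeated entity" rule, are hypothetical and only stand in for the real model and rules, which the source does not describe in detail.

```python
import math


def constrained_greedy_step(logits, allowed_ids):
    """Return the highest-scoring token ID among those the rules allow."""
    if not allowed_ids:
        raise ValueError("no token satisfies the constraints")
    # Mask every disallowed token so it can never win the argmax.
    masked = [
        score if token_id in allowed_ids else -math.inf
        for token_id, score in enumerate(logits)
    ]
    return max(range(len(masked)), key=masked.__getitem__)


def generate_row(score_fn, vocab_size, row_length):
    """Greedy decode of one row under a hypothetical 'no repeats' rule."""
    generated = []
    for _ in range(row_length):
        logits = score_fn(generated)  # one model call per position
        allowed = set(range(vocab_size)) - set(generated)
        generated.append(constrained_greedy_step(logits, allowed))
    return generated


def toy_score_fn(prefix):
    # Stand-in for the transformer: fixed logits, ignoring the prefix.
    return [0.1, 2.0, 1.5, 0.3]


print(generate_row(toy_score_fn, vocab_size=4, row_length=3))  # [1, 2, 3]
```

Because every entity and row is its own token, a rule of this kind reduces to a mask over the vocabulary, with no need to parse generated text.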

Key Design Decisions

  1. Custom tokenization — Uses domain-specific tokens (one per entity or row) instead of a text tokenizer. This shortens sequences by roughly 4× and lets business rules be enforced directly at the token level.
  2. Hybrid row decoding — Autoregressively decode only the first few (highest-attention) positions per row; score remaining entities in a single forward pass.
  3. Multi-cadence training — Periodic broad retraining + daily incremental updates to maintain freshness without catastrophic forgetting.
  4. Semantic embedding fusion — Entity representations fuse learned ID embeddings with content-based embeddings for day-zero cold-start support.
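
Hybrid row decoding (decision 2) can be sketched as follows. This is an illustration under stated assumptions, not the production implementation: `score_fn` is a hypothetical stand-in for one forward pass of the model, returning a score per candidate entity given the row generated so far, and `CountingScorer` is a toy that only counts how many passes were made.

```python
def hybrid_decode_row(score_fn, candidates, row_length, num_autoregressive):
    """Fill one row: decode leading slots one by one, then rank the rest once.

    score_fn(prefix) returns {entity_id: score} conditioned on the prefix.
    """
    row = []
    remaining = list(candidates)
    # Phase 1: one forward pass per leading position.
    for _ in range(min(num_autoregressive, row_length)):
        if not remaining:
            break
        scores = score_fn(row)
        best = max(remaining, key=lambda entity: scores[entity])
        row.append(best)
        remaining.remove(best)
    # Phase 2: a single forward pass scores all remaining entities.
    if len(row) < row_length and remaining:
        scores = score_fn(row)
        tail = sorted(remaining, key=lambda entity: scores[entity], reverse=True)
        row.extend(tail[: row_length - len(row)])
    return row


class CountingScorer:
    """Toy scorer with fixed scores that records the number of passes."""

    def __init__(self):
        self.calls = 0

    def __call__(self, prefix):
        self.calls += 1
        return {"a": 0.9, "b": 0.8, "c": 0.4, "d": 0.7, "e": 0.1}


scorer = CountingScorer()
row = hybrid_decode_row(scorer, ["a", "b", "c", "d", "e"], 4, 2)
print(row, scorer.calls)  # ['a', 'b', 'd', 'c'] 3
```

In this toy run a four-slot row needs three scoring passes instead of the four that fully autoregressive decoding would use; the saving grows with row length.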

Production Results

  • 20% latency reduction vs. multi-stage production baseline (by eliminating feature computation and multiple ranking stages)
  • Statistically significant engagement gains (p < 0.001) in 14-day A/B test
  • Context enrichment yields ~5× more improvement than equivalent model capacity scaling
