Netflix GenPage
GenPage is Netflix's end-to-end generative homepage recommendation system that replaces the traditional multi-stage recommender pipeline (separate candidate generation + row-level ranking + entity-level ranking) with a single decoder-only transformer that autoregressively generates the entire homepage layout.
Architecture
- Model type: Decoder-only transformer (120M–900M parameters), untied input/output embedding weights
- Input (prompt): User engagement history, profile attributes, request context — all encoded as domain-specific tokens
- Output (response): Structured homepage as a sequence of row and entity tokens in layout order (left-to-right, top-to-bottom)
- Training pipeline: Pretrain (next-token prediction on positive-feedback pages) → Post-train (WBC or RL via Dr. GRPO)
- Inference: Autoregressive generation + hybrid row decoding + constrained decoding for business rules
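The inference bullet combines autoregressive generation with constrained decoding. A minimal sketch of that combination, assuming a greedy decoder and a hard logit mask: `constrained_greedy_decode`, `toy_logits`, and the `no_duplicates` rule are hypothetical names invented for illustration, and the toy scoring function stands in for the transformer. This is not GenPage's actual implementation, only one common way to enforce rules at the token level.

```python
import math
from typing import Callable, List, Sequence


def constrained_greedy_decode(
    next_token_logits: Callable[[Sequence[int]], List[float]],
    prompt: Sequence[int],
    num_steps: int,
    is_allowed: Callable[[Sequence[int], int], bool],
) -> List[int]:
    """Greedy autoregressive decoding with a hard mask over disallowed tokens."""
    sequence = list(prompt)
    generated: List[int] = []
    for _ in range(num_steps):
        logits = next_token_logits(sequence)
        # Tokens that would violate a rule get a logit of -inf,
        # so they can never be selected.
        masked = [
            logit if is_allowed(generated, token) else -math.inf
            for token, logit in enumerate(logits)
        ]
        best = max(range(len(masked)), key=masked.__getitem__)
        if masked[best] == -math.inf:
            break  # no legal token remains
        sequence.append(best)
        generated.append(best)
    return generated


# Toy stand-in for the model (assumption): fixed preferences over 4 tokens.
def toy_logits(sequence: Sequence[int]) -> List[float]:
    return [0.1, 2.0, 1.5, 0.7]


# Hypothetical business rule: an entity may appear at most once on the page.
def no_duplicates(generated: Sequence[int], token: int) -> bool:
    return token not in generated


page = constrained_greedy_decode(
    toy_logits, prompt=[0], num_steps=3, is_allowed=no_duplicates
)
print(page)  # [1, 2, 3]
```

Without the mask, the toy model would emit token 1 at every step; the mask forces the decoder down to its next-best legal choice each time. Because each entity or row is a single token, a rule like this reduces to a per-token lookup.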
Key Design Decisions
- Custom tokenization — Domain-specific tokens (one per entity/row) vs. text tokenizer. Reduces sequence length ~4× and enables direct token-level business rule enforcement.
- Hybrid row decoding — Autoregressively decode only the first few (highest-attention) positions per row; score remaining entities in a single forward pass.
- Multi-cadence training — Periodic broad retraining + daily incremental updates to maintain freshness without catastrophic forgetting.
- Semantic embedding fusion — Entity representations fuse learned ID embeddings with content-based embeddings for day-zero cold-start support.
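Hybrid row decoding, described in the list above, can be sketched as a two-phase loop: one forward pass per position for the first few slots, then a single pass that scores and sorts everything left. The function `hybrid_decode_row`, the `toy_scores` model, and the parity-based diversity penalty are all assumptions made up for this example; the real system's scoring and cut-off are not specified here.

```python
from typing import Callable, Dict, List, Sequence

Scores = Dict[int, float]


def hybrid_decode_row(
    score_candidates: Callable[[Sequence[int]], Scores],
    context: Sequence[int],
    candidates: Sequence[int],
    row_length: int,
    num_autoregressive: int,
) -> List[int]:
    """Fill a row: autoregressive for the head, one-shot scoring for the tail."""
    row: List[int] = []
    remaining = list(candidates)

    # Phase 1: one forward pass per position, each conditioned on prior picks.
    for _ in range(min(num_autoregressive, row_length, len(remaining))):
        scores = score_candidates(list(context) + row)
        best = max(remaining, key=lambda entity: scores[entity])
        row.append(best)
        remaining.remove(best)

    # Phase 2: a single forward pass ranks every remaining candidate at once.
    slots_left = row_length - len(row)
    if slots_left > 0 and remaining:
        scores = score_candidates(list(context) + row)
        tail = sorted(remaining, key=lambda entity: scores[entity], reverse=True)
        row.extend(tail[:slots_left])
    return row


# Toy model (assumption): base appeal per entity, minus a penalty when the
# entity shares parity with the previous token, to make context matter.
BASE = {10: 0.9, 11: 0.8, 12: 0.7, 13: 0.6, 14: 0.5}
forward_passes: List[int] = []


def toy_scores(sequence: Sequence[int]) -> Scores:
    forward_passes.append(len(sequence))  # record each model call
    last = sequence[-1]
    return {e: s - (0.5 if e % 2 == last % 2 else 0.0) for e, s in BASE.items()}


row = hybrid_decode_row(
    toy_scores, context=[1], candidates=list(BASE), row_length=5,
    num_autoregressive=2,
)
print(row)                  # [10, 11, 12, 14, 13]
print(len(forward_passes))  # 3
```

In this toy run the five-slot row costs three model calls instead of the five that fully autoregressive decoding would need. The trade-off is that tail positions are not conditioned on each other.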
Production Results
- 20% latency reduction vs. multi-stage production baseline (by eliminating feature computation and multiple ranking stages)
- Statistically significant engagement gains (p < 0.001) in 14-day A/B test
- Enriching the input context yields ~5× more improvement than a comparable increase in model capacity