CONCEPT Cited by 1 source
Autoregressive generation¶
Definition¶
Autoregressive generation is a sequence-modeling approach where a model produces output one token at a time, conditioning each new token on all previously generated tokens plus the input context. This is the standard decoding paradigm for decoder-only transformers (GPT-family LLMs) and is increasingly applied beyond text to recommendation, image, and structured-output domains.
Key properties: - Each token sees full left-context (causal masking) - Enables sequential decision-making amenable to reinforcement learning - Latency scales linearly with output sequence length (motivating techniques like hybrid decoding) - Supports constrained decoding by masking logits at each step
In recommendation systems¶
Netflix's GenPage applies autoregressive generation to homepage construction: the model generates rows and entities one token at a time, treating the user context as a prompt and the full page layout as the response. This enables whole-page optimization via RL and constrained decoding for business-rule enforcement (Source: sources/2026-06-29-netflix-genpage-generative-homepage-construction).