PATTERN Cited by 1 source
Constrained decoding for business rules
Pattern
At each autoregressive generation step, compute a mask of eligible tokens based on applicable business rules and apply it to output logits before sampling. This guarantees the generated output satisfies structural and product constraints without relying on post-hoc filtering or re-ranking.
Mechanism
- Define business rules as token-eligibility functions: deduplication (no repeated entities), row pinning (a specific row at a fixed position), category consistency (a Comedy row contains only comedy entities)
- At each generation step, evaluate the rules against the generated prefix to compute a binary mask over the vocabulary
- Apply the mask to the logits (set disallowed tokens to -∞) before softmax/argmax
- Sample or take the argmax from the masked distribution, so only rule-compliant tokens can be emitted
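The masking step above can be sketched in a few lines of plain Python. This is a minimal illustration, not the source's implementation: the function names (`apply_mask`, `softmax`, `sample_constrained`) and the list-of-floats logit representation are assumptions made for the example.

```python
import math
import random


def apply_mask(logits, allowed):
    """Set the logit of every disallowed token to -inf.

    `allowed` is a list of booleans, one per vocabulary entry.
    """
    return [x if ok else -math.inf for x, ok in zip(logits, allowed)]


def softmax(logits):
    """Numerically stable softmax; masked (-inf) entries get probability 0."""
    m = max(logits)
    if m == -math.inf:
        # Every token was masked: the rules are unsatisfiable at this step.
        raise ValueError("mask excludes every token")
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_constrained(logits, allowed, rng=random):
    """Sample a token id from the masked distribution."""
    probs = softmax(apply_mask(logits, allowed))
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Token 0 has the highest raw logit but is disallowed, so it is never sampled.
token = sample_constrained([2.0, 1.0, 0.5], [False, True, True])
```

Because masked tokens receive exactly zero probability, compliance does not depend on the model having learned the rules. The sketch raises an error when the mask excludes every token, which is one possible way to surface conflicting rules.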
Why domain tokenization makes this tractable
When each entity/row is a single token, business rules map directly to token-level masks. With text-based tokenization, where an entity name spans several tokens, constrained decoding must track partially generated names and look ahead to the names that can still be completed, which is far more expensive and fragile.
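To make the contrast concrete, the sketch below applies deduplication under both tokenizations. Everything in it is hypothetical: the two entity names, their word-level token splits, the `END` marker, and the helper names are invented for illustration. With multi-token names, the rule needs a prefix trie and cannot simply mask a token, because "Night" is shared by both entities.

```python
END = "<end>"  # hypothetical end-of-entity marker

# Hypothetical catalog: entity name -> its multi-token text encoding.
CATALOG = {
    "Night Shift": ["Night", "Shift"],
    "Night Market": ["Night", "Market"],
}


def build_trie(token_seqs):
    """Build a nested-dict prefix trie over the allowed token sequences."""
    root = {}
    for seq in token_seqs:
        node = root
        for tok in seq:
            node = node.setdefault(tok, {})
        node[END] = {}
    return root


def allowed_next_tokens(trie, partial):
    """Tokens that can extend `partial` toward some allowed entity name."""
    node = trie
    for tok in partial:
        node = node.get(tok)
        if node is None:
            return set()
    return set(node)


def text_mask(partial, used_names):
    """Dedup with text tokens: rebuild the trie without used entities,
    then walk it along the partially generated name."""
    remaining = [toks for name, toks in CATALOG.items() if name not in used_names]
    return allowed_next_tokens(build_trie(remaining), partial)


def domain_mask(entity_ids, used_ids):
    """Dedup with one token per entity: a plain set difference."""
    return set(entity_ids) - set(used_ids)


# After "Night Shift" is used, "Night" must stay allowed for "Night Market".
print(text_mask([], {"Night Shift"}))         # {'Night'}
print(text_mask(["Night"], {"Night Shift"}))  # {'Market'}
print(domain_mask([0, 1], [0]))               # {1}
```

The text-token version has to carry state across steps (the partial name) and depends on how the tokenizer happens to split each name, while the domain-token version depends only on the set of entities already emitted.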
Example (Netflix GenPage)
- Row pinning: Pin "Popular Games" at position 2 → mask every token except "Popular Games" at that position
- Deduplication: After generating entity X, mask token X at all subsequent positions
- Category consistency: Inside a "Korean TV Shows" row, mask all non-Korean-TV entities
(Source: sources/2026-06-29-netflix-genpage-generative-homepage-construction)
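As an illustration only, the three rules above can be combined into one eligibility function over a toy vocabulary. Nothing in this sketch comes from the source beyond the rule names: the sequence layout (one row token followed by a fixed number of entity tokens), the entity names, the `ROW_ELIGIBLE` table, the static scores standing in for model logits, and the choice to reserve a pinned row for its slot are all assumptions.

```python
import math

ENTITIES_PER_ROW = 2  # assumed layout: row token, then this many entity tokens

# Hypothetical rows and the entities eligible for each (category consistency).
ROW_ELIGIBLE = {
    "Trending Now": {"game_a", "kdrama_a", "kdrama_b", "film_a"},
    "Popular Games": {"game_a", "game_b", "game_c"},
    "Korean TV Shows": {"kdrama_a", "kdrama_b", "kdrama_c"},
}
PINNED_ROWS = {2: "Popular Games"}  # 1-based row position -> pinned row token

# Static scores standing in for model logits (hypothetical values).
SCORES = {
    "Popular Games": 5.0, "Trending Now": 3.0, "Korean TV Shows": 2.0,
    "kdrama_a": 4.0, "game_a": 3.5, "kdrama_b": 1.0, "game_b": 0.5,
    "kdrama_c": 0.4, "game_c": 0.3, "film_a": 0.2,
}
VOCAB = list(SCORES)


def eligible_tokens(prefix):
    """Return the set of tokens the rules allow after `prefix`."""
    block = 1 + ENTITIES_PER_ROW
    slot = len(prefix) % block
    used = set(prefix)  # deduplication: nothing already emitted may repeat
    if slot == 0:  # the next token is a row token
        row_position = len(prefix) // block + 1
        if row_position in PINNED_ROWS:  # row pinning
            return {PINNED_ROWS[row_position]}
        # Assumption: a pinned row is reserved for its pinned slot.
        return set(ROW_ELIGIBLE) - used - set(PINNED_ROWS.values())
    current_row = prefix[len(prefix) - slot]
    return ROW_ELIGIBLE[current_row] - used  # category consistency + dedup


def greedy_decode(num_rows):
    """Greedy decoding: argmax over scores after masking to -inf."""
    prefix = []
    for _ in range(num_rows * (1 + ENTITIES_PER_ROW)):
        allowed = eligible_tokens(prefix)
        if not allowed:
            raise ValueError("no eligible token; rules conflict with catalog")
        masked = {t: SCORES[t] if t in allowed else -math.inf for t in VOCAB}
        prefix.append(max(masked, key=masked.get))
    return prefix


page = greedy_decode(num_rows=3)
print(page)
```

With these toy scores, unconstrained greedy decoding would open with "Popular Games", the highest-scoring token. The masked decode instead produces `['Trending Now', 'kdrama_a', 'game_a', 'Popular Games', 'game_b', 'game_c', 'Korean TV Shows', 'kdrama_b', 'kdrama_c']`: the pinned row lands in the second row slot, no token repeats, and each row holds only entities eligible for it.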