
PATTERN Cited by 1 source

Fallback token for vocabulary evolution

Pattern

Define type-specific fallback tokens (e.g., [Entity_Fallback_Token], [Row_Fallback_Token]) that stand in for new vocabulary items not yet in the model's learned embedding table. During training, randomly replace a fraction of known tokens with their type's fallback, teaching the model to handle unknown tokens gracefully.
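The training-time replacement described above can be sketched as follows. This is a minimal illustration, not the source's implementation: the function name `mask_with_fallback`, the `(token, type)` pair representation, the type keys, and the value of `p` are all assumptions; only the two fallback token names come from the text.

```python
import random

# Hypothetical mapping from entity type to its fallback token. The token
# names follow the examples in the text; the type keys are assumptions.
FALLBACK_BY_TYPE = {
    "entity": "[Entity_Fallback_Token]",
    "row": "[Row_Fallback_Token]",
}


def mask_with_fallback(tokens, p, rng):
    """Return a copy of `tokens` where each token is swapped for its
    type-specific fallback with probability `p`.

    `tokens` is a list of (token, token_type) pairs.
    """
    out = []
    for token, token_type in tokens:
        if rng.random() < p:
            # Replace the known token with the fallback for its own type.
            out.append((FALLBACK_BY_TYPE[token_type], token_type))
        else:
            out.append((token, token_type))
    return out


# Illustrative input; the identifiers are made up for this example.
sequence = [
    ("movie_123", "entity"),
    ("row_trending", "row"),
    ("show_456", "entity"),
]
masked = mask_with_fallback(sequence, p=0.1, rng=random.Random(0))
```

Because the replacement keeps the token's type, the model sees the fallback in the same positions and roles where a genuinely new item of that type would later appear.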

Mechanism

  1. Define one fallback token per entity type (movies, shows, games, rows, etc.)
  2. During training: with probability p, replace a known token with its type-specific fallback
  3. New tokens added between full retraining cycles inherit the fallback token's embedding as initialization
  4. At inference: an unseen token is represented by its type's fallback embedding combined with a content-based embedding (via semantic embedding fusion)
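Steps 3 and 4 can be sketched with a small in-memory embedding table. This is an illustration under stated assumptions: the class `EmbeddingTable` and its methods are hypothetical, embeddings are plain Python lists, and the fusion is modeled as an element-wise sum because the source does not specify how semantic embedding fusion combines the two vectors. The source also does not say whether known tokens are fused with content embeddings; this sketch fuses only for unseen tokens.

```python
class EmbeddingTable:
    """Toy embedding table with one fallback vector per entity type."""

    def __init__(self, fallback_by_type, dim):
        self.dim = dim
        self.fallback_by_type = dict(fallback_by_type)
        # Fallback vectors start at zero here; in practice they would be
        # learned during training via the random replacement step.
        self.vectors = {tok: [0.0] * dim for tok in self.fallback_by_type.values()}

    def add_new_token(self, token, token_type):
        """Step 3: initialize a new token from its type's fallback embedding."""
        fallback = self.fallback_by_type[token_type]
        # Copy so later updates to the new token do not alter the fallback.
        self.vectors[token] = list(self.vectors[fallback])

    def lookup(self, token, token_type, content_vec=None):
        """Step 4: unseen tokens get fallback + content embedding.

        Element-wise addition is an assumed stand-in for the fusion step.
        """
        if token in self.vectors:
            return list(self.vectors[token])
        base = self.vectors[self.fallback_by_type[token_type]]
        if content_vec is None:
            return list(base)
        return [b + c for b, c in zip(base, content_vec)]


table = EmbeddingTable({"entity": "[Entity_Fallback_Token]"}, dim=3)
# Pretend training produced this fallback embedding (made-up values).
table.vectors["[Entity_Fallback_Token]"] = [1.0, 2.0, 3.0]

# A token added between retraining cycles inherits the fallback embedding.
table.add_new_token("movie_new", "entity")

# A token never registered at all falls back at lookup time.
fused = table.lookup("movie_unseen", "entity", content_vec=[0.5, 0.5, 0.5])
```

Copying the fallback vector rather than sharing it means a newly added token can diverge from the fallback once it starts receiving its own updates.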

Why it works

Because the model is trained to produce useful outputs even when some inputs are fallback tokens, the system degrades gracefully when the vocabulary evolves faster than the training cycle. This is essential in domains where the catalog gains new items daily, such as streaming platforms and marketplaces.

(Source: sources/2026-06-29-netflix-genpage-generative-homepage-construction)

Seen in

Last updated · 560 distilled / 1,653 read