SYSTEM Cited by 1 source

TIGER (Generative Retrieval)

Definition

TIGER (Transformer Index for GEnerative Recommenders) is a generative retrieval architecture for recommendation systems introduced by Google DeepMind at NeurIPS 2023. TIGER replaces the canonical recsys pattern of "score every item against the request" with a Transformer decoder that generates the semantic tokens of the next relevant item token-by-token via beam search.

Quote (from Source: sources/2026-06-02-instacart-from-scoring-to-spelling-rebuilding-ads-retrieval-at-instacart):

"Our new approach is inspired by TIGER (Google DeepMind), a method that demonstrates a model's ability to generate the semantic tokens of the next relevant item, rather than merely scoring a predetermined set of candidates."

Two structural ideas

TIGER's contribution is a structural one — bringing two ideas together that prior recsys retrieval kept separate:

  1. Semantic IDs from RQ-VAE (concepts/semantic-id) — items are encoded as short codeword sequences using a Residual Quantized VAE, with semantically similar items sharing prefixes. This compresses the vocabulary from catalog-size to codebook-size.
  2. Generative retrieval via autoregressive decoding (concepts/generative-retrieval) — recommendation becomes "spelling out" the next item's SID one codeword at a time. The prefix-sharing property of SIDs means autoregressive prefix conditioning enforces hierarchical retrieval discipline (once the first codeword is chosen, the search is constrained to the semantic neighbourhood that codeword represents).
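The prefix-sharing property in the first idea can be illustrated with a minimal sketch of residual quantization. This is not a trained RQ-VAE: the codebooks, item names, and 2-D embeddings are hand-picked assumptions, and only the codeword-assignment step is shown.

```python
# Toy residual quantization. NOT a trained RQ-VAE: codebooks, item names and
# embeddings are illustrative assumptions chosen by hand.

def nearest(codebook, vec):
    """Index of the codebook entry closest to vec (squared Euclidean distance)."""
    def dist(entry):
        return sum((a - b) ** 2 for a, b in zip(entry, vec))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))

def semantic_id(embedding, codebooks):
    """Quantize level by level; each level encodes the residual left by the previous one."""
    residual = list(embedding)
    sid = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        sid.append(idx)
        # Subtract the chosen codeword so the next level refines what is left.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tuple(sid)

CODEBOOKS = [
    [(0.0, 0.0), (10.0, 10.0)],  # level 1: coarse region of embedding space
    [(-1.0, 0.0), (1.0, 0.0)],   # level 2: refinement of the residual
]

ITEMS = {
    "bananas": (0.9, 0.1),
    "plantains": (-1.1, 0.2),
    "dish_soap": (9.1, 10.0),
}

SIDS = {name: semantic_id(vec, CODEBOOKS) for name, vec in ITEMS.items()}
```

In this toy setup the two nearby items receive the same first codeword and differ only in the second, while the distant item gets a different first codeword. That is the property the autoregressive decoder relies on when it conditions on a prefix.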

Why both ideas are needed together

Either idea alone is insufficient:

  • RQ-VAE alone (without generative retrieval) gives you a compressed vocabulary but you still have to score it — and you still face the "flat probability distribution leaks across semantic neighbourhoods" problem.
  • Generative retrieval alone (over atomic product IDs as tokens) gives you autoregressive conditioning but you face the vocabulary bottleneck — catalog-size token vocabulary makes the model expensive and the embedding table sparse.
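A rough sense of the vocabulary bottleneck comes from simple arithmetic. The catalog size, number of levels, codebook size, and embedding width below are illustrative assumptions, not figures from the Source or the TIGER paper, and the sketch assumes each level has its own codebook.

```python
# Illustrative numbers only; none of these come from the Source.
catalog_size = 1_000_000   # assumed number of items
levels = 3                 # assumed codewords per semantic ID
codebook_size = 256        # assumed entries per level
embedding_dim = 128        # assumed token-embedding width

atomic_vocab = catalog_size              # one token per item
sid_vocab = levels * codebook_size       # one token per (level, codeword) pair
addressable = codebook_size ** levels    # distinct codeword sequences

atomic_table_params = atomic_vocab * embedding_dim
sid_table_params = sid_vocab * embedding_dim

print(atomic_vocab, sid_vocab, addressable)
print(atomic_table_params, sid_table_params)
```

Under these assumptions the token vocabulary drops from 1,000,000 entries to 768, while the 768 tokens can still spell out more than 16 million distinct sequences.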

TIGER's contribution is recognising that both ideas compose to fix each other's downsides: RQ-VAE compresses the token space, and generative retrieval over the compressed token space exploits the prefix-sharing property to enforce semantic coherence.
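How the two ideas compose can be sketched as beam search over semantic ID prefixes. Two things here are assumptions rather than anything the Source states: the fixed probability table `TOY_PROBS` is a hypothetical stand-in for the Transformer decoder, and the trie that restricts expansion to SIDs present in the catalog is one possible way to keep generated sequences valid.

```python
import math

def build_trie(sids):
    """Nested dicts: each path from the root spells one valid semantic ID."""
    root = {}
    for sid in sids:
        node = root
        for code in sid:
            node = node.setdefault(code, {})
    return root

def beam_search(trie, log_prob, depth, beam_width):
    """Keep the beam_width best prefixes per step, expanding only along trie edges."""
    beams = [((), 0.0, trie)]  # (prefix, cumulative log-prob, trie node)
    for _ in range(depth):
        candidates = []
        for prefix, score, node in beams:
            for code, child in node.items():
                new_score = score + log_prob(prefix, code)
                candidates.append((prefix + (code,), new_score, child))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return [(prefix, score) for prefix, score, _ in beams]

# Hypothetical stand-in for the decoder: next-codeword probabilities per prefix.
TOY_PROBS = {
    (): {0: 0.7, 1: 0.3},
    (0,): {0: 0.4, 1: 0.6},
    (1,): {0: 0.9, 1: 0.1},
}

def toy_log_prob(prefix, code):
    return math.log(TOY_PROBS[prefix][code])

VALID_SIDS = [(0, 0), (0, 1), (1, 0)]  # assumed catalog of three items
RESULTS = beam_search(build_trie(VALID_SIDS), toy_log_prob, depth=2, beam_width=2)
```

With these toy probabilities both surviving beams start with codeword 0, so the final candidates come from the same semantic neighbourhood. The sequence (1, 1) is never proposed because no item in the assumed catalog has that SID.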

Production references

The Source identifies a small set of production deployments:

  • Spotify: GLIDE and NEO — generative retrieval at Spotify's scale.
  • YouTube: PLUM, "a multilingual model that reasons across both SID tokens and natural language"; the post identifies PLUM as the natural extension direction for Instacart's own work ("reason across both vocabularies").
  • Google DeepMind: ActionPiece — extends the substrate from item tokenisation to user-action tokenisation, "hinting at a future where the same architecture could power what we show users next: a reorder nudge, a recipe suggestion, or a discovery carousel".
  • Instacart: Instacart Generative Ads Retrieval — adapted to grocery's distinctive "shopping list spans fresh food to cleaning supplies and pet care all within a single session" shape.

Caveats

  • This is a stub page capturing TIGER as a referenced paper. The primary disclosure on the wiki is via Instacart's adaptation (systems/instacart-generative-ads-retrieval), which is itself a derivative implementation. The original TIGER paper has not been ingested separately on the wiki.
  • The original paper covers public benchmark datasets (Beauty, Sports, Toys); this page does not reproduce that benchmark detail.
  • Companion implementations (Spotify GLIDE/NEO, YouTube PLUM, ActionPiece) are referenced but not separately ingested as wiki systems.
