
T5

Definition

T5 ("Text-to-Text Transfer Transformer") is Google's 2019 encoder-decoder transformer architecture that frames every NLP task (classification, translation, summarisation, extraction) as text-to-text: input text in, output text out. It is positioned as the generalist encoder-decoder analogue to BERT's encoder-only approach.
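The framing can be sketched as plain string construction: each task becomes a prefixed input string, and the model decodes the answer as text. The prefixes below follow the conventions reported in the T5 paper; the model and serving details are omitted.

```python
# Sketch of T5's text-to-text framing: every task becomes
# "prefix + input text" -> "output text". Prefixes follow the
# T5 paper's conventions; no model is invoked here.

def to_text_to_text(task: str, text: str) -> str:
    """Frame an NLP task as a plain-text input for a T5-style model."""
    prefixes = {
        "classification": "cola sentence: ",  # linguistic-acceptability task
        "translation": "translate English to German: ",
        "summarization": "summarize: ",
    }
    return prefixes[task] + text

# The decoder then emits the answer as text, e.g. "acceptable",
# a German sentence, or a summary.
print(to_text_to_text("summarization", "Yelp serves T5 for tail queries."))
```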

Wiki anchor

The wiki's canonical anchor for T5 is its role as a realtime tail-query serving model alongside BERT in Yelp's query-understanding cascade (2025-02-04 post — sources/2025-02-04-yelp-search-query-understanding-with-llms).

Yelp's disclosure verbatim: "at Yelp, we have used BERT and T5 to serve as our real time LLM model. These models are optimized for speed and efficiency." T5 is well matched to tasks where the output is richer than a single label, such as review-highlight phrase expansion, where the output is a list of generated phrases.
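A generation-style output like the phrase expansion above is typically decoded as one delimited string and parsed back into a list. A minimal sketch, assuming a semicolon delimiter (the delimiter and helper name are illustrative, not Yelp's implementation):

```python
def parse_phrases(decoded: str, sep: str = ";") -> list[str]:
    """Split a T5 decode like 'great tacos; fast service' into phrases.

    The separator is an assumption for illustration; any delimiter
    chosen at fine-tuning time would work the same way.
    """
    return [p.strip() for p in decoded.split(sep) if p.strip()]

print(parse_phrases("great tacos; fast service; outdoor seating"))
```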

Why T5 for realtime tail

  • Encoder-decoder architecture — supports generation-style outputs that BERT's encoder-only structure can't directly produce.
  • Smaller variants (T5-small, T5-base) are fast enough for realtime serving with small-GPU or even CPU inference.
  • Fine-tunable on teacher-generated golden datasets via text-to-text framing.
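The last bullet can be sketched concretely: fine-tuning via text-to-text framing means turning each teacher-labeled record into a (source, target) pair of plain strings. The field names and the "expand query:" prefix below are hypothetical illustrations, not Yelp's schema.

```python
def make_t5_example(record: dict) -> tuple[str, str]:
    """Turn a teacher-generated record into a (source, target) text pair
    for T5 fine-tuning.

    Field names ('query', 'teacher_phrases') and the task prefix are
    illustrative assumptions, not a disclosed schema.
    """
    source = "expand query: " + record["query"]
    target = ", ".join(record["teacher_phrases"])
    return source, target

# A teacher (e.g. a larger offline LLM) would have produced the phrases;
# the pair below is what the small T5 variant trains on.
example = make_t5_example(
    {"query": "vegan tacos", "teacher_phrases": ["vegan tacos", "plant-based tacos"]}
)
print(example)
```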

Stub

Minimal anchor. Deeper coverage (T5 vs. Flan-T5 variants, scaling-up behaviour, the C4 pre-training corpus) left for future ingests.

Seen in
