
T5

Definition

T5 ("Text-to-Text Transfer Transformer") is Google's 2019 encoder-decoder transformer architecture that frames every NLP task (classification, translation, summarisation, extraction) as text-to-text: input text in, output text out. It is positioned as the generalist encoder-decoder analogue to BERT's encoder-only approach.
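The framing can be sketched as plain string construction: each task becomes a prefixed input string, and the model decodes the answer as text. The prefixes below follow the conventions reported in the T5 paper; the model and serving details are omitted.

```python
# Sketch of T5's text-to-text framing: every task becomes
# "prefix + input text" -> "output text". Prefixes follow the
# T5 paper's conventions; no model is invoked here.

def to_text_to_text(task: str, text: str) -> str:
    """Frame an NLP task as a plain-text input for a T5-style model."""
    prefixes = {
        "classification": "cola sentence: ",  # linguistic-acceptability task
        "translation": "translate English to German: ",
        "summarization": "summarize: ",
    }
    return prefixes[task] + text

# The decoder then emits the answer as text, e.g. "acceptable",
# a German sentence, or a summary.
print(to_text_to_text("summarization", "Yelp serves T5 for tail queries."))
```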

Wiki anchor

The wiki's canonical anchor for T5 is its role as a realtime tail-query serving model alongside BERT in Yelp's query-understanding cascade (2025-02-04 post — sources/2025-02-04-yelp-search-query-understanding-with-llms).

Yelp's disclosure verbatim: "at Yelp, we have used BERT and T5 to serve as our real time LLM model. These models are optimized for speed and efficiency." T5 is well matched to tasks where the output is richer than a single label, such as review-highlight phrase expansion, where the output is a list of generated phrases.
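A generation-style output like the phrase expansion above is typically decoded as one delimited string and parsed back into a list. A minimal sketch, assuming a semicolon delimiter (the delimiter and helper name are illustrative, not Yelp's implementation):

```python
def parse_phrases(decoded: str, sep: str = ";") -> list[str]:
    """Split a T5 decode like 'great tacos; fast service' into phrases.

    The separator is an assumption for illustration; any delimiter
    chosen at fine-tuning time would work the same way.
    """
    return [p.strip() for p in decoded.split(sep) if p.strip()]

print(parse_phrases("great tacos; fast service; outdoor seating"))
```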

Why T5 for realtime tail

  • Encoder-decoder architecture — supports generation-style outputs that BERT's encoder-only structure can't directly produce.
  • Smaller variants (T5-small, T5-base) are fast enough for realtime serving with small-GPU or even CPU inference.
  • Fine-tunable on teacher-generated golden datasets via text-to-text framing.
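The last bullet can be sketched concretely: fine-tuning via text-to-text framing means turning each teacher-labeled record into a (source, target) pair of plain strings. The field names and the "expand query:" prefix below are hypothetical illustrations, not Yelp's schema.

```python
def make_t5_example(record: dict) -> tuple[str, str]:
    """Turn a teacher-generated record into a (source, target) text pair
    for T5 fine-tuning.

    Field names ('query', 'teacher_phrases') and the task prefix are
    illustrative assumptions, not a disclosed schema.
    """
    source = "expand query: " + record["query"]
    target = ", ".join(record["teacher_phrases"])
    return source, target

# A teacher (e.g. a larger offline LLM) would have produced the phrases;
# the pair below is what the small T5 variant trains on.
example = make_t5_example(
    {"query": "vegan tacos", "teacher_phrases": ["vegan tacos", "plant-based tacos"]}
)
print(example)
```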

Stub

Minimal anchor. Deeper coverage (T5 vs. Flan-T5 variants, scaling-up behaviour, the C4 pre-training corpus) left for future ingests.

Seen in
