CONCEPT Cited by 1 source

Glossary-constrained translation¶

Definition¶

Glossary-constrained translation is the discipline of passing a domain glossary (brand terminology, proper nouns, domain terms-of-art) and a placeholder list (i18n template variables like {eta_minutes} that must round-trip verbatim) as explicit prompt arguments to an LLM-based translator, rather than hoping the model happens to respect them.

It is the operational discipline that makes LLM machine translation safe to use on strings whose rendering requires specific terms (brand names) or interpolation variables (runtime placeholders).

The mechanism¶

Lyft's Drafter prompt explicitly includes two load-bearing slots (Source: sources/2026-02-19-lyft-scaling-localization-with-ai):

GLOSSARY: {glossary}
PLACEHOLDERS (preserve exactly): {placeholders}

Text: {source_text}

{glossary}: a list of Lyft brand terms and their required treatment ("Lyft" stays "Lyft"; airport codes stay in English; official product names are not translated, etc.).
{placeholders}: a list of the template variables appearing in {source_text}, which must appear verbatim, with the same names, in every candidate translation.
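
A minimal sketch of how the two slots might be filled before the Drafter is called. Only the three slot lines come from the post; the function names, the glossary data shape (a term-to-rule mapping), and the separators are assumptions made for illustration, and the placeholder extraction assumes Python-style {name} fields.

```python
import string


def extract_placeholders(source_text: str) -> list[str]:
    """Return the names of {name}-style fields in order of first appearance."""
    names: list[str] = []
    for _literal, field, _spec, _conversion in string.Formatter().parse(source_text):
        if field and field not in names:
            names.append(field)
    return names


def build_drafter_prompt(source_text: str, glossary: dict[str, str]) -> str:
    """Fill the GLOSSARY and PLACEHOLDERS slots (hypothetical helper)."""
    # Assumed glossary shape: term -> required treatment.
    glossary_text = "; ".join(f"{term}: {rule}" for term, rule in glossary.items())
    placeholder_text = ", ".join("{" + name + "}" for name in extract_placeholders(source_text))
    # f-strings insert the values as-is, so braces in source_text are not reinterpreted.
    return (
        f"GLOSSARY: {glossary_text}\n"
        f"PLACEHOLDERS (preserve exactly): {placeholder_text}\n"
        "\n"
        f"Text: {source_text}"
    )


prompt = build_drafter_prompt(
    "Your {vehicle_type} is arriving in {eta_minutes} minutes",
    {"Lyft": "keep as Lyft"},
)
```

Deriving the placeholder list from the source string, rather than maintaining it by hand, keeps the PLACEHOLDERS slot from drifting out of sync with the text being translated.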

Example from the post:

Input:  "Your {vehicle_type} is arriving in {eta_minutes} minutes"
Lang:   French
Country: Canada

Output candidates all preserve {vehicle_type} and {eta_minutes}
verbatim:
  "Votre {vehicle_type} arrive dans {eta_minutes} minutes"
  "Votre {vehicle_type} sera là dans {eta_minutes} minutes"
  "Votre {vehicle_type} arrivera d'ici {eta_minutes} minutes"

Why this matters¶

The glossary and placeholder slots cover two independent correctness axes that a pure semantic-translation prompt does not enforce:

Brand-voice / terminology correctness. Lyft's Evaluator has a dedicated rubric dimension ("Brand Alignment") that explicitly checks "proper nouns, airport codes, and brand names preserved in English". The Drafter's glossary argument is the upstream control; the Evaluator's rubric is the downstream check.
Runtime-rendering correctness. If {eta_minutes} is renamed to {minutes_eta} or spelled in the target language, the consuming i18n templating engine will either throw an error (a KeyError in Python) or render the unresolved placeholder literally to the user; both are user-visible bugs. The "preserve exactly" instruction in the prompt plus a rubric check at evaluation time are the main guards when the translator is an LLM rather than a purpose-built MT engine.
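
The two runtime failure modes can be reproduced with Python's built-in string formatting. This is an illustrative sketch, not Lyft's rendering stack: the sample values, the KeepMissing helper, and both render functions are hypothetical, and a renamed placeholder stands in for any corrupted one.

```python
# Hypothetical runtime values; the templates mirror the example on this page.
values = {"vehicle_type": "SUV", "eta_minutes": 4}

good = "Votre {vehicle_type} arrive dans {eta_minutes} minutes"
bad = "Votre {vehicle_type} arrive dans {minutes_eta} minutes"  # placeholder renamed


class KeepMissing(dict):
    """Mapping that leaves unknown placeholders in the output instead of raising."""

    def __missing__(self, key):
        return "{" + key + "}"


def render_strict(template: str) -> str:
    # Raises KeyError when the template names a field that has no value.
    return template.format(**values)


def render_lenient(template: str) -> str:
    # Silently passes unknown fields through, so the user sees raw braces.
    return template.format_map(KeepMissing(values))


rendered_good = render_strict(good)

try:
    render_strict(bad)
    strict_error = None
except KeyError as exc:
    strict_error = exc.args[0]  # name of the field that could not be resolved

rendered_bad = render_lenient(bad)
```

A strict engine fails loudly at render time, while a lenient one ships the broken string; in both cases the defect is only caught before release if the candidate is checked at translation time.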

Tradeoffs / gotchas¶

Prompt-argument-only is not enforcement. LLMs may still translate the placeholder (for example, writing the French "véhicule" inside the curly braces in place of vehicle_type) or paraphrase the brand term. The Evaluator rubric is load-bearing as the enforcement layer; without it, glossary constraints are hopes, not guarantees.
Glossary maintenance is operational work. The glossary has to stay in sync with product naming, and it grows with each new Lyft product surface. Lyft's post does not disclose how the glossary is built / versioned / automated.
Placeholder syntax varies. {foo} (Python format strings, ICU MessageFormat), {{foo}} (Handlebars), %(foo)s (Python %), $foo (shell / some i18n), <x id="foo"/> (XLIFF inline elements) all exist. The prompt and rubric have to be format-agnostic or written per format.
Target-language-specific rules. Some languages need placeholders in different grammatical positions than the source; some need plural/gender agreement that the placeholder value carries but the surrounding text must match. These are still the translator's judgement calls — glossary-constrained prompting handles the name-preservation subset, not the syntactic-agreement subset.
Placeholder round-trip cannot be verified reliably without parsing. The Evaluator and downstream CI should run a templating-engine parse on every candidate; if the parse fails, the candidate is structurally incorrect regardless of the grade the LLM assigns.
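
A structural check along these lines could be sketched as follows, assuming Python-style {name} placeholders and using the standard library's string.Formatter as the parser. The function names are hypothetical, and other template formats would need their own parser in place of placeholder_counts.

```python
import string
from collections import Counter


def placeholder_counts(text: str) -> Counter:
    """Parse text as a Python format string and count each named field.

    Raises ValueError if the braces are malformed.
    """
    counts: Counter = Counter()
    for _literal, field, _spec, _conversion in string.Formatter().parse(text):
        if field is not None:
            counts[field] += 1
    return counts


def check_round_trip(source: str, candidate: str) -> list[str]:
    """Return a list of structural problems; an empty list means the check passed."""
    expected = placeholder_counts(source)
    try:
        actual = placeholder_counts(candidate)
    except ValueError as exc:
        return [f"candidate does not parse: {exc}"]
    problems = []
    # Counter subtraction keeps only positive counts, so each side lists
    # exactly the names that are missing or extra.
    for name in sorted(expected - actual):
        problems.append(f"missing placeholder: {{{name}}}")
    for name in sorted(actual - expected):
        problems.append(f"unexpected placeholder: {{{name}}}")
    return problems


source = "Your {vehicle_type} is arriving in {eta_minutes} minutes"
ok = check_round_trip(source, "Votre {vehicle_type} arrive dans {eta_minutes} minutes")
renamed = check_round_trip(source, "Votre {vehicle_type} arrive dans {minutes_eta} minutes")
broken = check_round_trip(source, "Votre {vehicle_type arrive")
```

The comparison is by name and count rather than by position, so a translation that moves a placeholder to a different place in the sentence still passes, while a renamed, dropped, or duplicated one does not.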

Relationship to adjacent concepts¶

concepts/machine-translation-with-llms — the broader category; glossary-constrained translation is the "operational discipline that makes MT-via-LLM safe for real product strings".
concepts/iterative-prompt-refinement — when the Evaluator catches a glossary or placeholder violation, refinement feedback to the Drafter explicitly includes the broken dimension.
concepts/structured-output-reliability — placeholder preservation is effectively a structural constraint on LLM output; same class of reliability argument.
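
As a rough illustration of refinement feedback that names the broken dimension, the sketch below flags glossary terms that were not preserved and formats them for the next Drafter attempt. Only the "Brand Alignment" dimension name comes from the post; the Violation type, the function names, the feedback wording, and the naive substring match are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Violation:
    dimension: str  # rubric dimension that failed
    detail: str     # what was wrong


def glossary_violations(source: str, candidate: str, protected_terms: list[str]) -> list[Violation]:
    """Flag protected terms that occur in the source but not verbatim in the candidate.

    Naive substring matching; a real check would likely need word boundaries.
    """
    return [
        Violation("Brand Alignment", f'"{term}" must be kept as written')
        for term in protected_terms
        if term in source and term not in candidate
    ]


def build_feedback(violations: list[Violation]) -> str:
    """Format violations as feedback text; returns an empty string when there are none."""
    if not violations:
        return ""
    lines = [f"- [{v.dimension}] {v.detail}" for v in violations]
    return "The previous candidate failed these checks:\n" + "\n".join(lines)


found = glossary_violations(
    "Get a Lyft to SFO",
    "Prenez un véhicule pour l'aéroport",
    ["Lyft", "SFO", "JFK"],
)
feedback = build_feedback(found)
```

Terms that do not occur in the source (JFK here) are ignored, so the feedback lists only constraints the candidate actually broke.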

Seen in¶

sources/2026-02-19-lyft-scaling-localization-with-ai — canonical wiki instance. Lyft's Drafter prompt includes GLOSSARY and PLACEHOLDERS as named arguments; the Evaluator rubric has a dedicated "Brand Alignment" dimension that checks glossary preservation. No numbers on how often the Drafter violates without enforcement, and no ablation on the placeholder-preservation win — the architecture is framed as obviously necessary.