CONCEPT Cited by 1 source
Unified context-intent embedding¶
Unified context-intent embeddings are Pinterest's named contribution to production Text-to-SQL: a single embedding space that indexes natural-language descriptions of the business question each historical SQL query was designed to answer, rather than table names or table descriptions.
(Source: sources/2026-03-06-pinterest-unified-context-intent-embeddings-for-scalable-text-to-sql.)
What "unified context-intent" means¶
Two pieces unify:
- Context: Pinterest-specific domain signals injected into the representation before embedding: table + column descriptions, glossary terms (`g_advertiser_id` / `adv_id` both mapping to `advertiser_id`), metric definitions, data-quality caveats.
- Intent: the business question the query was designed to answer, extracted by an LLM via SQL-to-text transformation with three outputs per query (summary / analytical questions / detailed breakdown).
The combined vector sits in a shared space with user-question vectors at query time.
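As a rough illustration of how the two pieces might be combined before embedding, the sketch below serializes one query's intent and context into a single string. The class names, field names, and line prefixes are hypothetical, chosen only for this example; the source does not describe Pinterest's actual record layout or serialization.

```python
from dataclasses import dataclass, field


@dataclass
class QueryIntent:
    """SQL-to-text outputs for one historical query (hypothetical shape)."""
    summary: str
    analytical_questions: list[str]
    detailed_breakdown: str


@dataclass
class DomainContext:
    """Domain signals attached before embedding (hypothetical shape)."""
    table_descriptions: dict[str, str] = field(default_factory=dict)
    glossary: dict[str, str] = field(default_factory=dict)  # alias -> canonical name
    metric_definitions: dict[str, str] = field(default_factory=dict)
    caveats: list[str] = field(default_factory=list)


def build_embedding_text(intent: QueryIntent, context: DomainContext) -> str:
    """Serialize intent and context into one string that would be embedded once."""
    lines = [f"Summary: {intent.summary}"]
    lines += [f"Question: {q}" for q in intent.analytical_questions]
    lines.append(f"Breakdown: {intent.detailed_breakdown}")
    lines += [f"Table {name}: {desc}" for name, desc in context.table_descriptions.items()]
    lines += [f"Glossary: {alias} means {canon}" for alias, canon in context.glossary.items()]
    lines += [f"Metric {name}: {d}" for name, d in context.metric_definitions.items()]
    lines += [f"Caveat: {c}" for c in context.caveats]
    return "\n".join(lines)


# Example record; the values are illustrative, not taken from a real query log.
intent = QueryIntent(
    summary="Engagement rate for organic Pins, grouped by country.",
    analytical_questions=["What is the engagement rate for organic Pins by country?"],
    detailed_breakdown="Divides counted engagement actions by impressions per country.",
)
context = DomainContext(
    glossary={"g_advertiser_id": "advertiser_id", "adv_id": "advertiser_id"},
    metric_definitions={"engagement rate": "specific action types / impressions"},
)
text = build_embedding_text(intent, context)
```

The resulting `text` would then be passed to whatever embedding model is in use, and the same model would embed user questions so that both land in the shared space.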
Why it works when table-description RAG doesn't¶
Traditional RAG over table descriptions fails because:
- Question wording doesn't match table description wording.
- Multiple tables match semantically but only one has the right join pattern for the business question.
- Company-specific metric conventions (e.g. "engagement rate" = specific action types / impressions at Pinterest) aren't in any description.
Unified context-intent embeddings sidestep all three: the index is keyed by what past queries answered, not by what tables look like. A user asking "What's the engagement rate for organic Pins by country?" matches a historical query's description regardless of the tables that query used.
The key design trick: the SQL-to-text step produces explicit "analytical questions this query could help answer", so user questions match question-to-question rather than question-to-table-description.
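A minimal sketch of question-to-question matching follows. It assumes a toy bag-of-words vector in place of a real embedding model, a hypothetical two-entry index, and per-question scoring as a simplification; none of these details come from the source.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical index: query id -> analytical questions from the SQL-to-text step.
INDEX = {
    "q_engagement": [
        "What is the engagement rate for organic Pins by country?",
        "Which countries have the highest organic engagement?",
    ],
    "q_keyword_cpc": [
        "What is the ad CPC by keyword?",
        "Which keywords have the highest cost per click?",
    ],
}


def retrieve(user_question: str) -> tuple[str, float]:
    """Return the query whose best-matching stored question is closest to the user's."""
    user_vec = toy_embed(user_question)
    scored = (
        (query_id, max(cosine(user_vec, toy_embed(q)) for q in questions))
        for query_id, questions in INDEX.items()
    )
    return max(scored, key=lambda pair: pair[1])
```

Calling `retrieve("engagement rate for organic Pins by country")` selects `q_engagement` because the user's wording overlaps a stored question. No table name or table description takes part in the comparison.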
Generalization: strip specifics, keep semantics¶
Descriptions are kept deliberately generalizable: the LLM strips temporal specifics (exact dates, individual IDs) while preserving business-meaningful values (metric types, entity categories). A query originally for "October 2024 keyword performance" generalizes to match future questions about "ad CPC by keyword" regardless of date range.
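The source attributes this generalization to the LLM. Purely to illustrate the intended effect, the sketch below uses a rule-based stand-in that replaces date-like and ID-like tokens with placeholders and leaves metric and entity words alone. The patterns and placeholder names are assumptions, not Pinterest's method.

```python
import re

MONTHS = (
    "January|February|March|April|May|June|July|August|"
    "September|October|November|December"
)

# Hypothetical rules: (pattern, placeholder). A real system would rely on the LLM.
RULES = [
    (re.compile(r"\b\d{4}-\d{2}-\d{2}\b"), "<date>"),          # ISO dates
    (re.compile(rf"\b(?:{MONTHS})\s+\d{{4}}\b"), "<period>"),  # "October 2024"
    (re.compile(r"\b\d{6,}\b"), "<id>"),                       # long numeric IDs
]


def generalize(description: str) -> str:
    """Replace temporal and ID specifics with placeholders; keep everything else."""
    for pattern, placeholder in RULES:
        description = pattern.sub(placeholder, description)
    return description
```

For example, `generalize("October 2024 keyword performance for advertiser 123456789")` keeps "keyword performance" and "advertiser" but drops the specific month and ID, so the description is no longer tied to one date range.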
Seen in¶
- sources/2026-03-06-pinterest-unified-context-intent-embeddings-for-scalable-text-to-sql — canonical wiki definition; Pinterest Analytics Agent.
Related¶
- patterns/sql-to-intent-encoding-pipeline — how the embeddings are produced.
- patterns/analytical-intent-retrieval — how they are used at query time.
- concepts/analytical-question-bridge — the question-to-question matching design.
- concepts/query-history-knowledge-base — the substrate that is being embedded.