CONCEPT Cited by 1 source
Analytical question bridge
The analytical question bridge is the load-bearing design trick in Pinterest's Text-to-SQL system. For each historical query, the SQL-to-text transformation step produces an explicit list of "analytical questions this query could help answer", and the retrieval index is keyed on those questions. When a user asks a new natural-language question, the system matches question to question rather than question to table description.
(Source: sources/2026-03-06-pinterest-unified-context-intent-embeddings-for-scalable-text-to-sql.)
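The indexing and lookup flow described above can be sketched in a few lines of Python. This is a minimal illustration, not Pinterest's implementation: `QuestionIndex`, `embed`, and `cosine` are hypothetical names, the SQL strings and the engagement question are made-up placeholders, and a lowercase bag-of-words vector stands in for whatever embedding model the real system uses.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class IndexEntry:
    question: str    # analytical question generated for a historical query
    sql: str         # the historical query that question points back to
    vector: Counter  # toy "embedding" of the question


class QuestionIndex:
    """Retrieval index keyed on analytical questions, not table descriptions."""

    def __init__(self) -> None:
        self.entries: list[IndexEntry] = []

    def add_query(self, sql: str, questions: list[str]) -> None:
        # One index entry per generated question; all point to the same SQL.
        for q in questions:
            self.entries.append(IndexEntry(q, sql, embed(q)))

    def search(self, user_question: str, k: int = 3) -> list[tuple[float, IndexEntry]]:
        # Question-to-question matching: embed the user question the same way.
        qv = embed(user_question)
        scored = [(cosine(qv, e.vector), e) for e in self.entries]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


# Hypothetical historical queries (placeholders, not real Pinterest SQL).
ADS_SQL = "SELECT keyword, ... FROM ads_table WHERE advertiser_id = ..."
ENGAGEMENT_SQL = "SELECT country, ... FROM engagement_table ..."

index = QuestionIndex()
index.add_query(ADS_SQL, [
    "What are the top-performing keywords by impressions for a given advertiser?",
    "How cost-effective are ad campaigns based on CPC and CPM for different keywords?",
])
index.add_query(ENGAGEMENT_SQL, [
    "What is the engagement rate for organic content by country?",
])

top = index.search("What's the CPC for our top keywords?", k=1)
best_score, best_entry = top[0]
```

With this toy similarity, `best_entry.sql` is the ads query, so a downstream generation step would receive that historical SQL as context without the user ever naming a table.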
Why the indirection matters
The vocabulary mismatch between user questions and schema descriptions is the fundamental failure mode of naive Text-to-SQL RAG:
- Users phrase questions in business vocabulary: "engagement rate for organic content by country."
- Table descriptions are written in technical vocabulary: "Aggregated interaction metrics from user_actions joined to pins filtered by organic content type."
- Semantic retrieval between these two vocabularies is noisy.
By forcing the indexed side to be "questions this query answers", Pinterest makes both sides of the similarity query speak the same language: business-question to business-question. The LLM-generated analytical questions are written in the vocabulary analysts use.
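To make the mismatch concrete, the sketch below compares the user question against the table description and against an analytical question using plain token overlap (Jaccard similarity). Lexical overlap is only a crude proxy for the semantic retrieval the post describes, and the analytical question string here is an assumed example rather than one quoted from Pinterest.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase word tokens; underscores are kept so identifiers stay whole."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Share of distinct tokens that two strings have in common."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


user_question = "engagement rate for organic content by country"
table_description = (
    "Aggregated interaction metrics from user_actions joined to pins "
    "filtered by organic content type"
)
# Assumed example of an LLM-generated analytical question.
analytical_question = "What is the engagement rate for organic content in each country?"

score_vs_description = jaccard(user_question, table_description)
score_vs_question = jaccard(user_question, analytical_question)
```

In this example the user question shares far more vocabulary with the analytical question than with the table description, which is the effect the bridge is designed to exploit.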
The Pinterest example
From the post:
- Query: an ads performance SQL query computing CPC and CPM by keyword for a specific advertiser.
- LLM-generated analytical questions:
  - "What are the top-performing keywords by impressions for a given advertiser?"
  - "How cost-effective are ad campaigns based on CPC and CPM for different keywords?"
- Future user question: "What's the CPC for our top keywords?" This matches the first analytical question directly, regardless of what tables the original query used or what vocabulary the table descriptions used.
Generalization
Pinterest keeps the analytical questions deliberately generalizable: query-specific details (exact dates, individual IDs) are stripped while business-meaningful values (metric types, entity categories) are preserved. A query originally written for "October 2024 keyword performance" answers future questions about "ad CPC by keyword" regardless of date range.
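The effect of this generalization can be imitated with a small rule-based pass. In Pinterest's system the generalization is part of the LLM-driven SQL-to-text step; the regex rules, the `generalize` function, and the sample question below are illustrative assumptions only.

```python
import re

MONTHS = (
    r"(?:January|February|March|April|May|June|July|"
    r"August|September|October|November|December)"
)

# (pattern, replacement) pairs: strip specifics, keep the business meaning.
_RULES = [
    # ISO dates such as 2024-10-01
    (re.compile(r"\b\d{4}-\d{2}-\d{2}\b"), "a given date"),
    # Month-and-year phrases such as "October 2024"
    (re.compile(rf"\b{MONTHS}\s+\d{{4}}\b", re.IGNORECASE), "a given period"),
    # Individual entity IDs such as "advertiser 12345" or "campaign id 7"
    (
        re.compile(r"\b(advertiser|campaign|user)\s+(?:id\s+)?#?\d+\b", re.IGNORECASE),
        r"a given \1",
    ),
]


def generalize(question: str) -> str:
    """Replace exact dates and individual IDs with generic placeholders."""
    for pattern, replacement in _RULES:
        question = pattern.sub(replacement, question)
    return question


specific = "What was CPC by keyword for advertiser 12345 in October 2024?"
general = generalize(specific)
```

Here `general` becomes "What was CPC by keyword for a given advertiser in a given period?": the metric (CPC) and the entity category (keyword, advertiser) survive, while the date and the ID do not.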
Why this is the canonical bridge
Pinterest's thesis: "This is what enables intent-based retrieval to work across different phrasings, table names, and column structures." The analytical-question bridge is the engineered semantic equivalence between user vocabulary and query-history vocabulary.