
CONCEPT Cited by 1 source

Query history as knowledge base

Query history as knowledge base is the framing shift of treating accumulated SQL query logs — normally throwaway audit data — as a durable, searchable library of expert-authored analytical solutions.

(Source: sources/2026-03-06-pinterest-unified-context-intent-embeddings-for-scalable-text-to-sql.)

Pinterest's thesis

"Your analysts already wrote the perfect prompt. Every SQL query an analyst has ever written — the tables they chose, the joins they constructed, the filters they applied, the metrics they computed — encodes hard-won domain expertise."

Traditional Text-to-SQL asks an LLM to figure out these patterns from scratch for every question. Pinterest instead treats query history as a library; the agent's job becomes making that library searchable by meaning.

Self-reinforcing cycle

Pinterest makes the compounding growth explicit:

  • New analytical patterns emerge as teams develop novel approaches to measurement.
  • Metric calculation standards evolve and propagate across teams.
  • Join conventions spread as validated patterns are reused.
  • Domain-specific filters and aggregations become discoverable to analysts outside the original domain.

Every new query an analyst writes becomes a new entry in the knowledge base. "The analyst who figures out how to compute retention by acquisition channel doesn't just answer their own question — they write a reusable recipe that any future analyst can discover by simply asking in plain English."

With 2,500+ analysts continuously teaching the system, the combined expertise of the entire analyst population becomes accessible to each individual analyst — rather than being siloed within teams.

How Pinterest makes the library searchable

Three ingredients (see SQL-to-intent encoding pipeline):

  1. Domain context injection before any LLM step — table docs, glossary terms, metric definitions.
  2. SQL-to-text to produce natural-language descriptions keyed on the business questions each query answers.
  3. Embedding those descriptions into a unified context-intent embedding space for semantic retrieval.

The result: past queries are findable by what they were for, not by table name or column name.
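
A minimal sketch of the retrieval idea, not Pinterest's implementation. It assumes the SQL-to-text step has already produced a description for each historical query, and it swaps in a toy bag-of-words vector where a real system would call an embedding model. The names `LibraryEntry`, `embed`, `index_query`, and `search`, plus the table names `t_a`, `t_b`, `t_c`, are hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class LibraryEntry:
    sql: str          # historical query, stored verbatim
    description: str  # natural-language intent (assumed output of SQL-to-text)
    vector: Counter   # vector built from the description, not from the SQL


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def index_query(sql: str, description: str) -> LibraryEntry:
    """Add one past query to the library, indexed by its description."""
    return LibraryEntry(sql, description, embed(description))


def search(library: list[LibraryEntry], question: str, k: int = 1) -> list[LibraryEntry]:
    """Return the k entries whose descriptions are closest to the question."""
    q = embed(question)
    return sorted(library, key=lambda e: cosine(q, e.vector), reverse=True)[:k]


library = [
    index_query(
        "SELECT channel, COUNT(DISTINCT user_id) FROM t_a JOIN t_b USING (user_id) GROUP BY channel",
        "user retention by acquisition channel",
    ),
    index_query(
        "SELECT dt, SUM(amt) FROM t_c GROUP BY dt",
        "daily ad revenue total",
    ),
]

best = search(library, "what is retention for each acquisition channel")[0]
print(best.sql)
```

The question shares no words with the table names in the matched SQL; the match comes entirely from the description, which is the point of indexing by intent.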

Statistical signals extracted in parallel

Alongside intent embeddings, Pinterest extracts statistical signals from execution metadata that feed governance-aware ranking:

  • Table co-occurrence frequency — signals analytical relationships.
  • Query success rates — successful patterns weighted higher.
  • Usage recency + volume — recent, frequent patterns reflect current best practices.
  • Author expertise — queries from experienced analysts in specific domains carry higher weight.
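
The source does not say how these signals are combined, so the following is only an illustrative sketch. It assumes each candidate query carries a semantic similarity score plus the metadata above, and it blends them with made-up weights and a made-up 90-day recency half-life. `QuerySignals`, `rank_score`, and `table_cooccurrence` are hypothetical names.

```python
import math
from collections import Counter
from dataclasses import dataclass
from itertools import combinations


@dataclass
class QuerySignals:
    similarity: float           # semantic similarity to the question, in [0, 1]
    success_rate: float         # fraction of executions that succeeded, in [0, 1]
    days_since_last_run: float  # usage recency
    run_count: int              # usage volume
    author_expertise: float     # assumed per-domain score in [0, 1]


def rank_score(s: QuerySignals, half_life_days: float = 90.0) -> float:
    """Scale semantic similarity by a prior built from execution metadata.

    The weights and half-life are illustrative assumptions, not sourced values.
    """
    recency = 0.5 ** (s.days_since_last_run / half_life_days)
    # Log-scaled volume, capped at 1.0 once a query has about 1000 runs.
    volume = min(math.log1p(s.run_count) / math.log1p(1000), 1.0)
    prior = (
        0.4 * s.success_rate
        + 0.2 * recency
        + 0.2 * volume
        + 0.2 * s.author_expertise
    )
    return s.similarity * prior


def table_cooccurrence(queries: list[set[str]]) -> Counter:
    """Count how often each unordered pair of tables appears in the same query."""
    counts: Counter = Counter()
    for tables in queries:
        for pair in combinations(sorted(tables), 2):
            counts[pair] += 1
    return counts


fresh = QuerySignals(0.8, 0.95, 7, 500, 0.9)
stale = QuerySignals(0.8, 0.60, 720, 3, 0.2)
pairs = table_cooccurrence([{"a", "b"}, {"a", "b", "c"}])
```

With equal semantic similarity, `fresh` outranks `stale` because it succeeds more often, ran recently, runs frequently, and comes from a higher-expertise author. Multiplying by similarity keeps an irrelevant but popular query from ranking first.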

Prior art on the wiki

The general insight that production logs and artefacts are institutional knowledge, not noise, appears repeatedly on the wiki.

Pinterest's contribution is the specific shape for SQL + analytics: query history as the substrate, intent embeddings as the index.

Seen in

Last updated · 319 distilled / 1,201 read