

Pinterest Analytics Agent

The Pinterest Analytics Agent is an LLM-driven system that helps Pinterest's 2,500+ analysts discover tables, find reusable queries, and generate validated SQL from natural-language questions. Two months after launch it covered 40% of the analyst population (target 50% by year-end); it is the #1 agent at Pinterest, with 10× the usage of the next most-used agent.

(Source: sources/2026-03-06-pinterest-unified-context-intent-embeddings-for-scalable-text-to-sql.)

Four-layer architecture

  1. Agent Orchestration Layer. An LLM with Pinterest-specific prompts classifies tasks (documentation lookup, table discovery, query discovery, Text-to-SQL, execution) and decides which tools to call in which order.
  2. MCP Integration Layer. MCP tools expose a unified interface to: table search (backed by systems/pinterest-vector-db-service + systems/pinterest-pincat), query search (over the query-description index built from the unified context-intent embedding pipeline), knowledge search (internal docs), and systems/presto execution with EXPLAIN validation.
  3. Context Layer. The knowledge foundation: PinCat schemas + tiers, vector indexes of tables and queries, expert-curated docs + metric definitions, and usage patterns from query logs.
  4. Execution Layer. Presto for validated SQL with EXPLAIN-before-EXECUTE, tight LIMITs (default 100 rows), and bounded error-recovery retries.
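The layering can be sketched as plain dispatch code. This is a minimal, hypothetical sketch — the real orchestration layer is LLM-driven, and the tool names and classifier below are illustrative stand-ins, not Pinterest's actual interfaces:

```python
from dataclasses import dataclass
from typing import Callable, List

# MCP Integration Layer: each capability sits behind a uniform callable.
@dataclass
class McpTool:
    name: str
    run: Callable[[str], List[str]]

# Stub retrievers standing in for table / query / knowledge search.
def table_search(q: str) -> List[str]:     return [f"table-hit:{q}"]
def query_search(q: str) -> List[str]:     return [f"query-hit:{q}"]
def knowledge_search(q: str) -> List[str]: return [f"doc-hit:{q}"]

TOOLS = {
    "table_discovery": McpTool("table_search", table_search),
    "query_discovery": McpTool("query_search", query_search),
    "documentation":   McpTool("knowledge_search", knowledge_search),
}

def classify(question: str) -> str:
    # Stand-in for the LLM task classifier in the orchestration layer.
    q = question.lower()
    if "doc" in q:
        return "documentation"
    if "query" in q:
        return "query_discovery"
    return "table_discovery"

def orchestrate(question: str) -> List[str]:
    # Orchestration Layer: classify the task, then call the matching tool.
    return TOOLS[classify(question)].run(question)
```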

Core design principles

Asset-first

"The agent should surface existing, trusted assets — tables, curated queries, dashboards, metric definitions — before generating new SQL." Today the agent is asset-first for tables and queries; the roadmap extends to dashboards, datasets, metric definitions, curated query libraries, and workflow artifacts. (concepts/asset-first-agent-design)
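The asset-first principle reduces to a simple fallback order: only generate SQL when no trusted asset matches. A minimal sketch, with a hypothetical `asset_index` lookup standing in for the real retrieval stack:

```python
def answer(question, asset_index, generate_sql):
    """Asset-first dispatch: surface existing, trusted assets first;
    fall back to SQL generation only when nothing matches."""
    hits = asset_index.get(question, [])
    if hits:
        return {"kind": "asset", "results": hits}
    return {"kind": "generated_sql", "results": [generate_sql(question)]}
```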

Governance-aware ranking

Retrieval candidates are not ranked by semantic similarity alone. A Tier-1 table with active ownership and fresh data ranks higher than a semantically similar but deprecated or undocumented alternative. (concepts/governance-aware-ranking, patterns/governance-tier-ranking-fusion)
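One way to implement this is additive score fusion over semantic similarity and governance signals. The weights below are illustrative assumptions (the actual fusion formula is not published), but they reproduce the stated behavior: a Tier-1, owned, fresh table beats a slightly more similar deprecated one:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Candidate:
    table: str
    semantic: float    # embedding similarity, 0..1
    tier: int          # 1 = most trusted governance tier
    owned: bool        # has an active owner
    fresh: bool        # data recently landed
    deprecated: bool

def fused_score(c: Candidate) -> float:
    # Hypothetical weights: boost governance quality, punish deprecation.
    score = c.semantic
    score += {1: 0.30, 2: 0.15}.get(c.tier, 0.0)
    score += 0.10 if c.owned else 0.0
    score += 0.10 if c.fresh else 0.0
    score -= 0.50 if c.deprecated else 0.0
    return score

def rank(cands: List[Candidate]) -> List[Candidate]:
    return sorted(cands, key=fused_score, reverse=True)
```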

Validation-first SQL generation

Four disciplines gate every generated query:

  • Schema grounding — references only tables/columns in PinCat.
  • Column-profile-aware filters — filter values must match actual data, e.g. 'WEB' not 'web' (avoids the "looks right but returns nothing" failure mode).
  • Reuse validated patterns — join keys + filters are pulled from historical-query extraction, not reinvented.
  • EXPLAIN before EXECUTE, bounded retry on failure, default LIMIT 100.
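The last gate — EXPLAIN before EXECUTE, bounded retries, a default LIMIT — can be sketched as a small wrapper. `explain`, `execute`, and `repair` are hypothetical callables (the repair step would be LLM-driven in practice), and the LIMIT check is deliberately naive:

```python
def run_validated(sql, explain, execute, repair=lambda s, e: s,
                  max_retries=2, limit=100):
    """Validation-first execution: EXPLAIN gates every query before it
    runs; failures get a bounded number of repair-and-retry attempts."""
    # Cap result size unless the query already has a LIMIT (naive check).
    if "limit" not in sql.lower():
        sql = f"{sql} LIMIT {limit}"
    last_err = None
    for _ in range(max_retries + 1):
        try:
            explain(sql)           # cheap plan check: bad refs fail here
            return execute(sql)
        except Exception as e:
            last_err = e
            sql = repair(sql, e)   # hypothetical error-recovery step
    raise RuntimeError(f"validation failed after retries: {last_err}")
```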

Conflict-resolution hierarchy

When multiple context sources disagree:

  1. Expert-curated documentation (metric definitions, canonical guides).
  2. Schema metadata from PinCat (authoritative for structure).
  3. Query patterns (validated against schemas before use).
  4. General knowledge (fallback).
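The hierarchy above amounts to a fixed priority order over sources. A minimal sketch, with source names chosen for illustration:

```python
# Highest-priority source first; "general" is the fallback.
PRIORITY = ["expert_docs", "schema_metadata", "query_patterns", "general"]

def resolve(answers: dict):
    """Given candidate answers keyed by source, return the answer from
    the highest-priority source that produced one."""
    for source in PRIORITY:
        if answers.get(source) is not None:
            return source, answers[source]
    raise LookupError("no source produced an answer")
```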

End-to-end query flow

User asks "Show me weekly retention for new users in the US over the past three months." The agent:

  1. Classifies the task as Text-to-SQL.
  2. Retrieves context in parallel: table search + ranking using the knowledge base; relevant historical queries via unified context-intent embeddings; table metadata from PinCat (tiers, owners, freshness); metric-definition / retention docs.
  3. Generates SQL with strict validation: schema-grounded, filter-value-correct, pattern-reusing, EXPLAIN-validated, retry-bounded, LIMIT-capped.
  4. Returns results with transparency: the SQL it ran, tables + date ranges used, source references (schemas, queries, docs), confidence indicators + warnings (suspicious joins, empty results).
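The parallel retrieval in step 2 can be sketched with a thread pool. Everything here is illustrative — the retriever stubs and their payloads are invented, and the real system fans out to the MCP tools rather than local lambdas:

```python
from concurrent.futures import ThreadPoolExecutor

def retrieve_context(question: str) -> dict:
    """Fan out the four context retrievals concurrently and gather the
    results under one context dict (retriever stubs are hypothetical)."""
    retrievers = {
        "tables":   lambda: ["user_signups_t1"],
        "queries":  lambda: ["weekly_retention.sql"],
        "metadata": lambda: {"user_signups_t1": {"tier": 1, "fresh": True}},
        "docs":     lambda: ["retention: active in week N after signup"],
    }
    with ThreadPoolExecutor(max_workers=len(retrievers)) as pool:
        futures = {name: pool.submit(fn) for name, fn in retrievers.items()}
        return {name: f.result() for name, f in futures.items()}
```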

Adoption

  • 40% of Pinterest's analyst population within two months of launch. Target 50% by year-end.
  • #1 agent at Pinterest, 10× the usage of the next most-used agent.
  • MCP tools for table + query search rank among Pinterest's most popular internal tools.

Evaluation

An internal benchmarking framework measures: (1) finding the correct tables for an analytical question, (2) generating correct SQL. Early results: table discovery meets expectations; SQL generation has room for improvement. Hardest cases:

  • Complex analytical logic — multi-step calculations and window functions that require chaining multiple reasoning steps.
  • Ambiguous business terms not yet captured in documentation.
  • Cross-domain queries surfacing conflicting join patterns or metric definitions.
  • Schema evolution — recently deprecated tables whose patterns still appear in the index.

Mitigations: human review, EXPLAIN validation, continuous index updates, expanded test coverage with SME-verified answers, and regression tests built from real user interactions.
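The first benchmark axis — did the agent find the correct tables — can be scored with a simple recall metric. This is a generic sketch, not Pinterest's internal framework:

```python
from typing import Callable, List, Set, Tuple

def table_recall(expected: Set[str], predicted: Set[str]) -> float:
    """Fraction of the expected tables the agent actually surfaced."""
    if not expected:
        return 1.0
    return len(expected & predicted) / len(expected)

def evaluate(cases: List[Tuple[str, Set[str]]],
             agent: Callable[[str], Set[str]]) -> float:
    """Mean table recall over (question, expected_tables) benchmark cases."""
    scores = [table_recall(exp, agent(q)) for q, exp in cases]
    return sum(scores) / len(scores)
```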
