PATTERN Cited by 1 source
EXPLAIN-before-EXECUTE validation¶
EXPLAIN-before-EXECUTE validation is the safety pattern for
LLM-generated SQL: before running a generated query against the real
data, send it to the query engine's EXPLAIN endpoint. If EXPLAIN
fails (unknown table, unknown column, type error, invalid join), feed
the error back to the LLM with a bounded retry budget. Only
execute once EXPLAIN succeeds — and even then, enforce a
conservative LIMIT by default.
(Source: sources/2026-03-06-pinterest-unified-context-intent-embeddings-for-scalable-text-to-sql.)
The loop¶
generate SQL ────► EXPLAIN ────► succeeds? ──yes──► EXECUTE (with LIMIT)
▲ │
│ │ fails
│ ▼
│ error + SQL + retry budget
│ │
│ ▼
└─── iterate (within bounded retries) ───
What the validation catches¶
Pinterest names four classes of failure the Analytics Agent guards against:
- Non-existent tables / columns — hallucinated identifiers.
- Invalid joins — join keys that don't resolve or have incompatible types.
- Syntax errors — malformed generated SQL.
- Semantic-but-not-syntactic errors — notably the "looks right
but returns nothing" failure mode when filter values don't match
actual data (e.g.
'web'lowercase vs'WEB'in storage). Pinterest mitigates this specific class with column profiling data at generation time, so the LLM knows the actual value distribution before emitting filters.
Why EXPLAIN is the right gate¶
- Cheap.
EXPLAINruns the planner, not the query — negligible cost vs. executing a bad query that scans terabytes. - Catches structural errors deterministically. Schema validation is not probabilistic — a column either exists or it doesn't.
- Provides actionable feedback. Query-engine error messages are well-formatted and point to specific tokens; easy to feed back into the LLM.
Bounded retry¶
Pinterest explicitly calls out "iterates with fixes up to a bounded retry limit." Unbounded retry on an LLM that can't reason about the schema produces infinite loops or degradation into invention. The bound is a safety property, not a quality degradation.
The default LIMIT 100 companion¶
Even once SQL passes EXPLAIN, Pinterest enforces "a conservative
LIMIT (100 rows or fewer) by default." EXPLAIN validates
structure; LIMIT bounds blast radius when the query is structurally
valid but semantically wrong (e.g. missing a WHERE clause on a billion-
row fact table). The two together form a defense-in-depth ladder:
EXPLAINcatches most structural errors.LIMITbounds the cost of remaining semantic errors.- Users see a small-row result and can say "that's wrong" cheaply.
Returns with transparency¶
On success, Pinterest's agent surfaces: the SQL that ran, the tables + date ranges used, source references (schemas, past queries, docs), and confidence indicators / warnings (suspicious joins, empty results). Transparency is a complement to validation, not a replacement: users still need to see what actually ran.
Seen in¶
- sources/2026-03-06-pinterest-unified-context-intent-embeddings-for-scalable-text-to-sql — canonical wiki instance.