LLM-plus-planner validation¶
Problem¶
LLM-generated database changes (index suggestions, query rewrites, schema mutations) are plausibly correct but sometimes wrong. Shipping unvalidated LLM output directly to production is unsafe. But asking humans to validate every suggestion doesn't scale — and the humans often make the same judgment the database engine itself could make.
The specific failure mode is confident-but-wrong. From PlanetScale's framing (sources/2026-04-21-planetscale-ai-powered-postgres-index-suggestions):
> As anyone who's used LLM tools for software development can tell you, it's crucially important to validate LLM generated solutions before shipping them to production. Before we suggest that a customer make changes to their production database, we need to make sure that our suggestions will actually have the desired effect.
Solution¶
Pair the LLM with the database engine's own planner as a validation oracle. The LLM proposes changes; the planner decides whether each proposed change is worth surfacing.
The canonical shape for index suggestions:
- LLM proposes — `CREATE INDEX` statements, each paired with a citation of which queries the index is designed to help.
- Syntactic validation — parse the `CREATE INDEX` statement to confirm it's well-formed.
- Planner validation — register each candidate as a hypothetical index (via HypoPG on Postgres), run `EXPLAIN` on each cited query with and without the hypothetical index, and compare the planner's cost estimates.
- Keep only suggestions the planner agrees with. Drop anything that doesn't "result in a substantial improvement" on at least one cited query.
The general shape extends beyond index suggestions — for any LLM-generated database change, pair the LLM with the engine's native validation primitive:
| LLM output | Validation oracle |
|---|---|
| `CREATE INDEX` | HypoPG + `EXPLAIN` cost delta |
| Query rewrite | `EXPLAIN` on both + result-set equivalence check |
| Migration script | Dry-run on a schema-branched clone + schema diff |
| Partition scheme | Planner cost on representative queries |
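As one concrete instance of the table's "query rewrite" row, the result-set equivalence half of the oracle can be sketched as follows. This is an illustration under assumptions, not a production checker: it uses SQLite for self-containment where a real deployment would run against Postgres, and it ignores the cost-comparison half (`EXPLAIN` on both queries).

```python
import sqlite3

def rewrite_is_equivalent(conn: sqlite3.Connection,
                          original: str, rewrite: str) -> bool:
    """Run both queries and compare their result sets order-insensitively.
    A rewrite that changes the answer is rejected no matter how much
    cheaper the planner says it is."""
    rows_original = sorted(conn.execute(original).fetchall())
    rows_rewrite = sorted(conn.execute(rewrite).fetchall())
    return rows_original == rows_rewrite
```

A real checker would also bound the comparison (sampled rows, row counts, or checksums) rather than materialize both full result sets.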
Why this works¶
- The LLM brings creativity; the planner brings ground truth. LLMs are good at finding candidate indexes from schema + workload context — comparable to human DBAs — but aren't trustworthy by themselves. The planner's cost model, even as an approximation, is anchored in real table statistics and actual query-execution logic.
- Validation is cheap. Hypothetical evaluation costs microseconds; the real `CREATE INDEX` would cost hours. Running validation on every candidate is tractable.
- The validator never ships a false positive. Even if the LLM hallucinates an index, the planner will find no cost reduction and drop the suggestion.
- False negatives are acceptable. If the planner's estimate is conservative and a genuinely useful index gets discarded, the customer's database isn't harmed — the worst case is a missed optimization opportunity, which matches the human-DBA baseline.
Forces¶
- The validation oracle must be cheaper than the change. The pattern breaks if checking each candidate costs as much as just building the index. Hypothetical-index evaluation satisfies this; hypothetical table rewrites don't exist, so the pattern doesn't generalise trivially to all migrations.
- The LLM must cite what each change is for. The validator needs to know which queries to re-cost. A `CREATE INDEX` without "here are the queries it helps" is untestable against the planner.
- Planner estimates ≠ measurements. The pattern rejects clearly-bad candidates but doesn't guarantee the accepted ones will produce the exact speed-up shown. That's acceptable for a suggestion (the customer can accept or decline), but not for automatic application without a post-apply A/B check.
Related patterns¶
- patterns/workload-aware-llm-prompting — the input counterpart. Workload-level filtering before the LLM sees anything; this pattern is the output-side validation after the LLM has generated candidates.
- patterns/explain-before-execute-validation — sibling Pinterest pattern for validating LLM-generated SQL queries against the planner before executing them. Same structural insight applied at query-execution time rather than at index-design time.
Instances¶
- PlanetScale Insights AI-powered index suggestions (2026-04) — canonical wiki instance. The LLM generates `CREATE INDEX` statements from a workload-pre-filtered prompt; HypoPG + `EXPLAIN` validates each one against the planner's cost estimate; any candidate that doesn't "improve at least one candidate query" is discarded. See sources/2026-04-21-planetscale-ai-powered-postgres-index-suggestions.
Seen in¶
- sources/2026-04-21-planetscale-ai-powered-postgres-index-suggestions — canonical instance; establishes the vocabulary.