CONCEPT Cited by 1 source

LLM agent as query optimizer

Definition

LLM-agent-as-query-optimizer is the architectural pattern of applying a frontier-model LLM agent, given one or more database execution tools and a bounded rollout budget, to the problem of finding better execution plans for a SQL query. The agent's output is a concrete plan (join order, index choice, etc.) for the query optimizer to use — not a rewritten query and not a natural-language answer.

The approach was prototyped by Databricks + UPenn (Source: sources/2026-04-22-databricks-are-llm-agents-good-at-join-order-optimization) for join ordering and delivered a 1.288× geomean speedup / 41% P90 drop on the Join Order Benchmark.

Shape

The canonical instance has five distinguishing properties:

| Property | This post's instance |
| --- | --- |
| Location | Offline, out-of-hot-path (not inside the query optimizer) |
| Tools | Single: execute a candidate plan, return runtime + subplan sizes |
| Budget | Fixed rollout count (50 in the prototype, 15 in eval) |
| Output constraint | Structured output grammar admitting only valid join reorderings |
| Selection | Best-of-N: return the fastest rollout observed |
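The "output constraint" row can be made concrete with a minimal validity check. This is a sketch, not the prototype's actual grammar: the function name and relation names are illustrative, and the real constraint is enforced at decoding time via a structured output grammar rather than post-hoc.

```python
def is_valid_reordering(proposed: list[str], query_relations: set[str]) -> bool:
    """A proposed join order is admissible iff it uses each of the query's
    relations exactly once, i.e. it is a permutation of the relation set."""
    return len(proposed) == len(query_relations) and set(proposed) == query_relations

# Illustrative usage with JOB-style relation aliases:
assert is_valid_reordering(["t", "mi", "ci"], {"ci", "mi", "t"})
assert not is_valid_reordering(["t", "t", "ci"], {"ci", "mi", "t"})  # duplicate relation
```

Restricting the agent to permutations of the existing join set is what keeps every rollout executable: the agent can only reorder, never invent operators or drop relations.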

See patterns/llm-agent-offline-query-plan-tuner for the full pattern write-up.

Why offline, not hot-path

"Query optimizers typically need to pick join orders in a few hundred milliseconds, so integrating an LLM into the hot path of the query optimizer, while potentially promising, is not possible today."

Frontier-model inference latency (seconds to tens of seconds for agentic loops) is 100–1000× the optimizer's latency budget. The agent instead replaces the manual tuning loop a human DBA performs when a query is flagged as slow — see concepts/offline-query-tuning-loop. This reframes the LLM from a component of the optimizer to an offline collaborator with the optimizer, which is a much weaker latency constraint.

Why it beats perfect cardinality estimates

From the source:

"When using a frontier model, the agent was able to improve query latency by a factor of 1.288 (geomean). This outperforms using perfect cardinality estimates (intractable in practice)."

This is counter-intuitive but structurally correct: even with perfect concepts/cardinality-estimation, the cost model and search procedure still approximate real execution. The agent measures actual runtimes via its one tool, which short-circuits both approximation layers. You're no longer guessing what a plan will cost — you're observing it.
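The measure-don't-estimate loop can be sketched as a budgeted best-of-N search. This is an assumed shape, not the prototype's code: `propose_plan` stands in for the LLM agent conditioning on prior rollouts, and `execute_plan` stands in for the single execution tool.

```python
from typing import Callable

def best_of_n_tuning(
    propose_plan: Callable[[list[tuple[list[str], float]]], list[str]],  # the LLM agent
    execute_plan: Callable[[list[str]], float],                          # the one tool
    budget: int = 15,
) -> tuple[list[str], float]:
    """Run `budget` rollouts, feeding each (plan, measured runtime) pair back
    to the agent, then return the fastest plan actually observed."""
    history: list[tuple[list[str], float]] = []
    for _ in range(budget):
        plan = propose_plan(history)   # agent conditions on observed outcomes
        runtime = execute_plan(plan)   # measured on the real engine, not estimated
        history.append((plan, runtime))
    return min(history, key=lambda pr: pr[1])  # best-of-N selection
```

Because selection is over measured runtimes, cost-model error cannot cause a regression: the returned plan is, by construction, the fastest one actually executed within the budget.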

Contrast with "LLM in hot path" approaches

Prior work (cited as [B] in the post) put LLMs directly in the optimizer's decision path via RAG-based plan lookup. The Databricks experiment differs: "this differs from prior work that asks the LLM to pick a join order instantly in the hot path of the query optimizer; instead, the LLM gets to act like an offline experimenter that tries many candidate plans and learns from the observed outcomes — just like a human tuning a join order by hand!"

Open questions the authors name

From the post's conclusion:

  1. Tool surface. Beyond "execute a plan", should the agent be able to issue cardinality-probe queries or data-assumption tests (e.g. "are there any pre-1995 DVD releases?") directly? More tools → more leverage but also more context.
  2. Trigger policy. User-flagged queries are easy. Can the system proactively detect queries with "optimization potential" and spend rollout budget on them automatically?
  3. Learning back into the optimizer. If the agent finds a plan the optimizer missed, that's proof of a systematic estimator error. Can the corrections be harvested into a training signal that improves the native optimizer? This is the obvious next move.
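A speculative sketch of the first open question: alongside the execute-a-plan tool, expose a cheap existence probe the agent can call to test a data assumption before spending a full rollout on it. The `releases` schema and function name here are hypothetical stand-ins, not from the post.

```python
import sqlite3

def probe_exists(conn: sqlite3.Connection, predicate_sql: str) -> bool:
    """Return whether any row satisfies the predicate. An EXISTS probe is far
    cheaper than executing a full candidate plan, so it fits a tight budget.
    (Illustrative only: real tool plumbing would parameterize the predicate
    rather than interpolate SQL text.)"""
    row = conn.execute(
        f"SELECT EXISTS (SELECT 1 FROM releases WHERE {predicate_sql})"
    ).fetchone()
    return bool(row[0])

# Illustrative usage, mirroring the post's example question:
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE releases (year INTEGER, format TEXT)")
conn.executemany("INSERT INTO releases VALUES (?, ?)",
                 [(1998, "DVD"), (2001, "DVD")])
probe_exists(conn, "format = 'DVD' AND year < 1995")  # "any pre-1995 DVD releases?"
```

The trade-off the authors flag applies directly: every probe result lands in the agent's context, so a richer tool surface buys leverage at the cost of context budget.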
