LLM agent as query optimizer¶
Definition¶
LLM-agent-as-query-optimizer is the architectural pattern of applying a frontier-model LLM agent, given one or more database execution tools and a bounded rollout budget, to the problem of finding better execution plans for a SQL query. The agent's output is a concrete plan (join order, index choice, etc.) for the query optimizer to use — not a rewritten query and not a natural-language answer.
The approach was prototyped by Databricks + UPenn (Source: sources/2026-04-22-databricks-are-llm-agents-good-at-join-order-optimization) for join ordering, delivering a 1.288× geomean speedup and a 41% reduction in P90 latency on the Join Order Benchmark.
Shape¶
The canonical instance has five distinguishing properties:
| Property | This post's instance |
|---|---|
| Location | Offline, out-of-hot-path (not inside the query optimizer) |
| Tools | Single: execute a candidate plan, return runtime + subplan sizes |
| Budget | Fixed rollout count (50 in the prototype, 15 in eval) |
| Output constraint | Structured output grammar admitting only valid join reorderings |
| Selection | Best-of-N: return the fastest rollout observed |
See patterns/llm-agent-offline-query-plan-tuner for the full pattern write-up.
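The loop these five properties describe can be sketched as follows. This is a hypothetical interface, not the prototype's actual code (which the post does not publish): `propose_plan` stands in for the LLM agent, `execute_plan` for the single database tool, and `is_valid_reordering` for the effect of the output grammar.

```python
def is_valid_reordering(candidate, tables):
    # Stand-in for the output-grammar constraint: a candidate join order
    # must be a permutation of exactly the query's tables.
    return sorted(candidate) == sorted(tables)

def tune_join_order(propose_plan, execute_plan, tables, budget=15):
    """Best-of-N tuning loop (hypothetical interface).

    propose_plan(history) -> candidate join order (list of table names)
    execute_plan(order)   -> (runtime_s, subplan_sizes) from the database
    """
    best_order, best_runtime = None, float("inf")
    history = []  # (order, runtime, sizes) observations fed back to the agent
    for _ in range(budget):
        order = propose_plan(history)
        if not is_valid_reordering(order, tables):
            continue  # with grammar-constrained decoding this branch never fires
        runtime, sizes = execute_plan(order)
        history.append((order, runtime, sizes))
        if runtime < best_runtime:
            best_order, best_runtime = order, runtime
    return best_order, best_runtime
```

Note how Best-of-N selection makes this an anytime algorithm: the loop can stop at any budget and still return the fastest plan observed so far.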
Why offline, not hot-path¶
"Query optimizers typically need to pick join orders in a few hundred milliseconds, so integrating an LLM into the hot path of the query optimizer, while potentially promising, is not possible today."
Frontier-model inference latency (seconds to tens of seconds for agentic loops) is 100–1000× the optimizer's latency budget. The agent instead replaces the manual tuning loop a human DBA performs when a query is flagged as slow — see concepts/offline-query-tuning-loop. This reframes the LLM from a component of the optimizer to an offline collaborator with the optimizer, which is a much weaker latency constraint.
Why it beats perfect cardinality estimates¶
From the source:
"When using a frontier model, the agent was able to improve query latency by a factor of 1.288 (geomean). This outperforms using perfect cardinality estimates (intractable in practice)."
This is counter-intuitive but structurally correct: even with perfect concepts/cardinality-estimation, the cost model and search procedure still approximate real execution. The agent measures actual runtimes via its one tool, which short-circuits both approximation layers. You're no longer guessing what a plan will cost — you're observing it.
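A toy illustration of the two approximation layers, with invented numbers and a deliberately crude cost model: even when every intermediate cardinality is exactly right, a cost model can rank two plans differently than their measured runtimes, and the agent sides with the measurements.

```python
# Two candidate plans, each with *perfect* cardinalities for its
# intermediates. The cost model below (sum of intermediate sizes) is a
# stand-in; real optimizers are far richer, but any model still
# approximates what actually happens at execution time.
plans = {
    "hash-join-first":   {"cardinalities": [1_000, 50_000], "measured_s": 4.1},
    "index-probe-first": {"cardinalities": [80_000, 100],   "measured_s": 1.8},
}

def modeled_cost(plan):
    # Ignores operator choice, memory pressure, caching, skew...
    return sum(plan["cardinalities"])

by_model   = min(plans, key=lambda p: modeled_cost(plans[p]))
by_measure = min(plans, key=lambda p: plans[p]["measured_s"])
# The model prefers "hash-join-first" (51,000 < 80,100), but the agent,
# which observes runtimes, picks "index-probe-first" (1.8 s < 4.1 s).
```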
Contrast with "LLM in hot path" approaches¶
Prior work (cited as [B] in the post) put LLMs directly in the optimizer's decision path via RAG-based plan lookup. The Databricks experiment differs: "this differs from prior work that asks the LLM to pick a join order instantly in the hot path of the query optimizer; instead, the LLM gets to act like an offline experimenter that tries many candidate plans and learns from the observed outcomes — just like a human tuning a join order by hand!"
Open questions the authors name¶
From the post's conclusion:
- Tool surface. Beyond "execute a plan", should the agent be able to issue cardinality-probe queries or data-assumption tests (e.g. "are there any pre-1995 DVD releases?") directly? More tools → more leverage but also more context.
- Trigger policy. User-flagged queries are easy. Can the system proactively detect queries with "optimization potential" and spend rollout budget on them automatically?
- Learning back into the optimizer. If the agent finds a plan the optimizer missed, that's proof of a systematic estimator error. Can the corrections be harvested into a training signal that improves the native optimizer? This is the obvious next move.
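The first open question — a richer tool surface — could take the shape of a cardinality-probe tool alongside plan execution. A minimal sketch using SQLite; the tool shape and SQL are assumptions (only the "pre-1995 DVD releases" example comes from the post):

```python
import sqlite3

def cardinality_probe(conn, sql):
    # Hypothetical extra tool: let the agent test a data assumption with a
    # cheap COUNT(*) over a subquery instead of a full plan execution.
    (count,) = conn.execute(f"SELECT COUNT(*) FROM ({sql})").fetchone()
    return count

# Toy schema to exercise the post's example assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE releases (title TEXT, medium TEXT, year INT)")
conn.executemany("INSERT INTO releases VALUES (?, ?, ?)",
                 [("X", "DVD", 1999), ("Y", "VHS", 1990)])

n = cardinality_probe(
    conn, "SELECT * FROM releases WHERE medium = 'DVD' AND year < 1995")
# n == 0: the agent can now discard plans that hedge against that case.
```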
Seen in¶
- sources/2026-04-22-databricks-are-llm-agents-good-at-join-order-optimization
— Canonical first wiki instance. Join ordering. 113-query JOB benchmark. Frontier model. Single execute-plan tool. Grammar-constrained structured output. 1.288× geomean, 41% P90 drop.
Related¶
- concepts/query-planner — what the agent collaborates with
- concepts/join-order-optimization — the specific problem
- concepts/cardinality-estimation — the step the agent sidesteps via direct measurement
- concepts/anytime-optimization-algorithm — the algorithmic shape this instantiates
- concepts/offline-query-tuning-loop — the human-workflow parallel
- concepts/exploration-exploitation-tradeoff-in-agent-search — how the rollout budget is spent
- patterns/llm-agent-offline-query-plan-tuner — full pattern write-up
- patterns/structured-output-grammar-for-valid-plans — validity-via-grammar companion technique
- systems/databricks-join-order-agent — the prototype