PATTERN

Workload-aware LLM prompting

Problem

LLMs are asked to solve database-tuning problems by looking at schemas and query text alone. Two consequences follow:

  1. They can't tell whether anything needs changing at all. Asked "what indexes should I add?", an LLM will almost always suggest something — even if the workload is running fine. From sources/2026-04-21-planetscale-ai-powered-postgres-index-suggestions: "LLMs are great at finding an answer but can't always be trusted to decide if maybe nothing needs to change at all."
  2. They can't prioritise. Without knowing which queries are hot, which are cold, and which are reading disproportionately more rows than they return, the LLM can't tell which query patterns are actually worth indexing.

Handing the LLM the entire schema plus every query pattern produces long prompts, over-eager recommendations, and a validation workload dominated by junk candidates.

Solution

Don't ask the LLM the "should anything change?" question. Answer that question yourself from workload telemetry, then hand the LLM only the subset of queries that warrant attention — along with the minimal schema those queries actually touch.

The canonical shape from PlanetScale Insights:

  1. Collect per-query-pattern workload telemetry — rows read, rows returned, total runtime, execution count. (Insights provides this for every query pattern on a PlanetScale Postgres cluster.)
  2. Pre-filter query patterns using index-candidate filtering — require a high rows-read:rows-returned ratio, ≥ 0.1% of aggregated runtime, and a minimum execution count. Anything failing these gates is excluded from the prompt entirely.
  3. Narrow the schema to the tables referenced by the surviving queries: "we filter the schema down to the tables referenced by the query. This keeps the prompt smaller and more focused."
  4. Ask the LLM to propose indexes for just these queries, and require it to cite which queries each proposed index is designed to help. The citation is load-bearing for the downstream validation step (see patterns/llm-plus-planner-validation).
  5. Pass each candidate through planner validation. The validator uses the citation to decide which queries to re-cost.
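The filtering and prompt-narrowing steps above can be sketched as follows. The telemetry fields mirror step 1 and the gates mirror step 2; the concrete threshold values (the ratio and minimum execution count), the `QueryPattern` shape, and the prompt wording are illustrative assumptions, not PlanetScale's implementation — only the ≥ 0.1%-of-runtime gate comes from the source.

```python
from dataclasses import dataclass

@dataclass
class QueryPattern:
    sql: str
    tables: set[str]          # tables the pattern touches
    rows_read: int
    rows_returned: int
    total_runtime_ms: float
    executions: int

MIN_READ_RETURN_RATIO = 100   # assumed: reads 100x more rows than it returns
MIN_RUNTIME_SHARE = 0.001     # >= 0.1% of aggregated runtime (from the source)
MIN_EXECUTIONS = 50           # assumed minimum execution count

def index_candidates(patterns: list[QueryPattern]) -> list[QueryPattern]:
    """Step 2: gate query patterns on ratio, runtime share, and frequency."""
    total_runtime = sum(p.total_runtime_ms for p in patterns)
    def qualifies(p: QueryPattern) -> bool:
        ratio = p.rows_read / max(p.rows_returned, 1)
        return (ratio >= MIN_READ_RETURN_RATIO
                and p.total_runtime_ms / total_runtime >= MIN_RUNTIME_SHARE
                and p.executions >= MIN_EXECUTIONS)
    return [p for p in patterns if qualifies(p)]

def build_prompt(candidates: list[QueryPattern], schema: dict[str, str]) -> str:
    """Steps 3-4: narrow schema to touched tables, require query citations."""
    touched = set().union(*(p.tables for p in candidates)) if candidates else set()
    ddl = "\n".join(schema[t] for t in sorted(touched))
    queries = "\n".join(f"-- Q{i}: {p.sql}" for i, p in enumerate(candidates))
    return (f"Schema:\n{ddl}\n\nQueries:\n{queries}\n\n"
            "Propose indexes. For each index, cite the query IDs it helps.")
```

Anything failing a gate never reaches the prompt, so the LLM only ever sees queries the telemetry says are worth indexing, and only the DDL those queries touch.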

Why this works

  • The hard question isn't "what index?"; it's "which queries matter?" Workload telemetry answers that cleanly. Given a focused set of problem queries, LLMs design indexes well.
  • Prompt cost scales with candidate count. Filtering drops the prompt from every-pattern-on-the-cluster to a handful of genuinely-problematic patterns.
  • Validation load drops with signal ratio. If the input is high-quality (queries that probably do need an index), most candidates the LLM produces survive validation. If the input is low-quality, validation throws most of them out and the LLM+validator cycle is wasted.
  • Citations enable targeted validation. The LLM is asked to name which queries each index helps; the validator re-costs exactly those queries. Without the citations, the validator would have to re-cost the entire workload per candidate.

Forces

  • You need workload telemetry. This pattern only works when you can compute rows-read, rows-returned, runtime share, and frequency per query pattern. Systems without that telemetry can't build the pre-filter.
  • Telemetry must be representative. If the observation window is too short, you'll miss workload shifts; too long, and dead query patterns contaminate the filter.
  • Schema filtering must be tight. If the prompt still contains unrelated tables the LLM will propose indexes on those too, and validation becomes noisier.
  • The workload filter must encode economic truth, not just performance truth. High-ratio queries that run once a week aren't worth indexing — the write tax dwarfs the read benefit. The 0.1%-of-runtime + minimum-frequency gates encode that discipline.

Related patterns

  • patterns/llm-plus-planner-validation — the output-side counterpart. Workload-aware prompting shapes the input to the LLM; planner validation filters the output. Together they bracket the LLM.
  • patterns/explain-before-execute-validation — sibling pattern for LLM-generated SQL queries at execution time. Pinterest Analytics Agent applies similar discipline at the runtime boundary.
  • patterns/analytical-intent-retrieval — Pinterest's pattern of using query history as a retrieval corpus rather than table descriptions. Both patterns share the insight that domain-specific telemetry > raw schema.

Instances

  • PlanetScale Insights AI-powered index suggestions (2026-04) — canonical wiki instance. Insights' per-query telemetry filters candidate queries down to the subset that actually warrant indexing; schema is filtered to just the referenced tables; the LLM is then asked for indexes with a citation requirement. See sources/2026-04-21-planetscale-ai-powered-postgres-index-suggestions.
