Skip to content

CONCEPT Cited by 1 source

Hypothetical-index evaluation

Hypothetical-index evaluation is the practice of costing a proposed database index without actually creating it, by registering a hypothetical index structure with the query planner and running EXPLAIN to get a cost estimate. The canonical Postgres implementation is the HypoPG extension, which lets the planner "use [hypothetical indexes] in the context of EXPLAIN commands" while incurring "no overhead" (sources/2026-04-21-planetscale-ai-powered-postgres-index-suggestions).

Why it matters

Every real secondary index costs real money — storage, memory, every INSERT/UPDATE/DELETE must maintain it, and building it on a large table requires a rewrite step that can lock or throttle live traffic. If you're unsure whether an index is worth recommending, you don't want to pay those costs just to check.

Hypothetical-index evaluation separates the cost of measuring whether an index would help from the cost of applying it:

  • Measuring = cheap — register the hypothetical index, run EXPLAIN with and without it, compare costs, throw the hypothetical away.
  • Applying = expensive — create the real index on disk, pay the write tax forever.

Mechanics

The canonical Postgres pattern (HypoPG):

  1. Define the index structure (CREATE INDEX …) but don't execute it.
  2. Register the structure with HypoPG — the planner now treats it as an existing index.
  3. For each query the index is meant to help, run EXPLAIN <query> — the planner returns an estimated cost that incorporates the hypothetical index if it would be chosen.
  4. Compare that estimate against an EXPLAIN without the hypothetical index registered.
  5. If the cost delta is substantial, the index is a real candidate; otherwise discard it.

Because step 3 uses the actual Postgres planner — not a synthetic cost model — the estimate reflects how the real production planner would behave, including join-order selection, index-choice logic, row-count estimates, and selectivity.

Use cases

  • AI/LLM-generated index suggestions — the canonical instance. An LLM proposes CREATE INDEX statements; hypothetical-index evaluation filters out the ones the planner disagrees with. See patterns/llm-plus-planner-validation.
  • Manual DBA tuning — a DBA evaluating candidate indexes can cost them before building, avoiding wasted CREATE INDEX CONCURRENTLY runs on large tables.
  • Automated index-advisor tools — any tool that recommends indexes can use hypothetical evaluation to score candidates.

Trade-offs and limits

  • It's a planner estimate, not a measurement. The planner's cost model is an approximation; actual runtime after creating the index can differ. For candidate-filtering ("reject useless suggestions cheaply") the approximation is good enough. For precise speed-up prediction it isn't — you still need an A/B measurement after applying.
  • Write-side cost isn't modelled. Hypothetical evaluation shows the read-side benefit. Whether the index's write-side write amplification is worth the read benefit is a separate decision.
  • Index build cost isn't visible. Building a real index on a 10 TB table takes hours and I/O; the hypothetical evaluation costs microseconds.
  • Stale stats = wrong estimates. The planner's cost model depends on up-to-date table statistics (ANALYZE). Hypothetical-index cost estimates are only as good as the stats.

Seen in

Last updated · 319 distilled / 1,201 read