CONCEPT Cited by 3 sources

Composite-index column order

The order in which columns appear in a composite index definition determines which queries can use the index and how efficiently — and the right order is a function of the query's predicates, the column selectivity for those predicates on this data distribution, and the leftmost-prefix rule. Rules of thumb that generalise across tables are wrong as often as they are right; composite-index column ordering is a workload-dependent design decision.

The PlanetScale backup case

The canonical illustration is David Graham's backup-table case study:

Using the query statistics report we discovered this query takes an average of 719ms to find expired backups to delete. … We think this query should be faster because we have a composite index on (expires_at, data_deleted_at, deleted_at). However, the MySQL explain plan shows that the query planner chooses a full table scan over every row rather than using this index.

sources/2021-08-31-planetscale-optimizing-sql-with-query-statistics

The diagnosis: the leading column was the least selective predicate, not the most.

  • 97 % of rows matched expires_at <= CURRENT_TIMESTAMP → the leading column narrows nothing.
  • 97.5 % of rows had deleted_at IS NOT NULL → filtering for deleted_at IS NULL was highly selective.

Reordering to (deleted_at, expires_at, data_deleted_at) turned a full table scan (type: ALL, hundreds of thousands of rows visited) into a single-row range scan (type: range, rows: 1, filtered: 100.00). Average runtime dropped from 719 ms to under 20 ms — a 98 % reduction.
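The effect of putting the selective predicate first is easy to reproduce in miniature. A sketch using SQLite via Python's `sqlite3` (SQLite's planner differs from MySQL's in detail, and the table, data skew, and index name here are illustrative assumptions, not the real schema): with `deleted_at IS NULL` leading the index, the plan seeks instead of scanning.

```python
import sqlite3

# Hypothetical miniature of the backups table; the schema and the
# 97.5%-soft-deleted skew are assumptions modeled on the case study.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE backups (
    id INTEGER PRIMARY KEY,
    expires_at TEXT,
    deleted_at TEXT,
    data_deleted_at TEXT)""")
rows = [(i, "2021-01-01", "2021-01-02" if i % 40 else None, None)
        for i in range(1, 1001)]  # only 1 in 40 rows has deleted_at IS NULL
con.executemany("INSERT INTO backups VALUES (?, ?, ?, ?)", rows)

# The reordered index: the rare, selective predicate leads.
con.execute("""CREATE INDEX idx_backups
               ON backups (deleted_at, expires_at, data_deleted_at)""")

plan = con.execute("""EXPLAIN QUERY PLAN
    SELECT id FROM backups
    WHERE deleted_at IS NULL AND expires_at <= '2021-08-31'""").fetchall()
# Expect a SEARCH ... USING INDEX plan (a seek), not SCAN (a full scan).
print(plan[0][3])
```

SQLite treats `IS NULL` on the leading index column as a seekable equality constraint, so the plan line reports a `SEARCH` on `idx_backups` rather than a `SCAN` of the table.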

Why column order matters

A composite index on (A, B, C) is a single B+tree sorted lexicographically — first by A, then within each A-value by B, then within each (A, B)-pair by C. This gives the index:

  • Fast equality / range lookup on A (the whole tree is keyed by A at the top level).
  • Fast A = ? AND B = ? (once you've found the A-subtree, it's sorted by B).
  • Fast A = ? AND B = ? AND C = ? (same, one more level).

But:

  • Filtering on B alone requires scanning every A-value's subtree — useless. The leftmost-prefix rule means the optimiser won't even consider the index for this query.
  • Filtering on A with a wide range (e.g. 97 % of rows match) narrows the set to 97 % of the table — not enough to make index-based access cheaper than a full scan, so the optimiser rejects the index.
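The leftmost-prefix rule above can be observed directly. A minimal sketch using SQLite via `sqlite3` (table and index names are made up for illustration; MySQL behaves the same way for the prefix rule, though its `EXPLAIN` output differs):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a INT, b INT, c INT, payload TEXT)")
con.execute("CREATE INDEX idx_abc ON t (a, b, c)")
con.executemany("INSERT INTO t VALUES (?, ?, ?, 'x')",
                [(i % 10, i % 100, i) for i in range(1000)])

def plan(sql):
    # EXPLAIN QUERY PLAN rows are (id, parent, notused, detail).
    return [r[3] for r in con.execute("EXPLAIN QUERY PLAN " + sql)]

# Leading-column predicate: the index is usable (SEARCH ... USING INDEX).
print(plan("SELECT payload FROM t WHERE a = 3"))
# Second-column predicate alone: no leftmost prefix, so a full table scan.
print(plan("SELECT payload FROM t WHERE b = 3"))
```

Selecting `payload` (a column outside the index) rules out a covering-index scan, so the `b = 3` query has no way to use `idx_abc` at all.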

So the leading column should be (in priority order):

  1. A column the query filters on with an equality or narrow range (otherwise you can't enter the index at all).
  2. Among qualifying columns, the most selective for this query's predicates — i.e. the one that eliminates the largest fraction of rows.

The workload-dependent surprise

David's own framing makes the lesson precise:

We often include deleted_at as a trailing key in composite indexes for this reason, but this isn't quite right for the backup table.

sources/2021-08-31-planetscale-optimizing-sql-with-query-statistics

The general convention at PlanetScale — "most rows are live, so deleted_at IS NULL matches most of the table and belongs at the back of the index" — is inverted for the backup table. Daily backups replace their predecessors, so 97.5 % of rows are soft-deleted and deleted_at IS NULL is the rare case. The column ordering that is correct for most tables is exactly wrong for this one. See also skewed column selectivity for why selectivity is always predicate- and workload-dependent.

Mechanical rule

For a query WHERE A = x AND B = y AND C > z ORDER BY D, a good starting composite index is:

  1. Equality predicates first: A, B in any order that matches their selectivity for the query. Equality predicates can each narrow the lookup by one key-level.
  2. One range predicate last among the key columns: C (the > predicate). Only the first range predicate uses the B+tree; subsequent range predicates become post-filter work.
  3. Then order-by columns (if including them avoids a filesort). For PlanetScale's backup case, leaving ORDER BY id out of the index leaves Extra: Using filesort in the plan — benign here because the result set is tiny.
  4. Then covered output columns (if you want a covering index that avoids the clustered-index lookup).

This yields the textbook mnemonic: equality, then range, then order, then cover.
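The equality-then-range shape is visible in the plan itself. A sketch using SQLite via `sqlite3` (hypothetical table; SQLite's `USE TEMP B-TREE FOR ORDER BY` is the analogue of MySQL's filesort mentioned in step 3):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a INT, b INT, c INT, d INT)")
# Index built by the mechanical rule: equality columns, then the one range column.
con.execute("CREATE INDEX idx_abc ON t (a, b, c)")

plans = [r[3] for r in con.execute("""EXPLAIN QUERY PLAN
    SELECT d FROM t
    WHERE a = 1 AND b = 2 AND c > 3
    ORDER BY d""")]
# One line shows the index resolving a=?, b=?, then the c>? range;
# a separate line shows the sort on d done outside the index.
for p in plans:
    print(p)
```

Because `d` sits after the range column's position in the lookup, leaving it out of the index forces the explicit sort step — benign when the result set is small, exactly as in the backup case.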

Verification

The only reliable way to validate a column-ordering decision is EXPLAIN:

  • type: ALL with key: NULL → the optimiser rejected the index. Column order is probably wrong.
  • type: range or type: ref with your index in key → the optimiser is using the index as intended.
  • rows estimate close to the actual result-set size → the index is narrowing the search efficiently.
  • filtered: 100.00 → every row the index returns is in the result set (no post-filter).

See patterns/explain-driven-index-verification for the full discipline.

Not just MySQL

The same reasoning applies to composite indexes in PostgreSQL and every other relational database using B-tree indexes. The specific leftmost-prefix consequences and the optimiser's selectivity heuristics differ in detail, but the underlying rule — the order of columns in the index key is the order in which they narrow the lookup — is universal across B-tree indexes.

Trade-offs

  • More indexes vs. better-ordered indexes. If two different queries want different leading columns, you may need two indexes — but each index has a write cost (index write cost), so at scale you want the fewest that cover your patterns.
  • Index column order affects sort. An index on (A, B) can serve ORDER BY A, B ASC for free, but not ORDER BY B, A or ORDER BY A DESC, B ASC.
  • Column order shows up in key_len. MySQL's key_len in EXPLAIN tells you how many leading columns of a composite index the query is actually using — a quick audit for whether the column order matches the query.
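The sort trade-off is the easiest of these to observe directly. A sketch using SQLite via `sqlite3` (its `USE TEMP B-TREE FOR ORDER BY` plan line is the analogue of MySQL's filesort; table and index names are illustrative):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a INT, b INT)")
con.execute("CREATE INDEX idx_ab ON t (a, b)")

def needs_sort(sql):
    # True if the plan includes an explicit sort step (SQLite's filesort analogue).
    details = [r[3] for r in con.execute("EXPLAIN QUERY PLAN " + sql)]
    return any("TEMP B-TREE" in d for d in details)

print(needs_sort("SELECT a, b FROM t ORDER BY a, b"))  # False: index order reused
print(needs_sort("SELECT a, b FROM t ORDER BY b, a"))  # True: must sort explicitly
```

The index's lexicographic order matches `ORDER BY a, b` exactly, so that query reads rows pre-sorted; `ORDER BY b, a` cannot reuse it and pays for a sort.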
