PATTERN

Composite index for multi-column AND-predicate¶

Pattern: when a high-frequency query filters on colA = x AND colB = y AND …, a single composite index on (colA, colB, …) is the correct index design — not two separate single-column indexes and not relying on Index Merge. Verify the before/after with EXPLAIN: the type column should flip from ALL → ref and the rows column should collapse from N → 1 (or close to the result cardinality).

(Source: .)

The canonical worked example¶

MySQL Employees sample database, 300k rows:

-- Question: find "Kendra Puppo"
SELECT * FROM employees
WHERE last_name = 'Puppo' AND first_name = 'Kendra';

Baseline — no index. EXPLAIN output:

type  : ALL
key   : NULL
rows  : 299,202

Full table scan. The optimiser reads the entire employees table to find one row.

Anti-pattern — two single-column indexes. Creating indexes on last_name and first_name separately lets MySQL find all Puppos or all Kendras, but:

If you create two separate indexes in this way, MySQL knows how to find all employees named Puppo. It also knows how to find all employees named Kendra. However, it doesn't know how to find people named Kendra Puppo.

—

MySQL's options are all worse than a composite index: use one index and post-filter, use Index Merge and pay the intersection cost, or ignore both and table-scan. "MySQL may decide not to use multiple indexes; even if it does, in many scenarios, they won't serve the purpose as well as a dedicated index."

Correct fix — composite index.

CREATE INDEX fullnames ON employees(last_name, first_name);

EXPLAIN after:

type  : ref
key   : fullnames
rows  : 1

Same query, same result, 299,202× fewer rows examined.

Why it works — mental model¶

A composite index is "a phone book placed inside another." The outer phone book is sorted by last_name; inside each last-name section is an inner phone book sorted by first_name. Finding ('Puppo', 'Kendra') is a single B+tree descent — the tree navigates to the Puppo subtree, then to Kendra within it. One I/O path, one leaf, one row.

Two separate indexes can't collapse the work this way: they each navigate one of the two predicates, leaving the other as a post-filter or requiring an Index Merge intersection.

Verification protocol¶

The EXPLAIN-based feedback loop is the contract:

Run the query unindexed. EXPLAIN → note type: ALL, rows: N.
CREATE INDEX name ON table(col1, col2, …) with the columns in the AND-predicate.
Re-run EXPLAIN. Verify:
type is ref (or eq_ref / const if the columns form a unique key).
key is your new index name.
key_len is approximately the sum of all indexed column widths — not just the leading column.
rows is the expected output cardinality (ideally 1 for point queries).
If key_len shows only the leading column's width, the later predicates aren't binding the index — check whether one is actually a range (>, <, BETWEEN), which breaks the prefix rule at that point.

Column order¶

Which column comes first in the composite index depends on the workload:

All queries filter on all columns → either order works; prefer the narrower column first to keep the B+tree shallower.
Some queries filter on only the leading subset (e.g. WHERE last_name = 'Puppo' by itself is common) → put the always-filtered column first, so the index is usable for both the full-predicate and leading-predicate workloads.
Selectivity optimisation → put the highest-cardinality / most-selective column first if both workloads exist and query mix is uncertain.

When NOT to apply¶

OR-predicates — WHERE colA = x OR colB = y — a composite index doesn't help; the leftmost-prefix rule doesn't apply to disjunctions. Two separate indexes (potentially with Index Merge) are the right shape here.
Range on leading column — if the leading column of the composite index is always in a range predicate, the subsequent equality predicates on later columns won't narrow the index walk — the range already broke the prefix chain. Reorder the index or add a different one.
Pure equality on very wide columns — if the joint key is wide enough that the index becomes bulky and the workload is strict equality only, a single BINARY(16) generated hash column with a UNIQUE INDEX is a compact alternative that trades range capability for index narrowness.

Relationship to other patterns¶

patterns/composite-hash-uniqueness-constraint — the hash variant: one 16-byte index replaces the composite index for strict-equality workloads. Different optimisation axis (index compactness vs range/prefix capability), same underlying "multi-column uniqueness / equality lookup" job.
patterns/generated-hash-column-for-equality-lookup — single-column hash variant for TEXT/BLOB-prefix-limited lookups.

Seen in¶

— canonical wiki source: Savannah Longoria walks the full before/after on the MySQL Employees sample dataset (Kendra Puppo, 299,202 rows → 1 row), names the two-indexes anti-pattern explicitly, and renders the phone-book mental model.