CONCEPT Cited by 1 source
Index selectivity¶
Selectivity is the fraction of a table's rows that a
query predicate against an indexed column is expected to
match — expressed as COUNT(DISTINCT col) / COUNT(*),
a number between 0.0000 and 1.0000. The higher the
ratio, the more effectively the index narrows the row set,
and the more attractive it is to the query optimiser when
multiple indexes could satisfy the same query.
(Source: sources/2026-04-21-planetscale-why-isnt-mysql-using-my-index.)
Formula and intuition¶
SELECT
COUNT(DISTINCT first_name) / COUNT(*) AS first_name,
COUNT(DISTINCT state) / COUNT(*) AS state
FROM people;
Aaron Francis's worked numbers on a 491,583-row people
table:
| column | selectivity |
|---|---|
id |
1.0000 |
first_name |
0.0060 |
state |
0.0000 (rounds down from ~4 × 10⁻⁶) |
A unique index is perfectly selective by definition:
COUNT(DISTINCT id) = COUNT(*), so the ratio is 1.0000.
Why it matters to the planner¶
When two or more indexes could answer a query, the MySQL optimiser picks the more selective one:
the optimizer has decided that the
first_nameindex is the best choice because thefirst_nameindex is more selective.— Aaron Francis, Why isn't MySQL using my index?
The reasoning is mechanical: more selective index → fewer
matching rows found in the B+tree walk → fewer clustered-
index walks to fetch full rows in the
two-step secondary-index
lookup. Francis's worked EXPLAIN on WHERE first_name =
'Aaron' AND state = 'TX' picks the first_name index:
possible_keys: first_name, state, key: first_name,
rows: 180. The state index with 2 distinct values is
structurally a worse choice: rows: ~246,944 estimated when
forced via USE INDEX (state) — a ~1,370× regression.
Selectivity vs cardinality¶
Selectivity and cardinality are related but distinct:
- Cardinality is the count of distinct values —
3,009 distinct
first_names on thepeopletable. - Selectivity is the ratio — 3,009 / 491,583 = 0.0060.
Cardinality alone is misleading without knowing the table
size. "Knowing that there are 3,009 distinct values in the
first_name column is interesting, but we have no idea if
that's a relatively high or low number compared to the
whole table!" The MySQL optimiser's persistent statistics
store cardinality (see
concepts/mysql-index-statistics) and derive selectivity
from it at query-plan time.
The skew caveat¶
Average selectivity is a rule of thumb, not a guarantee.
Skewed data breaks
the model: a type column with 99% "user" + 1% "admin"
has average selectivity of ~2 distinct / N rows ≈ 0, but
is highly selective for admin queries (matching 1% of
rows) and useless for user queries (matching 99%).
Francis canonicalises the caveat verbatim:
while checking average selectivity is a good rule of thumb, pay careful attention to unevenly distributed data.
— Aaron Francis, Why isn't MySQL using my index?
Mitigation: partial indexes (Postgres), functional indexes that narrow to the rare value, or tolerate that the index is only useful for the infrequent value.
Seen in¶
- sources/2026-04-21-planetscale-why-isnt-mysql-using-my-index — root-of-framing post: formula, worked 491,583-row example, optimiser-picks-more-selective rule, 1,370× regression when forcing wrong index, skew caveat.
Related¶
- concepts/index-cardinality — the count underneath the ratio.
- concepts/mysql-index-statistics — the sampled cardinality the planner actually uses.
- concepts/skewed-column-selectivity — when average selectivity lies.
- concepts/secondary-index — the two-step lookup cost selectivity minimises.
- concepts/composite-index — selectivity of the leading column dominates composite-index utility.
- concepts/mysql-explain — the diagnostic surface that exposes which index the planner picked.