CONCEPT Cited by 1 source

MySQL index statistics

MySQL index statistics are the sampled cardinality values MySQL stores per index to inform the query optimiser's index-selection decisions. The statistics are estimates, not exact counts — scanning an entire table to recompute cardinality on every planner invocation would be prohibitively expensive at scale, so MySQL samples random index pages and refreshes the stored value on a change-threshold trigger.

(Source: sources/2026-04-21-planetscale-why-isnt-mysql-using-my-index.)

How the statistics are collected

InnoDB samples random pages from each index, counts distinct key values across the sample, and extrapolates to the full table. The sampled cardinality lands in mysql.innodb_index_stats (for persistent stats) and is surfaced via:

  • SHOW INDEXES FROM tablename — per-index Cardinality column.
  • SELECT * FROM information_schema.STATISTICS WHERE TABLE_NAME = 'tablename' — queryable form.
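The stored values can also be inspected at their source. A minimal sketch, assuming persistent statistics are enabled (the MySQL default) and a hypothetical schema name mydb:

```sql
-- Inspect the raw persistent statistics InnoDB stores for one table.
-- stat_name values like 'n_diff_pfx01' hold the sampled distinct-value
-- counts per index prefix; sample_size is the number of pages sampled.
SELECT index_name, stat_name, stat_value, sample_size, last_update
FROM mysql.innodb_index_stats
WHERE database_name = 'mydb'   -- hypothetical schema name
  AND table_name    = 'people';
```

The last_update column tells you when the 10% trigger or a manual ANALYZE last refreshed the numbers, which is useful when deciding whether staleness is the problem.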

Aaron Francis's worked example:

| Table  | Non_unique | Key_name   | Cardinality | Visible |
|--------|-----------:|------------|------------:|---------|
| people |          0 | PRIMARY    |      491583 | YES     |
| people |          1 | first_name |        3028 | YES     |
| people |          1 | state      |           1 | YES     |
The numbers are estimates — the state column likely has 2 distinct values (TX and one other) but the sample returned 1.

Auto-refresh at the 10% threshold

These statistics are automatically updated after 10% of a table has changed. This happens in the background, and you shouldn't ever notice it.

— Aaron Francis, Why isn't MySQL using my index?

The 10% threshold comes from the InnoDB persistent stats docs. Rows inserted, updated, or deleted count toward the change budget; once the delta exceeds 10% of the row count at the last refresh, InnoDB recomputes the sample in the background.
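The behaviour behind the threshold is controlled by server variables and can be pinned per table. A sketch, assuming MySQL 8.0 defaults:

```sql
-- Server-wide knobs behind the background recalculation.
SHOW VARIABLES LIKE 'innodb_stats_auto_recalc';             -- ON by default
SHOW VARIABLES LIKE 'innodb_stats_persistent_sample_pages'; -- 20 by default

-- Opt a single volatile table out of background recalculation, so its
-- statistics only change when you run ANALYZE TABLE explicitly.
ALTER TABLE people STATS_AUTO_RECALC = 0;
```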

Manual refresh via ANALYZE TABLE

When stale statistics cause the planner to pick a bad index, the operator can force an immediate recalculation:

ANALYZE TABLE people;

This blocks briefly while the sampling runs but is safe to run in production. The typical occasion is right after a bulk load or mass update that changed the data distribution dramatically but, on a sufficiently large table, not by enough to cross the 10% auto-refresh threshold.

Why staleness misleads the planner

The planner trusts the stored cardinality when ranking candidate indexes by selectivity. If the stats claim a column has 1 distinct value when it now has 1,000, the planner will underestimate the index's value and pick a worse plan. Symptoms: EXPLAIN shows the expected index in possible_keys but it isn't chosen, and rows estimates are wildly off from actual row counts.
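The mismatch shows up directly in EXPLAIN output. A minimal diagnostic sketch against the people table from the worked example above:

```sql
-- If 'state' appears in possible_keys but key is NULL (or a worse index),
-- and the rows estimate is far from reality, suspect stale statistics.
EXPLAIN SELECT * FROM people WHERE state = 'TX';

-- Compare the planner's rows estimate with the real count:
SELECT COUNT(*) FROM people WHERE state = 'TX';

-- Refresh the statistics and re-check the chosen plan:
ANALYZE TABLE people;
EXPLAIN SELECT * FROM people WHERE state = 'TX';
```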

Per-table tuning

If statistics sampling is too coarse for a particular table, the per-table override syntax lets you tune the sample size:

ALTER TABLE tablename STATS_SAMPLE_PAGES = 100;

Higher values produce more accurate stats at the cost of longer refresh time. The default is 20 (innodb_stats_persistent_sample_pages). Francis points at the MySQL docs for the details: "This is beyond the scope of this article, but the MySQL documentation has a comprehensive page on the topic."
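Because the table option only takes effect at the next recalculation, a follow-up ANALYZE shows whether the larger sample sharpened the estimate. A sketch:

```sql
-- Raise the sample from the 20-page default, then force a recalculation.
ALTER TABLE people STATS_SAMPLE_PAGES = 100;
ANALYZE TABLE people;

-- The state index's Cardinality should now sit closer to the true
-- distinct-value count than the earlier sampled value of 1.
SHOW INDEXES FROM people;
```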

Operational recipes

| Symptom                                    | Diagnostic                            | Fix                                    |
|--------------------------------------------|---------------------------------------|----------------------------------------|
| Planner picks wrong index after bulk load  | EXPLAIN shows stale rows estimate     | ANALYZE TABLE t;                       |
| Planner ignores correct index              | SHOW INDEXES cardinality looks wrong  | ANALYZE TABLE t;                       |
| Repeated bad plans on large volatile table | Auto-refresh threshold too coarse     | Increase STATS_SAMPLE_PAGES per table  |
| Cardinality 0 or NULL on live table        | Unused-index heuristic                | Validate before drop                   |
