PATTERN Cited by 1 source
Cardinality-based unused-index detection¶
Intent¶
Identify candidate secondary indexes that are structurally
unused — either empty, degenerate, or never populated with
distinct values — by querying
information_schema.STATISTICS for rows where
cardinality = 0 OR cardinality IS NULL. First-pass
filter for index audits without production telemetry.
Context¶
You have a MySQL database that has accumulated secondary indexes over time: some from initial schema design, some added during past performance incidents, some from speculative "just in case" additions. You suspect some are carrying unneeded write-amplification + storage cost (concepts/secondary-index), but you don't have production query-level telemetry to know which ones. You need a cheap static-time filter to generate a candidate drop list that you'll then validate individually.
Mechanism¶
Query information_schema.STATISTICS — the standard MySQL
per-index metadata view populated from
mysql.innodb_index_stats / per-engine stats — filtering
for indexes whose recorded cardinality is zero or NULL:
SELECT table_name, index_name, non_unique, seq_in_index,
column_name, collation, cardinality, sub_part,
packed, index_type, comment, index_comment
FROM information_schema.STATISTICS
WHERE table_schema = 'your_database_name'
AND index_name != 'PRIMARY'
AND (cardinality IS NULL OR cardinality = 0)
ORDER BY table_name, index_name, seq_in_index;
(Canonical source: sources/2026-04-21-planetscale-what-are-the-disadvantages-of-database-indexes)
Rows returned are candidates for removal. For each candidate:
- Optionally confirm with
SELECT COUNT(*) FROM table— if the table is non-empty, the zero-cardinality index is suspicious on its own. - Flip candidate to
INVISIBLEand measure production workload; this is the validation step. - If invisible-state performance is fine over at least one
full workload cycle, commit via
DROP INDEX. - If any query regresses, flip back to
VISIBLEimmediately.
Why it works (when it works)¶
information_schema.STATISTICS.cardinality stores the
number of distinct indexed-column values seen by the
storage engine's sampler (InnoDB samples
innodb_stats_persistent_sample_pages random B+tree pages
per table by default). A value of zero or NULL typically
indicates one of:
- The index was just created and never populated (rare in production).
- The table is empty (usually a lookup table that was never used).
- Every row shares the same value for the indexed column (degenerate — the index has no filtering power).
- The stats sampler hasn't run or has been disabled.
In each case the index is carrying its write-amplification + storage cost with zero read benefit, so dropping it is safe even absent measured-usage telemetry.
When this pattern is insufficient¶
The cardinality-equals-zero heuristic misses the far
more common production scenario: an index with high
cardinality (distinct values aplenty) that no production
query actually uses. Example: a CREATE INDEX email_idx ON
user(email) that was added in 2019 for a feature that got
deprecated in 2020 — cardinality is healthy (millions of
distinct emails) but the index is dead weight. The
cardinality heuristic is blind to this case by construction.
For this class of unused-but-high-cardinality index,
operators need production-measured query-level
telemetry: see
Hazen
2024 Tracking index usage with Insights for
PlanetScale Insights'
implementation via a patched
InnoDB index_init()
hook populating a
per-query used-index
set, surfaced as time-series in the Insights Indexes
tab. That telemetry is the measured-usage
counterpart to this pattern's structural heuristic.
Consequences¶
Benefits¶
- Zero cost — one SQL query against an in-memory metadata view; runs in milliseconds on any MySQL.
- Standard MySQL — no schema changes, no observability infra, no third-party tools.
- Safe by construction — returns a candidate list; the actual drop decision is still downstream.
- Pre-observability first-pass — operators without production telemetry stacks can still get useful signal.
Costs / caveats¶
- Lossy heuristic — an index with
cardinality = 0might be in active use on an empty table that is about to be bulk-loaded; dropping it would then require recreation. Verify table size before drop. - Misses unused-but-populated indexes (the common case). High-cardinality indexes never touched by production queries pass this filter. This pattern is necessary-but-not-sufficient for index auditing.
- Stats-sampler dependency — if
innodb_stats_persistent = OFFor the sampler hasn't run since a bulk load,cardinalitycan show zero for genuinely active indexes. CheckANALYZE TABLErecency before trusting the signal. - Does not measure usage. The name
information_schema.STATISTICS.cardinalityis not "usage count" — it's "distinct-value count." Operators who mistake it for a usage counter will drop actively-used indexes on high-cardinality but rarely-queried columns. - Destructive without invisibility validation. The candidate list is not an "automatically safe to drop" list — each candidate still needs the invisible-validate-drop cycle to confirm no production query relies on it.
Comparison to sibling patterns¶
| Pattern | Detection signal | Strength | Weakness |
|---|---|---|---|
| Cardinality-based (this pattern) | information_schema.STATISTICS.cardinality = 0 |
Zero infra cost | Misses populated-but-unused indexes |
Handler-counter table_io_waits |
performance_schema.table_io_waits_summary_by_index_usage.count_star = 0 |
Measures actual access | Table-granularity, SELECT-only |
patterns/bidirectional-index-query-attribution / Insights Indexes tab |
Per-query used-index set via index_init() hook |
Per-query-pattern time-series | PlanetScale-only; SELECT-only caveat |
The three patterns are complementary altitudes, not
mutually exclusive: cardinality filter is the cheap
first-pass; performance_schema is the OSS-MySQL
measured-usage signal; PlanetScale Insights is the
production-telemetry canonical.
Seen in¶
- sources/2026-04-21-planetscale-what-are-the-disadvantages-of-database-indexes
— JD Lien (PlanetScale, 2023-02-17) canonicalises the
information_schema.STATISTICScardinality query as the detection half of the two-step safe-drop workflow paired with invisible-index validation. Verbatim: "This query checks the cardinality of each index, which is the number of unique values in the index. If this value is 0, then the index is not being used." Pragmatic first-pass filter for pre-observability operators; the post itself flags the lossiness by recommending invisibility validation before the actual drop.
Related¶
- concepts/secondary-index — the thing being audited.
- concepts/mysql-invisible-index — the validation step that composes with this detection pattern.
- patterns/bidirectional-index-query-attribution — production-telemetry sibling pattern (PlanetScale Insights implementation).
- patterns/index-usage-per-table-diagnostic —
performance_schema-based sibling pattern (OSS MySQL). - concepts/index-usage-time-series — the production- telemetry output shape the telemetry patterns produce; strictly richer than this pattern's binary zero-or-nonzero signal.
- concepts/select-only-index-telemetry-caveat — the
structural caveat that limits both
index_init-hook andperformance_schemapatterns toSELECT-path telemetry (affecting which drop candidates are safe to actually drop). - systems/mysql — the engine;
information_schemasurface.