Index write cost¶
Every secondary index is a tax
on writes. Adding an index speeds up reads that can use it
and slows down every INSERT and UPDATE that touches an
indexed column. Three distinct cost axes compound: storage,
write latency, and planner overhead. This is the central
trade-off that makes "add an index to every column" an
anti-pattern — indexes are a scarce resource, not a free
upgrade.
Canonical framing¶
Justin Gage names the three-part cost structure in one paragraph:
One thing to remember about indexes is that while they're beneficial for read queries (including JOINs and aggregates), they can negatively impact your database performance as a whole if you're not careful. Indexes take up a lot of space, even if that space is smaller than the initial table, and that space is precious; you'll need to keep an eye on how close you're inching towards your filesystem's limit. They also make INSERT queries take longer and force the query engine to consider more options before choosing how to execute a query. — sources/2026-04-21-planetscale-how-do-database-indexes-work
The three cost axes¶
1. Storage¶
Every secondary index is a separate B+tree that duplicates the indexed column(s) plus pointers to the base rows. For a 10-column 1-TB table:
- 1 primary-key B+tree (the table itself, in a clustered engine)
- N secondary-index B+trees, each holding (indexed column + PK pointer) per row
Even though each secondary index's per-row payload is
smaller than the full row, each index tree has a similar
depth (same row count, comparable fan-out), so the
aggregate storage footprint of N indexes often exceeds
the base table once N ≥ 4–5. Gage's framing
("space is precious … filesystem's limit") is load-
bearing: indexes compete with the data itself for disk
and for page-cache capacity.
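The storage axis is easy to observe directly. A minimal sketch using SQLite as a stand-in (the `users` table, column names, and row count are invented for the demo; SQLite's `PRAGMA page_count` reports total pages allocated to the file, tables and indexes together):

```python
import sqlite3

# Illustrative only: demonstrate that each secondary index is its own
# B-tree and therefore grows the database file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO users (name, state) VALUES (?, ?)",
    [(f"user{i}", f"state{i % 50}") for i in range(10_000)],
)
conn.commit()

def pages():
    # Total pages allocated to the database (base table + all indexes).
    return conn.execute("PRAGMA page_count").fetchone()[0]

before = pages()
conn.execute("CREATE INDEX idx_name ON users (name)")
conn.execute("CREATE INDEX idx_state ON users (state)")
conn.commit()
after = pages()

# The file grows even though neither index stores the full row.
print(f"pages before indexes: {before}, after: {after}")
```

The same measurement in MySQL would go through `information_schema.TABLES` or `SHOW TABLE STATUS`; the principle is identical.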
2. Write latency¶
Every INSERT writes to the clustered index plus every
secondary index. Every UPDATE that changes an indexed
column writes to the clustered index plus every index that
covers that column. For N secondary indexes on a table:
- INSERT cost: 1 + N B+tree inserts per row (ignoring buffer-pool locality).
- UPDATE col = ... cost: 1 + (# indexes that include col) B+tree updates.
- DELETE cost: 1 + N B+tree deletes.
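These multipliers can be written down as a toy cost model (a sketch only; the function and its accounting are illustrative, not any real engine's bookkeeping):

```python
def btree_writes(op: str, n_secondary: int, touched_indexes: int = 0) -> int:
    """Number of B+tree structures written for one row operation.

    op              -- "insert", "delete", or "update"
    n_secondary     -- total secondary indexes on the table
    touched_indexes -- for UPDATE: how many indexes cover the changed column(s)
    """
    if op in ("insert", "delete"):
        # Clustered index plus every secondary index.
        return 1 + n_secondary
    if op == "update":
        # Clustered index plus only the indexes covering the changed column.
        return 1 + touched_indexes
    raise ValueError(f"unknown op: {op}")

# A table with 5 secondary indexes:
print(btree_writes("insert", 5))                      # 6 B+tree inserts per row
print(btree_writes("update", 5, touched_indexes=2))   # 3 B+tree updates
```

The asymmetry is the point: UPDATEs on non-indexed columns stay cheap, while every INSERT and DELETE pays the full 1 + N.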
Each B+tree write can trigger page splits, hot-page
contention, and buffer-pool evictions. On high-write
workloads (OLTP, append-heavy time-series), adding a fifth
or sixth secondary index to an already-hot table often
visibly regresses INSERT throughput. Gage's framing:
"they also make INSERT queries take longer."
3. Planner overhead¶
The query optimiser considers every candidate index that could serve a given predicate. For each query:
- Retrieve index statistics (cardinality, histogram) per candidate
- Estimate rows-read / rows-returned per candidate
- Choose the cheapest plan
Adding more indexes directly enlarges possible_keys (see
concepts/mysql-explain's possible_keys column) and
increases the planner's branching factor. In pathological
cases the planner picks a suboptimal candidate (see
"Why isn't MySQL using my index?", the first_name vs
state canonical example). Gage's framing: indexes
"force the query engine to consider more options before
choosing how to execute a query."
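The candidate-set growth can be seen in miniature with SQLite's EXPLAIN QUERY PLAN (a stand-in for MySQL's EXPLAIN and its possible_keys column; the `people` table and index names are invented for the demo):

```python
import sqlite3

# Illustrative only: with two candidate indexes, the planner must cost
# both and pick one using collected statistics.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, first_name TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO people (first_name, state) VALUES (?, ?)",
    [(f"name{i}", f"s{i % 5}") for i in range(1_000)],
)
conn.execute("CREATE INDEX idx_first_name ON people (first_name)")
conn.execute("CREATE INDEX idx_state ON people (state)")
conn.execute("ANALYZE")  # gather the statistics the planner costs candidates with

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM people WHERE first_name = ? AND state = ?",
    ("name1", "s1"),
).fetchall()
# Both indexes are candidates for this predicate; only one is chosen.
print(plan)
```

Every additional index on first_name or state would add another candidate the planner must cost before it can execute this query.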
Canonical quantification (not given in the post)¶
Gage's cost framing is qualitative. The post does not
quantify the N+1 multiplier, INSERT throughput regression
per additional index, or the planner-overhead / plan-time
cost at specific N. Downstream PlanetScale posts (Lien's
What are the disadvantages of database indexes?) develop
the write-cost discipline further — proposing
invisible indexes as the
safe-drop validation primitive when auditing unused indexes.
Mitigations¶
Three standard disciplines for keeping index write cost in check:
- Audit unused indexes. Use
  patterns/cardinality-based-unused-index-detection and
  performance_schema.table_io_waits_summary_by_index_usage to find indexes with zero reads, then drop them (or first mark them invisible).
- Prefer composite indexes over many single-column indexes. One three-column index on (a, b, c) satisfies queries filtering on a, (a, b), and (a, b, c) via the leftmost-prefix rule, which is cheaper on the write path than three separate single-column indexes.
- Make indexes covering only when profitable. A covering index widens the secondary-index leaves, further inflating write cost. Reserve covering indexes for hot read paths whose projection is stable.
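The leftmost-prefix rule can be checked empirically. A minimal sketch, again using SQLite's EXPLAIN QUERY PLAN as a stand-in (the table `t` and index `idx_abc` are invented; MySQL's EXPLAIN shows the analogous behavior):

```python
import sqlite3

# Illustrative only: one composite index on (a, b, c) serves prefix
# predicates, but not a predicate on c alone.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [(i, i, i) for i in range(100)])
conn.execute("CREATE INDEX idx_abc ON t (a, b, c)")

def plan(where):
    rows = conn.execute(f"EXPLAIN QUERY PLAN SELECT * FROM t WHERE {where}").fetchall()
    return " ".join(row[-1] for row in rows)

print(plan("a = 1"))            # index SEARCH: a is the leftmost prefix
print(plan("a = 1 AND b = 2"))  # index SEARCH: (a, b) is a prefix
print(plan("c = 3"))            # SCAN, not SEARCH: c alone is not a prefix
```

One write-path B+tree thus serves three read shapes; three single-column indexes would triple the per-row write cost for the same coverage of a-prefix queries.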
Seen in¶
- sources/2026-04-21-planetscale-how-do-database-indexes-work — Justin Gage's canonical three-part framing (storage + write latency + planner overhead).
- sources/2026-04-21-planetscale-what-are-the-disadvantages-of-database-indexes — JD Lien's dedicated 2023-02-17 essay on the same cost discipline, with invisible-indexes as the safe-drop validator and cardinality-based-unused-index-detection as the audit mechanism.
Related¶
- concepts/secondary-index — the primitive whose write cost this concept captures
- concepts/composite-index — one composite index is often cheaper than N single-column indexes
- concepts/covering-index — widens the write-cost surface further
- concepts/mysql-invisible-index — the safe-drop validation mechanism
- patterns/cardinality-based-unused-index-detection — the audit pattern
- systems/mysql
- systems/postgresql