PLANETSCALE 2024-04-24 Tier 3

PlanetScale — The MySQL adaptive hash index¶

Summary¶

Ben Dicken (PlanetScale, 2024-04-24, re-fetched 2026-04-21) publishes a pedagogical primer on MySQL's Adaptive Hash Index (AHI) — an in-memory hash cache InnoDB maintains on top of its B+tree indexes, keyed on frequently looked-up index values and pointing directly into pages inside the buffer pool. Dicken frames the AHI as a runtime-adaptive overlay that converts already-fast O(log N) B+tree descents into O(1) hash probes for the subset of the workload InnoDB observes is hot. Two benchmark runs at 390M rows demonstrate a 16% speed-up on single-value lookups (14,044 QPS → 16,701 QPS) and a 20% speed-up on small-hot-set lookups (9,232 QPS → 11,562 QPS). AHI is controlled by the innodb_adaptive_hash_index configuration option (default on), operationally observable via SHOW ENGINE INNODB STATUS \G; under the INSERT BUFFER AND ADAPTIVE HASH INDEX section.

Key takeaways¶

InnoDB does not support on-disk HASH indexes; the AHI is the in-memory substitute. "If you try to create an index with USING HASH on an InnoDB-powered table, MySQL will instead create a B-tree index." Verified via information_schema.statistics: a CREATE INDEX ... USING HASH emits a silent BTREE index + warning.
AHI is a cache on top of the buffer pool, not on top of the B+tree. "The pointers in the adaptive hash index only point to data within the buffer pool." AHI entries are invalid if the target page is not buffer-pool-resident — so AHI and buffer pool form a two-tier dependency where the buffer pool is the substrate and the AHI is the overlay. "Thus, the buffer pool needs to be sufficiently large for the AHI to kick in. If it is small and there are a lot of evictions taking place, it is not worth using it." (Source: concepts/adaptive-hash-index.)
AHI adapts to workload. MySQL disables it automatically when conditions are wrong. "As its name suggests, the adaptive hash index (AHI) is constructed at runtime, and its usage adapts to the characteristics of your workload." + "MySQL is able to automatically adjust its use of the AHI based on the behavior it observes in the buffer pool. If conditions are not right for its use (few repeated lookups, small buffer pool, etc), MySQL will reduce or eliminate its use." This is the runtime- adaptive in-memory index pattern — the system detects repeated lookups against the same values and materialises a hash entry (with either the full value or a prefix) that short-circuits future B+tree descents.
AHI keys can be prefixes, not just full values. "If MySQL observes that a particular value is getting repeatedly looked up in a B-tree index, an entry in the AHI can be created either with the full value, or a prefix of the value." Prefix-keyed hash entries amortise the AHI across all lookups sharing a common prefix — a design choice that mirrors prefix-index mechanics on B+tree secondary indexes without incurring the prefix's selectivity penalty.
Benchmarked single-value lookup speed-up is 16% on a 390M-row table with a B+tree username index only 4 levels deep. 500,000 repeated executions of SELECT user_id, username, bio FROM user WHERE username = 'willpeace1'; drop from 35.6s → 29.94s. With AHI disabled: "0.00 hash searches/s, 418334.67 non-hash searches/s"; with AHI enabled: "350953.05 hash searches/s, 50985.01 non-hash searches/s" (some non-hash searches remain — "the query still needs to access the actual data in the row, not just the index value").
Benchmarked small-hot-set lookup speed-up is 20%. 1,000-value random pool (workload shaped like "a database and workload with a large amount of cold data, and a small amount of hot data") — 500k queries drop from 54.16s → 43.24s. Larger speed-up than the single-value case because the non-AHI baseline has more buffer-pool miss-rate variance across the 1,000-value working set.
Workload fit is load-bearing. AHI helps workloads with (a) repeated lookups on the same subset of index values, (b) a buffer pool large enough to hold the hot pages the AHI points into, (c) B+tree depths meaningful enough that O(log N) pointer-chases are a measurable fraction of query cost. Shallower trees + smaller buffer pools + more-random access patterns see less benefit; Dicken's username example is only 4 levels deep and still yields 16–20% — workloads "using deeper B-tree indexes may see even more performance improvement".
AHI has overhead. It can be disabled. "There is a bit of overhead to maintaining this special hash index. The feature can be enabled or disabled via the innodb_adaptive_hash_index configuration option. It is typically enabled by default, but if you have a workload that you know will not benefit from it, it can be disabled using innodb_adaptive_hash_index=0." Production incidents — AHI latch contention on very- high-concurrency workloads — have historically been the common reason to turn it off, though Dicken's post frames the workload-shape mismatch more broadly.

Systems / concepts / patterns extracted¶

Systems: MySQL, InnoDB.
Concepts: Adaptive Hash Index (new), InnoDB buffer pool, B+tree, B-tree, Secondary index, Clustered index, Cache hit rate, Working set memory.
Patterns: Runtime-adaptive in-memory index (new).

Operational numbers¶

Table size: 398,748,007 rows (~390M) user table.
B+tree depth for username secondary index: 4 levels (verified via innodb_ruby).
InnoDB page size: 16 KB default.
Hash table size: 276,707 entries at benchmark time (from SHOW ENGINE INNODB STATUS).
Benchmark 1 — single-value repeated lookup (500k iterations):
AHI off: 35.6 s, 14,043.57 QPS.
AHI on: 29.94 s, 16,701.1 QPS (+16%).
AHI on: 350,953 hash searches/s + 50,985 non-hash searches/s.
AHI off: 0 hash searches/s + 418,334 non-hash searches/s.
Benchmark 2 — 1000-value random hot-set lookup (500k iterations):
AHI off: 54.16 s, 9,231.62 QPS.
AHI on: 43.24 s, 11,562.05 QPS (+20%).
Config knob: innodb_adaptive_hash_index=1 (default on) / =0 (disabled).
Observation surface: SHOW ENGINE INNODB STATUS \G; → INSERT BUFFER AND ADAPTIVE HASH INDEX section.

Caveats¶

Pedagogy-voice post (~1,300 words, two worked benchmarks, no production incident retrospective). Ben Dicken's fourteenth wiki ingest; consistent with his "start from first principles, end with a benchmark" shape.
No coverage of AHI latch contention — the canonical production failure mode documented elsewhere (MySQL bug tracker, Percona blog) where AHI's btr_search_latch mutex becomes a bottleneck on very-high-concurrency workloads. Dicken frames AHI-off as a workload-fit decision, not a contention decision.
Hash table size 276707 is surfaced without explanation of how it's determined (InnoDB internally sizes AHI from buffer pool partition count + innodb_adaptive_hash_index_parts). Readers without prior context will see the number but not understand it.
Benchmark harness not disclosed — python3 same_query.py / load.py shape is evident but not open-source linked; no mention of connection pooling, client-server colocation, or statistical variance across runs.
Sequel to B-trees and database indexes (2024-09-09) — in fact published earlier (April vs September 2024), but the B-trees post is the foundational reference. The AHI post assumes buffer-pool + B+tree lookup mechanics as background.
Single-benchmark regime — only 500k iterations of the same or same-1000 queries. Real workloads have more turnover in the hot set; Dicken acknowledges "this specific workload is unrealistic" for the 1000-value case.
HASH vs BTREE index framing in MySQL ecosystem elides MEMORY engine (which does support real HASH indexes) — context a MySQL-novice reader would want.
No prefix-length / prefix-selection mechanics — Dicken mentions AHI can key on "the full value, or a prefix of the value" but doesn't expose how InnoDB decides which, what prefix length, or how prefix specialisation evolves as the workload shifts.
No mention of innodb_adaptive_hash_index_parts — the partition-count knob introduced in MySQL 5.7 to mitigate latch contention. Operators tuning AHI today read this knob first.

Source¶

Original: https://planetscale.com/blog/the-mysql-adaptive-hash-index
Raw markdown: raw/planetscale/2026-04-21-the-mysql-adaptive-hash-index-af850dba.md