CONCEPT Cited by 1 source
HNSW index¶
Definition¶
An HNSW (Hierarchical Navigable Small World) index is a multi-layer proximity-graph ANN index with state-of-the-art recall / latency for in-memory vector search. It is the dominant in-memory vector index in practice — used by Lucene, FAISS, Vespa, Qdrant, and most mainstream vector DBs.
See systems/hnsw for the full system page.
Two structural properties¶
- RAM-bound. HNSW needs the full graph in memory for good performance; random-access edge traversal is the inner loop.
- No incremental reorganisation. Inserts extend the graph but full rebuilds are required periodically to maintain recall as the graph evolves — not compatible with OLTP write cadence.
Why these matter for databases¶
PlanetScale's 2024-10-22 vector-beta announcement names these exact two properties as the reason HNSW was rejected for a relational-database-hosted vector index:
"HNSW has very good query performance, but struggles to scale because it needs to fit its whole dataset in RAM. Most importantly, HNSW indexes cannot be updated incrementally, so they require periodically re-building the index with the underlying vector data. This is just not a good fit for a relational database."
—
Canonical wiki statement that "RAM-bound + no incremental updates" together disqualify HNSW for use inside a relational engine.
When it's the right choice¶
HNSW dominates when: corpus fits in RAM; writes are bulk / episodic; query latency at high recall is the performance target; the vector index can be an independent sidecar without transactional coupling to relational data.
Seen in¶
- — PlanetScale's original HNSW commitment (Nick Van Wiggeren, 2023-10-03). Verbatim: "we'll be implementing the state-of-the-art Hierarchical Navigable Small World (HNSW) algorithm, which constructs optimized graph structures that make it efficient to search vector similarity in large datasets." A year later, the 2024-10-22 vectors public-beta reversed this commitment on the RAM-bound + rebuild-required grounds now canonical above — the clearest case on the wiki of a stated algorithmic direction being reconsidered once relational- database constraints were surfaced. The 2023-10-03 post also doesn't name the structural properties; those articulations are a product of the intervening engineering work.
- — canonical wiki statement of HNSW's structural limitations as rejection criteria for a database-hosted vector index.
-
sources/2025-05-08-yelp-nrtsearch-100-incremental-backups-lucene-10 — Yelp Nrtsearch 1.0.0 exposes HNSW-index vector search via Lucene 10 (up to 4096 elements, float + byte vectors, multiple similarity types, scalar quantization for memory/recall tradeoff, nested- document support, intra-merge parallelism). Lucene's HNSW is the canonical production-search-engine deployment shape of this index type.
-
sources/2026-04-21-planetscale-larger-than-ram-vector-indexes-for-relational-databases — HNSW appears in two roles in PlanetScale's implementation. First, as a fully in-memory HNSW variant with transactional guarantees, shipped as an opt-in alongside the hybrid larger-than-RAM index: "our implementation of Vector Indexes in MySQL for PlanetScale supports fully in-memory HNSW indexes with transactional guarantees, with high performance and 99.9% recall if you're willing to support the trade-offs: you need to allocate enough memory to vector indexes in your database to fit your whole vector dataset, and incrementally updating these indexes is very expensive." Second, as the head index (centroid structure) of the hybrid SPANN-family larger-than-RAM index — replacing the SPANN paper's BK-Tree / K-D Tree choice because "an optimized implementation of HNSW beats these other data structures in every possible metric."
Related¶
- systems/hnsw
- concepts/ann-index
- concepts/diskann-index
- concepts/spann-index
- concepts/spfresh-index
- concepts/vector-similarity-search
- concepts/scalar-quantization — the standard memory- reduction lever on top of an HNSW index.