CONCEPT

Approximate vs exact kNN¶

Definition¶

The Elasticsearch-named distinction between two families of k-nearest-neighbor search over vector embeddings:

Exact kNN (brute-force): a script_score query that scans every matching document, computes the vector similarity function (cosine, dot-product, L2), and returns the top-K. Guarantees perfect recall.
Approximate kNN (ANN): a dedicated query type backed by an ANN index (in Lucene, HNSW graph). Sub-linear search; imperfect recall at lower latency.

Elastic's own framing, quoted verbatim from the documentation ():

"In most cases, you'll want to use approximate kNN. Approximate kNN offers lower latency at the cost of slower indexing and imperfect accuracy.

Exact, brute-force kNN guarantees accurate results but doesn't scale well with large datasets. With this approach, a script_score query must scan each matching document to compute the vector function, which can result in slow search speeds. However, you can improve latency by using a query to limit the number of matching documents passed to the function. If you filter your data to a small subset of documents, you can get good search performance using this approach."

Trade-off¶

Axis	Exact kNN	Approximate kNN
Recall	100 %	~90–99 % (tunable)
Query latency	O(N·D) per query	Sub-linear
Indexing cost	Low (just store the vectors)	Higher (build and maintain the ANN graph)
Scales to billions of vectors	No	Yes
Useful when filter is highly selective	Yes (filter → small set → brute-force OK)	Approximate-plus-filter interaction is subtle

When each wins¶

Exact kNN wins when a pre-filter cuts the candidate set to thousands, or when recall guarantees are non-negotiable. You also skip the ANN-index build cost and post-ingestion latency.
Approximate kNN wins at large-scale retrieval with tens of millions of vectors and a sub-100ms budget — the case Zalando's fashion catalog falls under (~450k items surfaced on the German women's root page; full catalog "much higher").

Seen in¶

— Zalando's Search & Browse department was running exact kNN via script_score on ES 7.17 and wanted to switch to approximate kNN. This was the single load-bearing driver of the ES 7.17→8.x upgrade, and simultaneously drove the Scala→Java client choice (since the unofficial elastic4s Scala client had no kNN DSL while the official Java client did). A canonical instance of a single algorithmic feature driving an entire major-version upgrade + 443k-LOC client migration.

concepts/ann-index — the data-structure family that makes approximate kNN viable.
systems/hnsw — the Lucene/Elasticsearch ANN index.
concepts/vector-similarity-search — the broader problem.
concepts/vector-quantization — orthogonal speedup technique.

Approximate vs exact kNN¶

Definition¶

Trade-off¶

When each wins¶

Seen in¶

Related¶