
CONCEPT

Cosine similarity

Cosine similarity is the cosine of the angle between two vectors:

cos(A, B) = (A · B) / (‖A‖ × ‖B‖)

It measures orientation, not magnitude — two vectors pointing in the same direction are maximally similar (score = 1) regardless of their lengths, opposite-pointing vectors score −1, and orthogonal vectors score 0. For unit-normalised embeddings (‖A‖ = ‖B‖ = 1), cosine similarity collapses to a plain dot product A · B — a single sum-of-products that vector-SIMD hardware executes in a handful of FMA instructions.
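A minimal pure-Python sketch of both forms, using illustrative vectors (no library assumed): the full formula, and the collapsed dot-product form after unit-normalising once.

```python
import math

def cosine(a, b):
    # cos(A, B) = (A · B) / (‖A‖ × ‖B‖)
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def normalise(v):
    # Scale to unit length so later comparisons are plain dot products.
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

a, b = [3.0, 4.0], [4.0, 3.0]
print(cosine(a, b))  # orientation only: 24/25 = 0.96, regardless of lengths

an, bn = normalise(a), normalise(b)
print(sum(x * y for x, y in zip(an, bn)))  # same 0.96 from a plain dot product
```

In production the normalise step runs once at indexing time, so each query pays only the dot product.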

Why it dominates recommendation / retrieval workloads

  • Embedding models are typically trained so that semantic similarity corresponds to small angles between embeddings (concepts/vector-embedding). Cosine is the native metric.
  • The cheap form (pre-normalise once, then dot-product at query time) is trivially vectorisable and batches into a matrix multiply: a full M × N cosine-similarity sweep is C = A × Bᵀ where A is M × D candidates and B is N × D history items (patterns/batched-matmul-for-pairwise-similarity).
  • Cosine is the default metric for vector-similarity-search engines (FAISS, HNSW, pgvector, S3 Vectors); see concepts/vector-similarity-search.
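The batched form of the second bullet can be sketched in pure Python (a real system would hand C = A × Bᵀ to a BLAS/matmul kernel; the loop here just makes the shape explicit):

```python
import math

def normalise_rows(M):
    # Pre-normalise each row once so the sweep is a plain matrix multiply.
    out = []
    for row in M:
        n = math.sqrt(sum(x * x for x in row))
        out.append([x / n for x in row])
    return out

def cosine_sweep(A, B):
    # C = Â × B̂ᵀ where Â is M × D and B̂ is N × D; C[i][j] = cosine(A[i], B[j]).
    A_hat, B_hat = normalise_rows(A), normalise_rows(B)
    return [[sum(a * b for a, b in zip(ra, rb)) for rb in B_hat] for ra in A_hat]

candidates = [[1.0, 0.0], [0.0, 1.0]]   # M × D
history    = [[1.0, 0.0], [1.0, 1.0]]   # N × D
C = cosine_sweep(candidates, history)   # M × N similarity matrix
```

Here `C[0][1]` is cosine between `[1, 0]` and `[1, 1]`, i.e. 1/√2 ≈ 0.7071.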

Cosine vs inner product vs Euclidean

| Metric | Formula | Use when |
| --- | --- | --- |
| Cosine | A·B / (‖A‖‖B‖) | orientation matters; magnitudes aren't calibrated |
| Inner product | A·B | embedding model trained for IP; magnitudes encode salience |
| Euclidean (L2) | ‖A − B‖ | metric-space distances (clustering) |

A mismatch between the metric the embedding model was trained against and the metric used at query time materially reduces recall, so select the metric the model recommends.
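A small worked contrast (illustrative vectors) shows where the three metrics disagree: two vectors pointing the same way but with different magnitudes tie under cosine, while inner product and Euclidean distance separate them.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

q = [1.0, 0.0]
small, big = [0.5, 0.0], [10.0, 0.0]   # same direction, different magnitudes

print(cosine(q, small), cosine(q, big))        # 1.0 1.0  — cosine ties them
print(dot(q, small), dot(q, big))              # 0.5 10.0 — IP rewards magnitude
print(euclidean(q, small), euclidean(q, big))  # 0.5 9.0  — L2 penalises distance
```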

Seen in

  • sources/2026-03-03-netflix-optimizing-recommendation-systems-with-jdks-vector-api — Netflix Ranker's video serendipity scoring computes a per-candidate max cosine(candidate, history[i]) across the viewing history and returns 1 − max as a "novelty" feature. The naive O(M×N) nested-loop implementation cost 7.5% of total Ranker node CPU; reshaping it into a batched matrix multiply with unit-normalised embeddings and a JDK Vector API FMA kernel drove it to ~1%.
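The novelty feature described in that source can be sketched as follows (a pure-Python illustration of the scoring shape, not Netflix's Java implementation; function and variable names are mine):

```python
import math

def normalise(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def novelty(candidate, history):
    # 1 − max_i cosine(candidate, history[i]); with unit-normalised vectors
    # each cosine is just a dot product.
    c = normalise(candidate)
    best = max(sum(a * b for a, b in zip(c, normalise(h))) for h in history)
    return 1.0 - best

history = [[1.0, 0.0], [0.7, 0.7]]       # embeddings of watched videos
print(novelty([0.0, 1.0], history))      # dissimilar to everything watched → high
print(novelty([1.0, 0.0], history))      # matches a watched item exactly → 0.0
```

The production fix in the source replaces the inner loop over `history` with one batched matrix multiply per candidate block, which is where the CPU savings came from.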