Cosine similarity¶
Cosine similarity is the cosine of the angle between two vectors:

cos θ = A · B / (‖A‖ ‖B‖)

It measures orientation, not magnitude — two vectors pointing in the
same direction are maximally similar (score = 1) regardless of their
lengths, opposite-pointing vectors score −1, and orthogonal vectors
score 0. For unit-normalised embeddings (‖A‖ = ‖B‖ = 1), cosine
similarity collapses to a plain dot product A · B — a single
sum-of-products that vector-SIMD hardware executes in a handful of
FMA instructions.
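A minimal sketch of both forms in plain Java (no Vector API; class and method names are illustrative): the general ratio, and the collapse to a bare dot product once the inputs are unit-normalised.

```java
public class Cosine {
    static double dot(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += a[i] * b[i];
        return s;
    }

    static double norm(double[] a) {
        return Math.sqrt(dot(a, a));
    }

    // General form: A·B / (‖A‖‖B‖)
    static double cosine(double[] a, double[] b) {
        return dot(a, b) / (norm(a) * norm(b));
    }

    // Normalise once up front; afterwards cosine(a, b) is just dot(a, b)
    static double[] unit(double[] a) {
        double n = norm(a);
        double[] u = new double[a.length];
        for (int i = 0; i < a.length; i++) u[i] = a[i] / n;
        return u;
    }

    public static void main(String[] args) {
        double[] a = {3, 4}, b = {6, 8};  // same direction, different lengths
        System.out.println(cosine(a, b)); // 1.0 — orientation only
        System.out.println(dot(unit(a), unit(b))); // same value via dot product
    }
}
```

Pre-normalising moves the two square roots and the division out of the query path, which is what makes the dot-product form so cheap at serving time.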
Why it dominates recommendation / retrieval workloads¶
- Embedding models are typically trained so that semantic similarity corresponds to small angles between embeddings (concepts/vector-embedding). Cosine is the native metric.
- The cheap form (pre-normalise once, then dot-product at query time) is trivially vectorisable and batches into a matrix multiply: a full M × N cosine-similarity sweep is C = A × Bᵀ, where A is M × D candidates and B is N × D history items (patterns/batched-matmul-for-pairwise-similarity).
- Cosine is the default metric for vector-similarity-search engines (FAISS, HNSW, pgvector, S3 Vectors); see concepts/vector-similarity-search.
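The batched form can be sketched as a plain triple loop (names are illustrative; a real kernel would use a BLAS or the JDK Vector API): unit-normalise the rows once, then every entry of C = A × Bᵀ is already a cosine.

```java
public class CosineSweep {
    // Scale each row to unit length so dot products below are cosines
    static void normaliseRows(double[][] x) {
        for (double[] row : x) {
            double n = 0;
            for (double v : row) n += v * v;
            n = Math.sqrt(n);
            for (int j = 0; j < row.length; j++) row[j] /= n;
        }
    }

    // C[m][n] = Σ_d A[m][d] * B[n][d] — a matmul against Bᵀ
    static double[][] matmulBt(double[][] a, double[][] b) {
        double[][] c = new double[a.length][b.length];
        for (int m = 0; m < a.length; m++)
            for (int n = 0; n < b.length; n++)
                for (int d = 0; d < a[0].length; d++)
                    c[m][n] += a[m][d] * b[n][d];
        return c;
    }

    public static void main(String[] args) {
        double[][] candidates = {{1, 0}, {0, 2}};         // M = 2, D = 2
        double[][] history    = {{3, 0}, {0, 5}, {4, 4}}; // N = 3, D = 2
        normaliseRows(candidates);
        normaliseRows(history);
        // c[m][n] = cosine(candidates[m], history[n])
        double[][] c = matmulBt(candidates, history);
        System.out.println(java.util.Arrays.deepToString(c));
    }
}
```

The point of the reshape is that the M × N sweep becomes one dense matmul, which hardware (and libraries) execute far faster than M × N independent cosine calls.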
Cosine vs inner product vs Euclidean¶
| Metric | Formula | Use when |
|---|---|---|
| Cosine | A·B / (‖A‖‖B‖) | orientation matters, magnitudes aren't calibrated |
| Inner product | A·B | embedding model trained for IP; magnitudes encode salience |
| Euclidean (L2) | ‖A − B‖ | metric space distances (clustering) |
Mismatch between the metric the embedding was trained against and the metric used at query time materially reduces recall — select the model's recommended metric.
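For unit-normalised vectors the three metrics are tightly linked: ‖A − B‖² = ‖A‖² + ‖B‖² − 2·A·B = 2(1 − cos θ), so cosine, inner product, and L2 all induce the same ranking on the unit sphere. A quick numeric check of the identity (names illustrative):

```java
public class Metrics {
    static double dot(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += a[i] * b[i];
        return s;
    }

    static double l2(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(s);
    }

    public static void main(String[] args) {
        double[] a = {1, 0};                            // unit vector
        double[] b = {Math.sqrt(0.5), Math.sqrt(0.5)};  // unit vector, 45° apart
        double cos = dot(a, b);          // equals cosine since ‖a‖ = ‖b‖ = 1
        double d   = l2(a, b);
        System.out.println(d * d);       // matches 2 * (1 − cos)
        System.out.println(2 * (1 - cos));
    }
}
```

The identity is why engines can serve "cosine" indexes via L2 or IP internals once embeddings are normalised; it does not rescue a model trained against a different metric than the one queried.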
Seen in¶
- sources/2026-03-03-netflix-optimizing-recommendation-systems-with-jdks-vector-api — Netflix Ranker's video serendipity scoring computes per-candidate max cosine(candidate, history[i]) across the viewing history and returns 1 − max as a "novelty" feature. The naive O(M×N) nested-loop implementation cost 7.5% of total Ranker node CPU; reshaping into a batched matrix multiply with unit-normalised embeddings plus a JDK-Vector-API FMA kernel drove it to ~1%.
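An illustrative sketch (not Netflix's code) of the naive O(M×N) form of that feature on pre-unit-normalised embeddings, where each cosine is a plain dot product — this is the nested loop the batched-matmul reshape replaces:

```java
public class Novelty {
    static double dot(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += a[i] * b[i];
        return s;
    }

    // One novelty score per candidate: 1 − max cosine against the history.
    static double[] novelty(double[][] candidates, double[][] history) {
        double[] out = new double[candidates.length];
        for (int m = 0; m < candidates.length; m++) {
            double max = -1;                       // cosine range is [−1, 1]
            for (double[] h : history)
                max = Math.max(max, dot(candidates[m], h));
            out[m] = 1 - max;                      // 0 = already seen, high = novel
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] cand = {{1, 0}};                // unit-normalised candidate
        double[][] hist = {{0, 1}, {Math.sqrt(0.5), Math.sqrt(0.5)}};
        System.out.println(novelty(cand, hist)[0]); // ≈ 1 − 0.707 ≈ 0.293
    }
}
```

The reshape computes all the dot products at once as C = candidates × historyᵀ, then takes a row-wise max — same result, but the inner loops become one cache-friendly matmul.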
Related¶
- concepts/vector-embedding — the input representation.
- concepts/vector-similarity-search — the production-scale retrieval context.
- concepts/matrix-multiplication-accumulate — C ← A·B + C, the hardware primitive cosine batches down to.
- concepts/fused-multiply-add — the per-lane instruction.
- patterns/batched-matmul-for-pairwise-similarity — the Netflix reshape from nested-loop cosine to a single matmul.