
CONCEPT Cited by 1 source

Embedding-dimension diminishing returns

Definition

Embedding-dimension diminishing returns is the observation that increasing a vector embedding's dimensionality past a certain threshold (empirically around 1,536 dimensions) yields worse retrieval and downstream-task outcomes, not better, despite the additional representational capacity.

Verbatim from Peter Corless (Redpanda, 2026-01-13):

"They realized that going up to or even beyond 1,536 vector embedding dimensions can have diminishing returns." (Source: sources/2026-01-13-redpanda-the-convergence-of-ai-and-data-streaming-part-1-the-coming-brick-walls)

The primary source is Supabase's pgvector blog post — "Matryoshka embeddings: faster OpenAI vector search using Adaptive Retrieval" and related Supabase pgvector-performance posts — which document that higher-dimension embeddings cost more storage and compute per query without proportional relevance gains, and that Matryoshka-style dimension truncation can preserve most of the quality at a fraction of the cost.
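The truncation trick can be sketched in a few lines. This is a minimal NumPy illustration, not Supabase's or OpenAI's API; `truncate_embedding` is a hypothetical helper, and the quality-preservation claim only holds for models trained with a Matryoshka-style objective, where the leading dimensions carry most of the signal:

```python
import numpy as np

def truncate_embedding(vec: np.ndarray, dims: int) -> np.ndarray:
    """Keep only the first `dims` components and re-normalize to unit length.

    Only meaningful for Matryoshka-trained embeddings, where a prefix
    of the dimensions is self-sufficient.
    """
    prefix = vec[:dims]
    norm = np.linalg.norm(prefix)
    return prefix / norm if norm > 0 else prefix

# Hypothetical full-dimension embedding (e.g. 1,536 dims).
rng = np.random.default_rng(0)
full = truncate_embedding(rng.standard_normal(1536), 1536)   # unit-length stand-in
short = truncate_embedding(full, 256)                        # 6x less storage, 6x cheaper dot products
```

Because cosine similarity on unit vectors is just a dot product, the truncated vector plugs into the same similarity code at a fraction of the per-query cost.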

Why this is a systems-infrastructure concern

  • Vector-index storage scales with dimension. Doubling dimensions roughly doubles index size and memory residency.
  • Similarity compute scales with dimension. Cosine / dot-product cost is O(d) per comparison.
  • Diminishing-returns ceiling caps the productive dimension. Pushing past ~1,536 costs more while delivering less.
  • Matryoshka embeddings (OpenAI 2024; Supabase adopted) train embeddings so that a prefix of the dimensions is self-sufficient at lower dimension counts — letting production systems retrieve at low-dim first and only re-rank at full dim as needed.
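The low-dim-first, full-dim-re-rank pattern from the last bullet can be sketched as follows. This is a minimal brute-force NumPy sketch under stated assumptions (unit-normalized Matryoshka embeddings; `adaptive_retrieval` and its parameters are illustrative names, not a real library API):

```python
import numpy as np

def adaptive_retrieval(query, corpus, low_dims=256, shortlist=50, top_k=5):
    """Two-stage retrieval: cheap low-dim scan, then full-dim re-rank.

    Assumes `query` (shape (d,)) and `corpus` rows (shape (n, d)) are
    unit-normalized Matryoshka embeddings, so truncated prefixes remain
    usable after re-normalization.
    """
    def trunc(m, d):
        p = m[..., :d]
        return p / np.linalg.norm(p, axis=-1, keepdims=True)

    # Stage 1: O(n * low_dims) dot products over the whole corpus.
    coarse = trunc(corpus, low_dims) @ trunc(query, low_dims)
    candidates = np.argsort(-coarse)[:shortlist]

    # Stage 2: full-dimension similarity only on the small shortlist.
    fine = corpus[candidates] @ query
    return candidates[np.argsort(-fine)[:top_k]]
```

The coarse stage does most of the work at a fraction of the cost; the expensive full-dimension comparisons are confined to `shortlist` candidates instead of the whole corpus.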

Caveats

  • Stub — single-sourced on wiki. The 1,536-dimension threshold cites a single Supabase blog post per Corless; deeper empirical evidence and per-model-family threshold differences aren't surveyed here.
  • Not all embedding tasks are alike. The threshold is RAG-retrieval-oriented; classification, clustering, and reranking tasks may have different optimal dimensions.
  • Model-specific. OpenAI text-embedding-3-large (3,072 dim), Cohere's embed-v3 (1,024), and per-language-model embeddings all have different curves. 1,536 is illustrative, not universal.
  • Instance of a broader pattern. This is a specific case of S-curve limits — more parameters on any axis eventually stop helping.

Seen in
