Embedding-dimension diminishing returns¶
Definition¶
Embedding-dimension diminishing returns is the observation that increasing a vector-embedding's dimensionality past a certain threshold — empirically ~1,536 dimensions — yields worse retrieval and downstream-task outcomes, not better, despite the additional representational capacity.
Verbatim from Peter Corless (Redpanda, 2026-01-13):
"They realized that going up to or even beyond 1,536 vector embedding dimensions can have diminishing returns." (Source: sources/2026-01-13-redpanda-the-convergence-of-ai-and-data-streaming-part-1-the-coming-brick-walls)
The primary source is Supabase's pgvector blog post — "Matryoshka embeddings: faster OpenAI vector search using Adaptive Retrieval" — and related Supabase pgvector-performance posts. These document that higher-dimension embeddings cost more storage and compute per query without proportional relevance gains, and that Matryoshka-style dimension truncation preserves most of the retrieval quality at a fraction of the cost.
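The truncation step itself is simple: keep a prefix of the embedding's dimensions and re-normalize so cosine similarity stays well-defined. A minimal sketch (not Supabase's or OpenAI's implementation; the function name and dimension choices are illustrative):

```python
import numpy as np

def truncate_embedding(vec: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components of a Matryoshka-trained embedding
    and re-normalize, so cosine similarity on the prefix remains meaningful."""
    short = vec[:dim]
    return short / np.linalg.norm(short)

# Illustrative: a stand-in 3,072-dim "full" embedding truncated to 256 dims.
rng = np.random.default_rng(0)
full = rng.standard_normal(3072)
full /= np.linalg.norm(full)
short = truncate_embedding(full, 256)
```

This only preserves quality when the model was trained Matryoshka-style, i.e. so that dimension prefixes are self-sufficient; truncating an ordinary embedding this way degrades relevance sharply.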
Why this is a systems-infrastructure concern¶
- Vector-index storage scales with dimension. Doubling dimensions roughly doubles index size and memory residency.
- Similarity compute scales with dimension. Cosine and dot-product similarity cost O(d) per comparison.
- Diminishing-returns ceiling caps the productive dimension. Pushing past ~1,536 costs more while delivering less.
- Matryoshka embeddings (OpenAI 2024; Supabase adopted) train embeddings so that a prefix of the dimensions is self-sufficient at lower dimension counts — letting production systems retrieve at low-dim first and only re-rank at full dim as needed.
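The retrieve-low-dim-then-rerank pattern above can be sketched as a two-stage search. This is a minimal illustration, not Supabase's Adaptive Retrieval code; the function name, dimensions, and shortlist size are assumptions:

```python
import numpy as np

def adaptive_retrieve(query, corpus, low_dim=256, shortlist=50, k=5):
    """Two-stage Matryoshka-style search (sketch):
    1. score every document on the first `low_dim` dims (cheap, O(n * low_dim));
    2. re-rank only the `shortlist` best at full dimension (O(shortlist * d))."""
    def norm(m, d):
        m = m[..., :d]
        return m / np.linalg.norm(m, axis=-1, keepdims=True)

    # Stage 1: coarse cosine scores on the truncated prefix.
    coarse = norm(corpus, low_dim) @ norm(query, low_dim)
    cand = np.argsort(-coarse)[:shortlist]

    # Stage 2: exact full-dimension scores, only for the shortlist.
    full_d = corpus.shape[1]
    fine = norm(corpus[cand], full_d) @ norm(query, full_d)
    return cand[np.argsort(-fine)[:k]]

# Toy corpus: 1,000 random 1,536-dim vectors; the query is a slightly
# perturbed copy of document 42, so it should come back first.
rng = np.random.default_rng(1)
docs = rng.standard_normal((1000, 1536))
q = docs[42] + 0.01 * rng.standard_normal(1536)
top = adaptive_retrieve(q, docs)
```

The win is that the O(n·d) full-dimension scan is replaced by an O(n·low_dim) scan plus a small O(shortlist·d) rerank, which is where the storage and compute savings in the bullets above come from.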
Caveats¶
- Stub — single-sourced on wiki. The 1,536-dimension threshold cites a single Supabase blog post per Corless; deeper empirical evidence and per-model-family threshold differences aren't surveyed here.
- Not all embedding tasks are alike. The threshold is RAG-retrieval-oriented; classification, clustering, and reranking tasks may have different optimal dimensions.
- Model-specific. OpenAI's text-embedding-3-large (3,072 dims), Cohere's embed-v3 (1,024), and per-language-model embeddings all have different curves. 1,536 is illustrative, not universal.
- Instance of a broader pattern. This is a specific case of S-curve limits — more parameters on any axis eventually stop helping.
Seen in¶
- 2026-01-13 Redpanda — The convergence of AI and data streaming, Part 1 (sources/2026-01-13-redpanda-the-convergence-of-ai-and-data-streaming-part-1-the-coming-brick-walls) — canonical: 1,536-dimension ceiling cited as evidence that "bigger model doesn't always mean better results."
Related¶
- concepts/s-curve-limits — the broader framing.
- concepts/retrieval-augmented-generation — the primary consumer of high-dimension embeddings.
- systems/openai-text-embedding-3-large — a 3,072-dim embedding system whose Matryoshka-truncation support is an instance of the dimensionality-pruning remediation.
- systems/transformer — the architecture primitive under embedding production.
- companies/redpanda — the company whose blog canonicalises this framing.