SYSTEM Cited by 1 source

OpenAI text-embedding-ada-002¶

text-embedding-ada-002 is OpenAI's second-generation text embedding model from the original ada-002 family, documented at platform.openai.com. It produces a 1,536-dim dense unit vector from arbitrary text input via the OpenAI embeddings API.

It is the predecessor of the text-embedding-3 family (systems/openai-text-embedding-3-large sibling page on the wiki — 3,072 dim; configurable lower dims via the dimensions parameter / Matryoshka).

Where it appears on the wiki¶

systems/yelp-cs-chatbot — production substrate for Yelp's LLM-Assisted Customer Success Chatbot (2026-05-27 disclosure). Each segment of the chatbot's metadata-only vectorstore (article title + summary + headers + historical intent strings) is embedded by ada-002 into a 1,536-dim unit vector. Whole vectorstore is ~8 MB for ~370 articles with ~5 segments each. (Source: sources/2026-05-27-yelp-beyond-the-menu-tree-how-yelp-built-a-smarter-customer-success-chatbot)

Why 1,536 dimensions matters¶

The 1,536-dim figure is canonical on the wiki. It sits at the embedding-dimension diminishing-returns ceiling documented in the 2026-01-13 Redpanda framing of Supabase's pgvector empirical observation:

"They realized that going up to or even beyond 1,536 vector embedding dimensions can have diminishing returns." (Source: sources/2026-01-13-redpanda-the-convergence-of-ai-and-data-streaming-part-1-the-coming-brick-walls)

ada-002 is one of the substrates that anchors that threshold — it is the most widely-used 1,536-dim production embedding model. Newer Matryoshka-style models (ada's successor text-embedding-3-small at 1,536 default but truncatable down to 256/512; text-embedding-3-large at 3,072 truncatable down to 256/1,024/3,072) explicitly target the diminishing-returns property.

Production positioning¶

API-only. No on-prem / VPC option (per OpenAI public surface). RAG systems using ada-002 carry an explicit OpenAI vendor dependency in the embedding hot path.
Cost / rate-limit profile. Embedding latency and cost are amortizable when the vectorstore is constructed offline + cached (as in the Yelp chatbot's daily-batch model: embeddings are computed once at container-start and reused across all queries until the next daily refresh).
Single retrieval-query embedding per user query at inference time — much cheaper than per-token-of-corpus amortisation.

Caveats¶

Stub. This page is a minimal canonical anchor for ada-002's role in Yelp's CS Chatbot vectorstore; deeper model internals (training corpus, distillation lineage, comparison metrics vs text-embedding-3-small/large) are not surveyed.
Quantisation behaviour with FAISS. Yelp uses "smart indexing and quantization" via systems/faiss to reach ~8 MB for ~370 × ~5 = ~1,850 vectors × 1,536 dim. Without quantisation that's ~1,850 × 1,536 × 4 = ~11 MB; with FAISS PQ / scalar-quant the post's 8 MB number is achievable. The post does not disclose the exact FAISS index/quant scheme.
Recommended successor. OpenAI documents text-embedding-3-small as the recommended replacement for ada-002 (same default 1,536 dim, lower cost, higher benchmarks). Whether Yelp will migrate is undisclosed.

Seen in¶

sources/2026-05-27-yelp-beyond-the-menu-tree-how-yelp-built-a-smarter-customer-success-chatbot — Yelp CS Chatbot vectorstore.

systems/openai-text-embedding-3-large — successor-family member, 3,072 default dim, Matryoshka-truncatable.
systems/openai-api — the API substrate.
concepts/vector-embedding · concepts/embedding-dimension-diminishing-returns · concepts/retrieval-augmented-generation
systems/yelp-cs-chatbot · systems/faiss