PlanetScale: PlanetScale is bringing vector search and storage to MySQL

Summary

Nick Van Wiggeren's 2023-10-03 announcement post declares PlanetScale's intent to add native vector storage and search to its MySQL fork. At this point the feature is roadmap only: users are pointed at a sign-up page (planetscale.com/ai) to be notified at release. The post establishes the product thesis (AI/ML apps shouldn't need a second database just for vectors) and names the technical direction: a first-class vector data type plus vector-specific indexing inside MySQL, targeting the Hierarchical Navigable Small World (HNSW) algorithm. It explicitly frames raw vector storage as "not interesting" (a BLOB column already suffices); the value is in indexed similarity search, which needs graph structures MySQL doesn't have. PlanetScale plans to add these to its existing MySQL fork, ship packages and containers for local development, and expose the feature transparently to existing customers ("one day you'll automatically gain the ability to do vector storage and retrieval"). This post is the origin point of PlanetScale's vector-search corpus. Its stated HNSW direction was later rejected on structural grounds (RAM-bound, no incremental updates) in the 2024-10-22 beta announcement in favour of a transactional extension of SPFresh inside InnoDB, a notable reversal that the 2025-10-01 engineering deep-dive (Larger than RAM Vector Indexes for Relational Databases) documents in depth.

Key takeaways

Thesis: don't adopt a second database just for vectors. Canonical product-level framing that the vector-index-inside-storage-engine pattern would later formalise:

"Soon, you'll be able to use PlanetScale as a vector database for all of your AI needs without needing to adopt a second tool. … Instead of adopting a second database just for vectors, you'll be able to do the same storage and retrieval right in PlanetScale, reducing cost and operational burden significantly." (Source: sources/2023-10-03-planetscale-planetscale-is-bringing-vector-search-and-storage-to-mysql)

Storing raw vectors is uninteresting; indexing them is the hard part. Verbatim framing that splits the problem into trivial storage vs non-trivial lookup:

"Modern databases are already very good at storing lists of numbers. Storing vectors as a raw datatype is not interesting … What makes vectors useful is a technique called embedding … That's where vector- specific indexing comes into play."

Sets up the argument that a BLOB column already works — the reason MySQL needs vector support is for search, not storage.

Stated algorithmic direction: HNSW. The post commits to Hierarchical Navigable Small World as the indexing algorithm:

"Specifically, we'll be implementing the state-of-the- art Hierarchical Navigable Small World (HNSW) algorithm, which constructs optimized graph structures that make it efficient to search vector similarity in large datasets."

This commitment was later abandoned: see the note on concepts/hnsw-index about the 2024-10-22 beta announcement, which rejects HNSW for the same relational-database context. The structural reasons (RAM-bound; no incremental update) weren't named in this earlier post. The contradiction with the 2024-10-22 announcement is covered under Contradiction below.
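
To make "efficient to search" concrete, here is a minimal sketch of the greedy walk that graph-based indexes build on. It is not HNSW itself: real HNSW adds a hierarchy of layers, a candidate beam, and incremental graph construction, none of which the post describes. The function names, the brute-force graph construction, and the sample points are all illustrative assumptions.

```python
import math

def build_knn_graph(points, m):
    """Toy construction (assumption): link each point to its m nearest
    neighbours by brute force. Real HNSW builds its graph incrementally."""
    graph = {}
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        graph[i] = others[:m]
    return graph

def greedy_search(points, graph, query, entry):
    """Step to whichever neighbour is closest to the query; stop when no
    neighbour improves on the current node (a local minimum)."""
    current = entry
    while True:
        best = min(graph[current], key=lambda j: math.dist(points[j], query))
        if math.dist(points[best], query) >= math.dist(points[current], query):
            return current
        current = best

# Ten hypothetical points on a line, each linked to its two nearest neighbours.
points = [(float(i), 0.0) for i in range(10)]
graph = build_knn_graph(points, m=2)
nearest = greedy_search(points, graph, query=(7.2, 0.0), entry=0)
```

Each step only compares the query against one node's neighbours rather than the whole dataset. A greedy walk can stall in a local minimum on less regular data, which is one reason production algorithms keep a wider candidate set.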

First-class vector data type, not BLOB. Post explicitly contrasts what users can do today (write arrays into BLOB columns) with what MySQL is getting:

"We know what a vector is, and storing the data is still straightforward — you can use a BLOB type and start writing arrays into MySQL today! So what extra support does MySQL need anyway? That's where vector- specific indexing comes into play. This is what we're adding to MySQL, along with a first-class vector data type."

Distribution strategy: extend the existing PlanetScale MySQL fork, not a new product. Van Wiggeren names the existing fork as the integration point:

"PlanetScale already maintains a fork of MySQL and we'll be adding vector types and indexes to it. When released, we'll run that MySQL fork in PlanetScale as we do today. We will publish packages and containers for our PlanetScale-flavored MySQL that will allow users to test and develop locally."

Transparent to existing customers — no migration required.

Pedagogy-altitude vector primer. About half the post is an accessible explanation of what vectors are ("a one-dimensional array of real number values"), embeddings ("uses machine learning to transform arbitrary data like a picture, song, or sensor data into a vector"), and similarity metrics (cosine similarity is linked explicitly). It points to Stephen Wolfram's What Is ChatGPT Doing … and Why Does It Work? § "The Concept of Embeddings" as the longer-form explainer.

Without an index, similarity search is linear over the corpus. Worked example: company-document similarity.

"Without an index, you would have to iterate over every document's vector in the database and compare them for similarity. At scale, this could take a while, and the performance would be awful! Using an index, you can efficiently traverse the graphs of vectors, and quickly present the user their meeting notes from the status meeting last week, or the design document for one component of their project."

Release stance: "rigorous workloads" before ship. No numbers, no beta date — post explicitly defers.

"It's exciting to see vector workloads working on MySQL. We are committed to maintaining a stable, reliable, and highly available product. We will continue to test our new vector support under rigorous workloads to ensure it meets our high standards before release."

Systems referenced

systems/planetscale — the MySQL fork getting vector support.
systems/mysql — the upstream engine PlanetScale forks.
systems/hnsw — the stated (later abandoned) indexing algorithm; post links directly to the arXiv HNSW paper (1603.09320).

Concepts invoked

concepts/hnsw-index — committed to in this post; see contradiction below.
concepts/ann-index — the generic category HNSW is an instance of.
concepts/vector-similarity-search — the query shape the index supports.
Vector / embedding primer concepts (cosine similarity; arbitrary data → ML-produced numerical representation) covered at pedagogy altitude.

Patterns invoked

patterns/vector-index-inside-storage-engine — this post is the first-stated commitment to this pattern from PlanetScale. The mechanism (how to actually make an ANN index live inside a storage engine) is not addressed here; later posts (2024-10-22 beta, 2025-10-01 deep-dive, 2026-03-25 GA) carry the weight.

Operational numbers

No numbers. Post is pre-release announcement.

Caveats

Announcement-altitude only. No architecture diagrams, no performance numbers, no beta date, no worked examples of SQL syntax.
Stated HNSW direction was abandoned. See Contradiction below. This post alone would mislead a reader about PlanetScale's shipped implementation.
Pedagogy half. Half the body is a general vector / embedding primer; architecturally-interesting content is roughly the second half.
Product-launch framing. Sign-up CTA (planetscale.com/ai) + close with "If you have additional questions, don't hesitate to contact us" — standard product-marketing shape with genuine architecture content layered in.
No discussion of Vitess sharding integration. Post says only that the fork will get vector support; doesn't address how sharded vector indexes would behave. Vitess sharding integration becomes a canonical selling point in the 2024-10-22 beta announcement.

Contradiction

HNSW as stated direction (2023-10-03) vs HNSW explicitly rejected (2024-10-22). A year after this post, PlanetScale's vectors public-beta announcement changes the algorithmic direction on structural grounds:

"HNSW has very good query performance, but struggles to scale because it needs to fit its whole dataset in RAM. Most importantly, HNSW indexes cannot be updated incrementally, so they require periodically re-building the index with the underlying vector data. This is just not a good fit for a relational database." (Source: PlanetScale 2024-10-22 vectors public beta announcement; quoted on concepts/hnsw-index.)

The shipped implementation is instead a transactional extension of SPFresh inside InnoDB, a SPANN-family hybrid tree+posting-list index. HNSW survives in the shipped product only as (a) the in-memory head index of the hybrid, and (b) an opt-in fully-in-memory variant for users willing to accept its RAM + rebuild cost (per sources/2026-04-21-planetscale-larger-than-ram-vector-indexes-for-relational-databases).
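
The general shape of a head-index-plus-posting-list design can be sketched as below. This is only an illustration of the idea under stated assumptions, not PlanetScale's or SPFresh's implementation: the class name, the fixed centroids, and the in-memory dict standing in for on-disk posting lists are all hypothetical, and the splitting and rebalancing a real system needs are omitted.

```python
import math

class PostingListIndex:
    """Toy two-level index: a small set of centroids kept in memory, each
    owning a posting list of vectors (a dict stands in for disk here)."""

    def __init__(self, centroids):
        self.centroids = centroids
        self.postings = {i: [] for i in range(len(centroids))}

    def _nearest_centroids(self, vec, n):
        order = sorted(
            range(len(self.centroids)),
            key=lambda i: math.dist(vec, self.centroids[i]),
        )
        return order[:n]

    def insert(self, key, vec):
        """Incremental update: append to the closest centroid's posting list."""
        (home,) = self._nearest_centroids(vec, 1)
        self.postings[home].append((key, vec))

    def search(self, query, k=1, probes=1):
        """Scan only the posting lists of the `probes` closest centroids."""
        candidates = []
        for c in self._nearest_centroids(query, probes):
            candidates.extend(self.postings[c])
        candidates.sort(key=lambda item: math.dist(query, item[1]))
        return [key for key, _ in candidates[:k]]

# Hypothetical centroids and vectors.
index = PostingListIndex(centroids=[(0.0, 0.0), (10.0, 10.0)])
index.insert("a", (1.0, 1.0))
index.insert("b", (9.0, 9.5))
index.insert("c", (0.5, 0.2))
result = index.search((0.4, 0.4), k=2)
```

Two properties line up with the constraints named in the quote above: only the centroids need to stay in memory, and an insert touches a single posting list rather than forcing a rebuild.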

This 2023-10-03 post is the record of the original commitment. The reversal is not acknowledged in later posts; PlanetScale simply ships the different thing. Interesting as a canonical case of research-to-production algorithm reconsideration: HNSW is correct if you ignore relational-database constraints (RAM budget, online updates, transactional semantics); SPFresh-inside-InnoDB is the answer once those constraints bind.

Source

systems/planetscale
systems/mysql
systems/hnsw
concepts/hnsw-index
concepts/ann-index
patterns/vector-index-inside-storage-engine
sources/2026-04-21-planetscale-larger-than-ram-vector-indexes-for-relational-databases — the 2025-10-01 engineering companion that documents the shipped (non-HNSW) architecture.