
PLANETSCALE 2025-03-25


PlanetScale — PlanetScale vectors is now GA

Summary

Six months after the public-beta announcement, PlanetScale takes its in-MySQL vector search and storage to general availability. The post confirms the architectural picture from the beta (SPANN + SPFresh transactionally integrated inside InnoDB, composing with Vitess sharding and branch-based workflows) and adds five substantive disclosures that were deferred at beta:

  1. Quantitative improvements since beta — 2× query performance, 8× better memory efficiency.
  2. Operational larger-than-RAM ceiling — indexes "perform well even when they are 6× larger than available memory" (new concrete datum; the beta only asserted "larger than RAM" qualitatively).
  3. Distance metrics supported — Euclidean (L2), inner product, and cosine (previously undisclosed).
  4. Dimension ceiling — up to 16,383 dimensions per vector (previously undisclosed).
  5. Quantization options — both fixed and product quantization, down to 1 bit per field in the fixed-quantization extreme (previously undisclosed; PlanetScale characterises 1-bit as "crazy fast, or just crazy, depending on your needs").
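The 1-bit extreme in disclosure 5 is easy to make concrete. A minimal pure-Python sketch of fixed (scalar) quantization at one bit per field — illustrative only, not PlanetScale's implementation; function names are invented:

```python
def fixed_quantize_1bit(vec):
    """Fixed (scalar) quantization at the 1-bit-per-field extreme:
    each dimension collapses to its sign, a 32x reduction vs float32."""
    return [1 if x > 0 else 0 for x in vec]

def hamming(code_a, code_b):
    """With 1-bit codes, distance degrades to a Hamming count -- the
    source of the 'crazy fast, or just crazy' recall trade-off."""
    return sum(a != b for a, b in zip(code_a, code_b))

a = fixed_quantize_1bit([0.8, -0.1, 0.3, -2.0])   # [1, 0, 1, 0]
b = fixed_quantize_1bit([0.7, 0.2, -0.4, -1.9])   # [1, 1, 0, 0]
assert hamming(a, b) == 2
```

Comparisons on packed sign bits are branch-free XOR + popcount in a real engine, which is where the "crazy fast" half of the framing comes from; the "just crazy" half is the recall loss the post declines to quantify.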

The post also makes an unambiguous first-RDBMS claim for SPANN: PlanetScale is "the first to incorporate an index based on SPANN … into an RDBMS." And it names PlanetScale Metal as the ideal substrate: "Using vector indexes on PlanetScale Metal ensures that loading vector partitions from InnoDB to answer queries will be as fast as possible" — the first wiki disclosure that SPANN posting-list I/O is the specific workload Metal accelerates for vector indexes.

Key takeaways

  1. GA, not beta — SPANN in an RDBMS is a shipped product. The headline "we are excited to announce that PlanetScale's support for vector search and storage is GA" confirms production readiness; the beta's "we will continue to improve performance leading up to GA" caveat is now retired. Vector indexes are "ready for use in production today."

  2. 2× query perf + 8× memory efficiency improvements since beta. Verbatim: "we have doubled query performance, improved memory efficiency eight times, and focused on robustness to ensure vector support is as solid as every other data type MySQL supports." No absolute numbers disclosed — this is a beta-to-GA delta claim, not a cross-vendor benchmark.

  3. 6× larger-than-RAM operational ceiling. The beta asserted larger-than-RAM qualitatively ("designed to work well for larger-than-RAM indexes that require SSD usage"). The GA post makes the claim concrete: "vector indexes … now perform well even when they are 6× larger than available memory." This is the first wiki datum on how far beyond RAM the SPANN + InnoDB composition actually scales in practice. See concepts/larger-than-ram-vector-index.

  4. First RDBMS with SPANN. "We are the first to incorporate an index based on SPANN (Space-Partitioned Approximate Nearest Neighbors) into an RDBMS." The post links the original SPANN paper directly — Chen et al., NeurIPS 2021. The canonical wiki-side citation for this claim is the systems/spann page.

  5. Distance metrics disclosed: Euclidean, inner product, cosine. From the beta's open "which distance metrics are supported?" gap, the GA post answers: "An index can rank vectors by Euclidean (L2), inner product, or cosine distance." Matches the standard vector-search triplet; comparable to S3 Vectors, which at preview supported only cosine + Euclidean. Inner-product support here is notable for models trained with dot-product similarity (many recent embedding models rank identically under cosine and inner product when vectors are unit-normalised, but inner-product support matters for non-normalised embeddings).

  6. Dimension ceiling: 16,383. "It can store any vector up to 16,383 dimensions." Answers the beta's undisclosed dimension ceiling. Well above common embedding sizes (OpenAI text-embedding-3-large = 3,072 dims; Voyage 3 Large = 2,048; CLIP = 512–1,024). Headroom is comfortable.

  7. Quantization: fixed + product, down to 1 bit per field. "It supports both fixed and product quantization. Fixed quantization down to one bit per field is crazy fast, or just crazy, depending on your needs." Fixed quantization (= scalar quantization per field) and product quantization (codebook-based) are both standard techniques (see concept page); the 1-bit-per-field fixed extreme is the most aggressive memory-footprint setting and PlanetScale's own framing acknowledges the recall trade-off. No recall numbers disclosed.

  8. Full database operational features work on vectors — backups, branching, replicas. "All the PlanetScale features you rely on work with vectors, too. Create a branch. Add a vector index. Open a deploy request and merge the index into your production branch. Revert it if you change your mind. Query your vector index on other replicas, even in other regions. Sleep easy knowing that your vector data is included in scheduled backups." Concrete confirmation that vector indexes compose with:

     • branching
     • deploy requests
     • schema reverts
     • cross-region read replicas (a PlanetScale Portals-style topology is implied)
     • scheduled backups (index data is included) — operational scope that pure-sidecar vector stores (Pinecone, Weaviate, Vectorize, S3 Vectors) don't compose with the host relational database's lifecycle.

  9. SPANN architecture recap — and why it composes with InnoDB. "In SPANN, vectors are assigned to small partitions, which are stored in hidden InnoDB tables. One vector from each partition, around 20% of the index, is stored in a tree structure in memory, enabling the index to quickly identify which partitions are relevant to a query. Only the tree and a small, fixed number of relevant partitions need to be in memory when building and querying the index." Two concrete data points new to the wiki:

     • Posting lists are stored in "hidden InnoDB tables" — the integration shape is: SPANN's posting lists become actual InnoDB tables hidden from the user, giving them transactional semantics, buffer-pool caching, and crash recovery for free. This is the how of vector-index-inside-storage-engine specific to PlanetScale's implementation.
     • ~20% of the index is the in-memory tree. "One vector from each partition, around 20% of the index, is stored in a tree structure in memory." First concrete datum on SPANN's memory-footprint ratio in PlanetScale's implementation. The 80/20 split (20% of the index in RAM as tree, 80% on SSD as posting lists) is the mechanism enabling the 6× larger-than-RAM claim.

  10. PlanetScale Metal is the ideal substrate. "Using vector indexes on PlanetScale Metal ensures that loading vector partitions from InnoDB to answer queries will be as fast as possible." First wiki disclosure that Metal's direct-attached-NVMe substrate specifically accelerates SPANN posting-list loads — the on-SSD partition reads are the SPANN query-path hot spot. This is a concrete composition: Metal's ~50 μs local NVMe round-trip (vs ~250 μs for EBS) directly compresses the SPANN partition-load latency floor.

  11. Per-branch enablement unchanged from beta. "go to the 'Branches' page for any database. Click on a branch you want to add vectors to, and click on the small gear icon … Click the toggle next to 'Enable vectors.'" The per-branch-enrolment model from beta carries into GA — still an opt-in toggle per branch rather than a cluster-wide default.
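The SPANN recap above reduces to a toy shape: in-memory representatives pick partitions, and posting lists are loaded only on demand. A pure-Python sketch — illustrative only; the class name is invented, the "tree" is a flat list for brevity, and a dict stands in for the hidden InnoDB tables:

```python
import math
import random

def l2(a, b):
    """Euclidean distance between two points."""
    return math.dist(a, b)

class SpannSketch:
    """Toy SPANN-shaped index: one representative vector per partition is
    kept 'in memory'; full posting lists live in self.posting_lists,
    standing in for the hidden InnoDB tables."""

    def __init__(self, vectors, n_partitions, seed=0):
        # In-memory structure: one vector per partition (a flat list here,
        # where the real implementation uses a tree).
        self.centroids = random.Random(seed).sample(vectors, n_partitions)
        # 'On-disk' posting lists, keyed by partition id.
        self.posting_lists = {i: [] for i in range(n_partitions)}
        for v in vectors:
            nearest = min(range(n_partitions),
                          key=lambda i: l2(v, self.centroids[i]))
            self.posting_lists[nearest].append(v)

    def search(self, query, k=3, nprobe=2):
        # 1. In-memory step: identify the nprobe most relevant partitions.
        probe = sorted(range(len(self.centroids)),
                       key=lambda i: l2(query, self.centroids[i]))[:nprobe]
        # 2. 'Disk' step: load only those posting lists and rank candidates
        #    -- the partition loads that fast NVMe would accelerate.
        candidates = [v for i in probe for v in self.posting_lists[i]]
        return sorted(candidates, key=lambda v: l2(query, v))[:k]

pts = [(i / 10, (i * 7 % 10) / 10) for i in range(100)]
index = SpannSketch(pts, n_partitions=8)
top = index.search((0.5, 0.5), k=3, nprobe=2)
```

The sketch makes the 80/20 memory split legible: only `centroids` must stay resident, while `posting_lists` can live on SSD and be touched `nprobe` partitions at a time.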

Architectural numbers

  • Query perf (GA vs beta): 2× improvement (no absolute QPS numbers).
  • Memory efficiency (GA vs beta): 8× improvement (no absolute MB/vector numbers).
  • Larger-than-RAM ceiling: 6× RAM (perform well at 6× memory; not a hard ceiling, operational working claim).
  • In-memory tree footprint: ~20% of index total (one vector per partition).
  • Dimension ceiling: 16,383 dimensions.
  • Distance metrics: Euclidean (L2), inner product, cosine.
  • Quantization floor: 1 bit per field (fixed quantization extreme).
  • Quantization modes: fixed + product.
  • Target hardware: Metal (direct-attached NVMe) explicitly named as the preferred substrate.
  • Latencies / p99 / build times / recall numbers: not disclosed.
  • Production-scale numbers (QPS / corpus size / TB vector indexes): not disclosed.
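The three disclosed distance metrics can be written out directly. A pure-Python sketch — illustrative, not PlanetScale's implementation — that also shows why inner product and cosine rank identically for unit-normalised vectors:

```python
import math

def euclidean(a, b):
    """L2 distance."""
    return math.dist(a, b)

def inner_product(a, b):
    """Dot product; higher means more similar, so it is negated when
    used as a 'distance' for ranking."""
    return sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    """1 - cosine similarity."""
    norm = lambda v: math.sqrt(sum(x * x for x in v))
    return 1.0 - inner_product(a, b) / (norm(a) * norm(b))

# For unit-normalised vectors the rankings coincide:
# ||a - b||^2 = 2 - 2 * (a . b), while cosine distance = 1 - (a . b),
# so all three orderings are monotone in the dot product.
a, b = (1.0, 0.0), (0.0, 1.0)
assert math.isclose(euclidean(a, b) ** 2, 2 - 2 * inner_product(a, b))
assert math.isclose(cosine_distance(a, b), 1 - inner_product(a, b))
```

This identity is why inner-product support only changes results for non-normalised embeddings, as noted in takeaway 5.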

Systems touched

  • systems/planetscale — GA product surface; per-branch enrolment continues; vector indexes now ride the full feature set (branching, deploy requests, reverts, backups, cross-region replicas).
  • systems/planetscale-metal — ideal substrate for SPANN posting-list I/O. First wiki datum on the Metal-accelerates-SPANN coupling.
  • systems/vitess — sharding layer still composes with transactional SPFresh indexes.
  • systems/mysql — host SQL surface.
  • systems/innodb — storage engine hosting the vector index; the GA post newly discloses that SPANN posting lists are "hidden InnoDB tables".
  • systems/spann — algorithmic basis; first-RDBMS claim made here. Paper link embedded in post (arXiv 2111.08566).
  • systems/spfresh — concurrent background maintenance layer; transactional extension runs inside InnoDB.

Concepts touched

  • concepts/larger-than-ram-vector-index — gains the 6×-larger-than-memory operational figure, the first concrete datum for the concept.

Patterns touched

  • patterns/vector-index-inside-storage-engine — the GA post confirms the pattern with new implementation detail (hidden InnoDB tables for posting lists; ~20% in-memory tree footprint).
  • patterns/hybrid-tree-graph-ann-index — the GA post clarifies that SPANN in PlanetScale is tree-only for partition identification ("One vector from each partition … is stored in a tree structure"). The beta had described SPANN as "tree + graph"; this GA post uses "tree" language without explicitly invoking a graph over centroids. Whether PlanetScale's implementation uses a centroid-graph or a pure tree for partition identification is a small ambiguity introduced at GA (see Caveats / gaps below).

Caveats / gaps

  • Tree-vs-tree-and-graph ambiguity at GA. The beta post described SPANN as "a hybrid vector indexing and search algorithm that uses both graph and tree structures." This GA post describes the in-memory structure as "one vector from each partition … stored in a tree structure." It's unclear whether (a) PlanetScale uses only a tree over centroids in its implementation (a simplification of stock SPANN), or (b) "tree structure" is informal prose for the SPANN centroid graph + tree combination, or (c) the implementation evolved since beta. The stock SPANN paper's own index structure is tree + graph; SPFresh extends it with background ops. The systems/spann and patterns/hybrid-tree-graph-ann-index pages retain the tree + graph framing as the canonical algorithmic description per Microsoft Research, noting the PlanetScale GA post's simpler "tree" language.
  • No absolute performance numbers disclosed. The 2× query / 8× memory claims are beta-to-GA deltas, not cross-vendor numbers. No QPS, p99, recall @ K, corpus size, or index build times published.
  • No recall numbers at quantization levels. The 1-bit fixed quantization is characterised qualitatively ("crazy fast, or just crazy") without a recall vs bit-width curve.
  • No SPFresh–user-transaction serialisation detail. The open question from the beta — how does concurrent SPFresh background maintenance interleave with user transactions? how are in-flight partial SPFresh ops rolled back? — is still deferred at GA.
  • Sharded vector index top-K behaviour still not disclosed. Vitess + SPFresh cross-shard top-K query semantics (scatter-gather? per-shard top-K then merge? recall implications?) unchanged from the beta gap.
  • Metal pricing delta for vector workloads not disclosed. Post recommends Metal for vector indexes without quantifying the cost delta.
  • No head-to-head benchmark against Pinecone, Weaviate, pgvector, or other competing vector stores. GA remains a launch post, not a comparative benchmark piece.
  • Raw post body mentions a YouTube video and linked documentation whose contents aren't included in the ingest; any numbers or architecture detail in those remains deferred.
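The open cross-shard question can at least be stated precisely. A toy sketch of one plausible scatter-gather shape — per-shard top-K followed by a router-side merge — purely illustrative and not confirmed by PlanetScale; the function name is invented:

```python
import heapq

def scatter_gather_topk(shards, query, k, distance):
    """One plausible (unconfirmed) cross-shard semantics: each shard
    answers its local top-K, the router merges the partial results.
    If shard-local ANN search misses neighbours, the merged result
    inherits that recall loss."""
    partials = []
    for shard in shards:                                   # scatter
        partials.extend(
            heapq.nsmallest(k, shard, key=lambda v: distance(query, v)))
    # gather: global top-K over at most k * len(shards) candidates
    return heapq.nsmallest(k, partials, key=lambda v: distance(query, v))

# Toy usage with scalars and absolute difference as the 'distance'.
shards = [[1, 5, 9], [2, 6, 10], [3, 7, 11]]
assert scatter_gather_topk(shards, 0, 2, lambda q, v: abs(q - v)) == [1, 2]
```

Whether Vitess does exactly this for vector indexes — and what per-shard `k` it uses — is precisely the gap noted above.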

Source
