
CONCEPT Cited by 1 source

Transactional vector index

Definition

A transactional vector index is an approximate nearest neighbor (ANN) index whose mutations obey the hosting database's transactional semantics: inserts, updates, and deletes land in the index atomically with the row's SQL commit; rollbacks undo the index change alongside the row change; the index survives crashes with the same consistency guarantees as the rest of the table; and the index does not require periodic rebuilds.
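The atomicity part of this definition can be illustrated with a toy, using Python's standard `sqlite3` module rather than MySQL or PlanetScale. This is a minimal sketch under loud assumptions: the `vec_index` table and the `bucket_for` quadrant function are hypothetical stand-ins for an index structure, not a real ANN index and not how SPFresh or InnoDB store anything. The only point is that when the index lives inside the same transactional store as the row, one commit or rollback covers both.

```python
import sqlite3

# Toy stand-in: an "index" table of (bucket, doc_id) postings kept in the same
# database as the rows, so a single transaction covers both. Not a real ANN index.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, x REAL, y REAL)")
conn.execute("CREATE TABLE vec_index (bucket INTEGER, doc_id INTEGER)")


def bucket_for(x, y):
    """Hypothetical coarse partitioning: the quadrant the vector falls in."""
    return (1 if x >= 0 else 0) + (2 if y >= 0 else 0)


def insert_doc(doc_id, x, y):
    """Write the row and its index posting; the caller commits or rolls back."""
    conn.execute("INSERT INTO docs VALUES (?, ?, ?)", (doc_id, x, y))
    conn.execute("INSERT INTO vec_index VALUES (?, ?)", (bucket_for(x, y), doc_id))


def counts():
    """Return (number of rows, number of index postings)."""
    rows = conn.execute("SELECT COUNT(*) FROM docs").fetchone()[0]
    postings = conn.execute("SELECT COUNT(*) FROM vec_index").fetchone()[0]
    return rows, postings


insert_doc(1, 0.5, 0.5)
conn.commit()                 # row and posting become durable together
after_commit = counts()

insert_doc(2, -0.5, 0.25)
in_flight = counts()          # visible inside the open transaction only
conn.rollback()               # undoes the row and the posting together
after_rollback = counts()

print(after_commit, in_flight, after_rollback)
```

At no point do the two counts diverge, which is the property a sidecar index kept in a separate system cannot offer without extra coordination.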

Why it's hard

Mainstream ANN indices were designed as read-heavy sidecars with bulk-rebuild or loose-incremental semantics:

  • HNSW — RAM-bound, no incremental reorganisation.
  • DiskANN — SSD-resident, but incremental update is inefficient and "hard to map to transactional SQL semantics" (PlanetScale).
  • SPANN as published — offline posting-list reorganisation.

Making an ANN index transactional means:

  • every mutation is a commit-gated operation recorded in the host engine's transaction log;
  • rolling back a transaction must undo its index changes;
  • crash recovery must replay index changes together with the rest of the write-ahead log (WAL);
  • concurrent background maintenance operations must interleave safely with user transactions.
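The first three requirements can be sketched in a few lines of Python. Everything here is hypothetical and for illustration only: `ToyIndex`, its `put` operation, and the tuple-based log format are invented names, and the sketch does not reflect InnoDB's or SPFresh's actual data structures. It also leaves out the hardest requirement, safe interleaving of background maintenance with user transactions.

```python
from dataclasses import dataclass, field


@dataclass
class ToyIndex:
    """Hypothetical in-memory index mapping vector id -> posting-list id."""
    postings: dict = field(default_factory=dict)
    log: list = field(default_factory=list)    # stand-in for the engine's WAL
    _undo: list = field(default_factory=list)  # undo records for the open txn
    _txn: int = 0

    def begin(self):
        self._txn += 1
        self._undo.clear()
        return self._txn

    def put(self, vec_id, posting):
        # Remember the previous value so rollback can restore it.
        self._undo.append((vec_id, self.postings.get(vec_id)))
        self.postings[vec_id] = posting
        self.log.append(("put", self._txn, vec_id, posting))

    def commit(self):
        # The commit record is what makes the transaction's puts count.
        self.log.append(("commit", self._txn))
        self._undo.clear()

    def rollback(self):
        # Apply undo records newest-first.
        while self._undo:
            vec_id, old = self._undo.pop()
            if old is None:
                self.postings.pop(vec_id, None)
            else:
                self.postings[vec_id] = old


def recover(log):
    """Rebuild index state from the log, replaying only committed transactions."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    state = {}
    for rec in log:
        if rec[0] == "put" and rec[1] in committed:
            _, _, vec_id, posting = rec
            state[vec_id] = posting
    return state


idx = ToyIndex()
idx.begin(); idx.put(1, 10); idx.commit()
idx.begin(); idx.put(1, 20); idx.put(2, 30); idx.rollback()
live = dict(idx.postings)      # only the committed put survives the rollback

idx.begin(); idx.put(3, 40)    # simulate a crash before commit
recovered = recover(idx.log)   # the uncommitted put is not replayed
print(live, recovered)
```

Both the rolled-back transaction and the one interrupted by the simulated crash leave no trace in the recovered state, because replay is gated on the commit record.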

Canonical wiki instance

PlanetScale's 2024-10-22 vectors public-beta announcement introduces an extension of SPFresh that adds transactional support to all SPFresh operations and integrates it inside InnoDB. PlanetScale verbatim:

"inserts, updates, and deletes of vector data are immediately reflected in the vector index as part of committing your SQL transaction, and follow the same transactional semantics, including support for batch commits and rollbacks."

sources/2024-10-22-planetscale-planetscale-vectors-public-beta

The index "survives process crashes with strong consistency guarantees" because it inherits InnoDB's crash recovery; it "does not need to be periodically rebuilt" because SPFresh's background maintenance is continuous; and it "scale[s] all the way into terabytes, just like any other MySQL table" because it rides InnoDB's page storage.

Why the shape matters

A transactional vector index is the architectural prerequisite for treating a vector column as a first-class relational primitive — usable inside JOIN / WHERE / subqueries, composable with transactional business logic, sharded via the existing horizontal-sharding layer (Vitess), accessible via standard MySQL drivers. Without this property, vector search is architecturally a sidecar.
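As a rough illustration of what "usable inside JOIN / WHERE" looks like, the sketch below again uses Python's `sqlite3` module as a stand-in engine. The assumptions are substantial: the `l2_distance` function is a user-defined, brute-force distance registered for this example, the JSON-in-TEXT embedding storage is invented, and none of this is PlanetScale's or MySQL's vector syntax. There is no ANN index involved; the point is only the query shape, where similarity ordering composes with an ordinary relational join and filter in one statement.

```python
import json
import math
import sqlite3


def l2_distance(a_json, b_json):
    """Brute-force Euclidean distance between two JSON-encoded vectors."""
    a, b = json.loads(a_json), json.loads(b_json)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


conn = sqlite3.connect(":memory:")
conn.create_function("l2_distance", 2, l2_distance)
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, embedding TEXT);
    CREATE TABLE stock (product_id INTEGER, qty INTEGER);
    INSERT INTO products VALUES
        (1, 'kettle', '[0.9, 0.1]'),
        (2, 'teapot', '[0.8, 0.2]'),
        (3, 'hammer', '[0.1, 0.9]');
    INSERT INTO stock VALUES (1, 0), (2, 5), (3, 7);
""")

# Nearest in-stock products to the query vector: the similarity ordering and
# the relational filter are evaluated together in one SQL statement.
query = json.dumps([1.0, 0.0])
rows = conn.execute(
    """
    SELECT p.name
    FROM products AS p
    JOIN stock AS s ON s.product_id = p.id
    WHERE s.qty > 0
    ORDER BY l2_distance(p.embedding, ?)
    LIMIT 2
    """,
    (query,),
).fetchall()
names = [name for (name,) in rows]
print(names)
```

The closest vector overall (the kettle) is excluded because it is out of stock, something a sidecar vector store could only achieve by over-fetching candidates and filtering afterwards in application code.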

Seen in
