PlanetScale — Achieving data consistency with the consistent lookup Vindex

Summary

PlanetScale / Vitess engineering (Harshit Gangal and Deepthi Sigireddi, two core Vitess maintainers) walk through how Vitess's Consistent Lookup Vindex maintains a consistent view across a sharded MySQL data table and its cross-shard lookup Vindex tables without using 2PC (Two-Phase Commit) for every DML. The mechanism is a careful locking + ordered-commit protocol over three MySQL connections per DML statement — Pre, Main, Post — committed in that order, with rollback in the reverse order on failure. The guarantee provided is weaker than full atomicity of the data-row write and the lookup-row write: lookup rows may be inconsistent with the user table, but every user-facing query returns results consistent with the user table.

Key takeaways

  • Vindexes are Vitess's sharding-and-routing primitive: they map a user row to a keyspace_id (the shard address). Primary Vindexes decide where a row lives (they drive sharding); Secondary Vindexes decide where a row is found by a column other than the sharding key, by materialising a column → keyspace_id lookup table (Source: sources/2026-04-21-planetscale-achieving-data-consistency-with-the-consistent-lookup-vindex).

  • The cross-shard-transaction problem: the user data table and each lookup Vindex table are sharded by different columns (user.id, name_user_vdx.name, phone_user_vdx.phone), so one logical DML may land on different shards for the data and lookup rows. "To maintain consistency between the user data table and the Vindex tables, all these operations will need to occur in a transaction that spans multiple shards. This means we need to implement a costly protocol like 2PC (Two Phase Commit) to guarantee Atomicity and Isolation (A and I from ACID). Not using a proper multi-shard transaction for these operations can lead to partial commit and inconsistent data."

  • Consistent Lookup Vindex avoids 2PC via careful locking and transaction sequencing. Vitess uses three connections per DML — Pre, Main, Post — with a fixed commit order: Pre → Main → Post. If a commit fails, the still-open transactions are rolled back in reverse order. See patterns/ordered-commit-without-2pc.
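A minimal Python sketch of that ordering (not Vitess source — `Conn` and `ordered_commit` are illustrative names, and real connections are MySQL transactions, not objects):

```python
log = []  # shared event log so the ordering is observable

class Conn:
    """Toy stand-in for one of the three per-DML MySQL connections."""
    def __init__(self, name, fail_on_commit=False):
        self.name = name
        self.fail_on_commit = fail_on_commit

    def commit(self):
        if self.fail_on_commit:
            raise RuntimeError(f"{self.name} commit failed")
        log.append(("commit", self.name))

    def rollback(self):
        log.append(("rollback", self.name))

def ordered_commit(pre, main, post):
    # Commit in the fixed order Pre -> Main -> Post.
    conns = [pre, main, post]
    for i, conn in enumerate(conns):
        try:
            conn.commit()
        except RuntimeError:
            # A commit failed: connections before index i are already durable;
            # roll back the still-open transactions in reverse order.
            for open_conn in reversed(conns[i:]):
                open_conn.rollback()
            return False
    return True

# Happy path: Pre, Main, Post all commit, in that order.
assert ordered_commit(Conn("pre"), Conn("main"), Conn("post"))

# Main's commit fails: Pre is already durable (for an insert, this is how
# orphan lookup rows can arise); Post and Main roll back.
log.clear()
ordered_commit(Conn("pre"), Conn("main", fail_on_commit=True), Conn("post"))
assert log == [("commit", "pre"), ("rollback", "post"), ("rollback", "main")]
```

The key asymmetry: there is no prepare phase, so a participant that has already committed cannot be undone — which is exactly why the protocol's guarantee is weaker than 2PC.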

  • Delete flow (delete from user where id = 100):

  • Main: select id, name, phone from user where id = 100 for update
  • Post: delete from name_user_vdx where name='Alex' and id=100 + delete from phone_user_vdx where phone=8877991122
  • Main: delete from user where id = 100

On commit, Pre → Main → Post. If Main succeeds and Post fails, the lookup Vindex rows remain pointing at a now-deleted user row. The user-facing query select count(*) from user where name='Alex' still returns 0: lookup resolves Alex → keyspace_id 0x313030, query routes to that shard, but no matching row exists on that shard. "The lookup Vindex table may be inconsistent with the user table but the results returned for the query remained consistent with the user table." This is the system's central invariant, and it explains why 2PC is avoidable — Vitess is willing to tolerate orphan lookup rows (concepts/orphan-lookup-row) because the data table is authoritative.
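That invariant can be shown with a toy routing model (all names and structures hypothetical — real routing happens in VTGate, and shards are MySQL instances, not dicts):

```python
# Shards keyed by keyspace_id; each holds the user rows living on that shard.
# User id=100 ('Alex') has already been deleted from shard 0x313030.
shards = {"0x313030": {}}

# Orphan left behind: the Post-connection lookup delete failed after the
# Main-connection user delete committed.
name_user_vdx = {"Alex": "0x313030"}

def count_users_by_name(name):
    """Route via the lookup Vindex, then answer from the user table."""
    ks = name_user_vdx.get(name)
    if ks is None:
        return 0  # no lookup row: the query routes nowhere
    shard = shards[ks]
    # The shard's user table is authoritative: the orphan lookup row only
    # decided *where* to look, never *what* to return.
    return sum(1 for row in shard.values() if row["name"] == name)

assert count_users_by_name("Alex") == 0  # consistent with the user table
```

The orphan misroutes the query to a shard that (correctly) holds no matching row, so the stale lookup entry never surfaces in results.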

  • Insert flow (insert into user(300, 'Emma', 8877991122, 'xyz@mail.com')):
  • Pre: insert into name_user_vdx ... (non-unique — no error) + insert into phone_user_vdx(phone=8877991122 ...) (unique — may hit a duplicate-key error from a leftover orphan).
  • Main: insert into user(300, 'Emma', 8877991122, ...).

  • Duplicate-key error recovery (the canonical orphan-row handling): when the Pre step's unique-column insert collides with a leftover orphan from a prior failed delete, Vitess does not fail the DML. Instead it:

  • Pre: select phone, keyspace_id from phone_user_vdx where phone=8877991122 for update — row-lock the orphan.
  • Main: select phone from user where phone=8877991122 for update — verify no user row exists.
  • Pre: update phone_user_vdx set keyspace_id='0x333030' where phone=8877991122 — repoint the orphan at the new row.

The for update locks on both the lookup-row and the (absent) user-row prevent races with concurrent DMLs. The verification step is what makes the reclamation safe: the algorithm repoints an orphan only if the data table confirms the row it points to doesn't exist.
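A condensed sketch of that reclamation logic (assumption: the for update row locks are held for the whole sequence — lock acquisition is elided here, and `reclaim_orphan` is an illustrative name, not a Vitess function):

```python
phone_user_vdx = {8877991122: "0x313030"}  # orphan from a prior failed delete
user_rows_by_phone = {}                     # no live user row owns this phone

def reclaim_orphan(phone, new_keyspace_id):
    # Pre: select ... for update -> lock the existing lookup row.
    orphan_ks = phone_user_vdx.get(phone)
    if orphan_ks is None:
        return False  # nothing to reclaim; the normal insert path applies
    # Main: select ... for update -> verify no user row owns this phone.
    # This verification is what makes repointing safe.
    if phone in user_rows_by_phone:
        raise ValueError("genuine duplicate: a live user row owns this phone")
    # Pre: update ... -> repoint the orphan at the new row's keyspace_id.
    phone_user_vdx[phone] = new_keyspace_id
    return True

assert reclaim_orphan(8877991122, "0x333030")
assert phone_user_vdx[8877991122] == "0x333030"
```

Note the failure branch: if a live user row does own the phone, the duplicate-key error is genuine and the DML must fail — only confirmed orphans are reclaimed.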

  • Update Vindex column = delete + insert: updates to a Vindex column are implemented as a delete followed by an insert. Delete rides Post, insert rides Pre.

  • No-op optimisation for identity updates: if the DML would update a Vindex column to its existing value, Vitess turns the Vindex update into a no-op. Without this, the Post-connection delete would hold a row lock on the lookup-table row, and the Pre-connection insert would hit a lock-wait timeout trying to re-insert it — the same transaction deadlocking against itself across two connections.

  • Known limitation: the identity-update no-op isn't enough if the same Vindex value is deleted and inserted as two separate statements in the same transaction. That case can still hit a lock-wait timeout between the Post connection's delete-row-lock and the Pre connection's re-insert. "However, it is still possible to run into this limitation if the same lookup Vindex value is deleted and inserted as two different statements inside the same transaction."
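The no-op check itself is the simple part; a hedged sketch of the planner decision (hypothetical `plan_vindex_update` — Vitess's actual planner is far richer than this):

```python
def plan_vindex_update(old_value, new_value):
    """Plan the lookup-Vindex side of an UPDATE on a Vindex column.

    An update to a Vindex column is executed as delete + insert: the delete
    rides the Post connection, the insert rides the Pre connection.
    """
    if old_value == new_value:
        # Identity update: skip delete+insert entirely. Otherwise Post's
        # delete would row-lock the lookup row and Pre's re-insert would
        # block on it -- the same transaction deadlocking against itself.
        return []
    return [("post", "delete", old_value), ("pre", "insert", new_value)]

assert plan_vindex_update("Alex", "Alex") == []
assert plan_vindex_update("Alex", "Emma") == [
    ("post", "delete", "Alex"),
    ("pre", "insert", "Emma"),
]
```

The known limitation follows directly: this check sees one statement at a time, so a delete and an insert of the same value issued as two separate statements in one transaction bypass it.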

Systems extracted

  • systems/vitess — the sharding layer providing the Consistent Lookup Vindex mechanism, VTGate query planning, and VTTablet connection management.
  • systems/mysql / systems/innodb — the host database engine providing row-level locking (exclusive record locks — InnoDB's lock_mode X on the record but not the gap) and transactional INSERT / UPDATE / DELETE semantics that the three-connection protocol composes over.
  • systems/planetscale — the commercial managed-Vitess product where Consistent Lookup Vindex is a production customer-facing feature.

Concepts extracted

  • concepts/vindex — "Vitess uses Vindexes (short for Vitess Index) to associate rows in a table with a designated address known as Keyspace ID." Primary Vindex drives the shard-placement of a table; Secondary Vindex is a global index materialising a column → keyspace_id map for non-sharding-key lookups.
  • concepts/consistent-lookup-vindex — a Secondary Vindex variant that gives consistent user-facing query results (not full ACID across the data table + lookup table) without requiring 2PC.
  • concepts/keyspace-id — Vitess's shard-address identifier: a binary(8) value per row that the router uses to send a query to its destination shard.
  • concepts/orphan-lookup-row — a lookup-Vindex row whose referenced user row no longer exists (e.g. left behind when a Post-connection delete on the lookup failed after the Main delete on the user table succeeded). Not a correctness bug: user-facing queries stay consistent because they always consult the user table. Cleaned up on-demand by the duplicate-key-reclamation path at insert time.
  • concepts/secondary-index — the relational-engine building block that Vindex tables are, when observed from inside a single shard; phone_user_vdx is literally a MySQL table with phone as primary key.
  • concepts/row-level-lock-contention — the for update locks used by the duplicate-key-reclamation path and the Main select … for update scan inherit InnoDB's row-level locking semantics; the identity-update → no-op optimisation exists specifically to avoid deadlocking on them.

Patterns extracted

  • patterns/ordered-commit-without-2pc — Pre / Main / Post connection ordering with asymmetric rollback as an alternative to 2PC for the specific case where one participant (the user data table) is authoritative and user-facing queries route through it. Weaker guarantee than 2PC (orphan lookup rows are possible), but dramatically cheaper on the happy path (no prepare phase, no coordinator round-trip) and self-healing on the error path (orphans are reclaimed lazily on the next insert).

Operational numbers

  • None disclosed in this post. No QPS / latency / failure rate measurements. The post is a mechanism disclosure, not a production retrospective.

Caveats

  • Post is tutorial-voice: it walks through a minimal worked example with one user table and two lookup tables (name_user_vdx non-unique, phone_user_vdx unique). Real Vitess deployments may have many more lookup Vindexes per table plus compound-key lookups, and the post doesn't discuss how the three-connection protocol scales with lookup count (is it three connections per DML, or three per Vindex touched?).

  • No atomicity guarantee spelled out: the post asserts the user-query-consistency invariant but doesn't formally state what isolation level user-facing reads see, or how long orphans can persist in the worst case. In pathological deployments (write-heavy + high Post failure rate), orphan count could grow.

  • The deadlock limitation is real: "it is still possible to run into this limitation if the same lookup Vindex value is deleted and inserted as two different statements inside the same transaction." Customers must structure DML not to trip this; the limitation isn't hidden but isn't fully specified either (exact lock-wait timeout, recovery path).

  • No comparison to 2PC cost numbers: the post asserts 2PC is "costly" but doesn't quantify in microseconds or QPS reduction. The argument is qualitative.

  • No failure-mode enumeration: the Main succeeds + Post fails path is walked through; Pre succeeds + Main fails and Pre fails are not analysed.

  • Tier-3 pedagogy post — from the PlanetScale / Vitess-core voice, high architectural density but not a war story. Paired with the 2026-02-16 zero-downtime migrations post, this is the second canonical Vitess-internals disclosure on the wiki from Vitess core maintainers (Matt Lord was the author of that post; this one is Harshit Gangal + Deepthi Sigireddi). Extends the wiki's Vitess-subsystem coverage from the data-motion layer (VReplication / VDiff / MoveTables) to the query-routing + transactional-write layer (Vindexes / Consistent Lookup Vindex).
