Skip to content

CONCEPT Cited by 1 source

Consistent Lookup Vindex

A Consistent Lookup Vindex is a Vitess Secondary Vindex variant that keeps user-facing query results consistent with the authoritative data table without running 2PC (Two-Phase Commit) on every DML. It does this by ordering commits across three MySQL connections (Pre, Main, Post) and tolerating orphan lookup rows that user queries can never observe.

The problem it solves

Vitess shards the user data table by its Primary-Vindex column (user.id) and shards each Lookup-Vindex table by the lookup column (name_user_vdx by name, phone_user_vdx by phone). One logical DML writes rows on different shards for different columns, so the writes are cross-shard.

"To maintain consistency between the user data table and the Vindex tables, all these operations will need to occur in a transaction that spans multiple shards. This means we need to implement a costly protocol like 2PC (Two Phase Commit) to guarantee Atomicity and Isolation (A and I from ACID). Not using a proper multi-shard transaction for these operations can lead to partial commit and inconsistent data."

sources/2026-04-21-planetscale-achieving-data-consistency-with-the-consistent-lookup-vindex

The central invariant (weaker than 2PC, but sufficient)

Consistent Lookup Vindex does not guarantee atomicity between the user-table write and the lookup-table write. It guarantees only:

"The lookup Vindex table may be inconsistent with the user table but the results returned for the query remained consistent with the user table."

This works because the user table is authoritative for user-facing queries. The lookup table's job is only to route the query to the right shard; once the query lands on that shard, the final filter on the user table gives the correct answer. An orphan lookup row just causes a query to visit a shard where the row no longer exists — the query returns zero rows, consistent with the user table.

The three-connection protocol

Vitess opens three MySQL connections per DML:

  • Pre: lookup-table inserts (and reclamation updates).
  • Main: user-table DML (the authoritative write).
  • Post: lookup-table deletes.

On commit: PreMainPost. On failure: rollback in the same order the open transactions would have committed in. See patterns/ordered-commit-without-2pc for the full pattern treatment.

Delete flow

delete from user where id = 100:

  1. Main: select id, name, phone from user where id=100 for update — lock + read the soon-to-be-deleted lookup columns.
  2. Post: delete from name_user_vdx where name='Alex' and id=100 + delete from phone_user_vdx where phone=8877991122.
  3. Main: delete from user where id = 100.

If Main succeeds and Post fails, the user row is gone but lookup rows remain (orphans). User-facing queries (select count(*) from user where name='Alex') resolve Alex → keyspace_id 0x313030, route to that shard, find no matching user row, return 0. Correct result, even though the lookup table is stale.

Insert flow

insert into user(300, 'Emma', 8877991122, 'xyz@mail.com'):

  1. Pre: insert into name_user_vdx(name, id, keyspace_id) values('Emma', 300, '0x333030') — no error (name is non-unique).
  2. Pre: insert into phone_user_vdx(phone, keyspace_id) values(8877991122, '0x333030') — may hit Duplicate key error if an orphan exists for this phone.
  3. Main: insert into user(300, 'Emma', 8877991122, ...).

Duplicate-key recovery: orphan-row reclamation

When step 2 collides with an orphan, Vitess does not fail the DML. It:

  1. Pre: select phone, keyspace_id from phone_user_vdx where phone=8877991122 for updaterow-lock the orphan to block concurrent DMLs.
  2. Main: select phone from user where phone=8877991122 for update — verify no user row exists with this phone.
  3. Pre: update phone_user_vdx set keyspace_id='0x333030' where phone=8877991122repoint the orphan at the new row.

The double for update (on the orphan row and on the absent user row) is what makes reclamation safe: the algorithm repoints an orphan only if the authoritative table confirms the row it points to doesn't exist.

Update flow

Vindex-column updates are implemented as delete + insert: delete rides Post, insert rides Pre.

Identity-update no-op (and its limitation)

Special case: if the update would set a Vindex column to its existing value, Vitess turns it into a no-op. Reason: without this optimisation, the Post connection would hold an X row-lock on the lookup-table row from the delete, and the Pre connection would lock wait timeout trying to re-insert the same row. The transaction would deadlock against itself across two of its own connections.

The limitation:

"However, it is still possible to run into this limitation if the same lookup Vindex value is deleted and inserted as two different statements inside the same transaction."

Customers are expected not to structure DML this way, but the corner case isn't recovered automatically.

Cost vs 2PC

The post argues the three-connection protocol is dramatically cheaper on the happy path than 2PC (no PREPARE phase, no transaction-coordinator round-trip) and self-healing on the error path (orphans are lazily reclaimed on next insert). No microsecond or QPS numbers are disclosed, but the design tension is a correctness-guarantee weakening in exchange for coordinator-free DML cost — well-established in the distributed-database literature.

Seen in

Last updated · 319 distilled / 1,201 read