Skip to content

CONCEPT Cited by 1 source

Orphan lookup row

An orphan lookup row is a row in a Vindex lookup table whose referenced user row no longer exists — typically because the Consistent Lookup Vindex's Post connection failed to delete the lookup row after the Main connection successfully deleted the user row.

Why orphans are not a correctness bug

Vitess's Consistent Lookup Vindex explicitly tolerates orphans because the user data table is authoritative for user-facing queries. A query like SELECT COUNT(*) FROM user WHERE name='Alex' resolves:

  1. Lookup: name_user_vdx says Alex → keyspace_id 0x313030.
  2. Route: query sent to the shard owning 0x313030.
  3. Filter: that shard's user table has no row matching — returns 0.

The lookup row is stale (points at a row that was deleted) but the result returned is consistent with the user table:

"The lookup Vindex table may be inconsistent with the user table but the results returned for the query remained consistent with the user table."

sources/2026-04-21-planetscale-achieving-data-consistency-with-the-consistent-lookup-vindex

When orphans are created

  • Delete flow partial failure: Main commits the user row delete, Post fails before it commits the lookup-row deletes. The user row is gone; the lookup rows remain.
  • Vindex-column update partial failure: an update is implemented as delete-then-insert; if Post commits (delete on old lookup value) but Pre fails before commit (insert of new lookup value), the lookup table is in an inconsistent state with respect to the user row — a variant of the orphan problem.

When orphans are reclaimed (lazy cleanup)

Orphans are reclaimed lazily on the next insert that hits the same unique lookup-column value. The sequence (for phone_user_vdx where phone is a unique Secondary Vindex):

  1. New insert: insert into phone_user_vdx(phone=8877991122, keyspace_id='0x333030')Duplicate key error.
  2. Pre: select phone, keyspace_id from phone_user_vdx where phone=8877991122 for update — row-lock the orphan.
  3. Main: select phone from user where phone=8877991122 for update — verify no user row owns this phone.
  4. Pre: update phone_user_vdx set keyspace_id='0x333030' where phone=8877991122 — repoint the orphan at the new row.

The verification step (step 3) is what makes reclamation safe: the orphan is repointed only if the authoritative data table confirms the row it points to doesn't exist. A concurrent insert that did re-create a user row with the same phone would hold the for update lock first and serialise.

Why non-unique Secondary Vindexes don't need this recovery path

The reclamation path only runs on unique lookup columns, because only uniqueness constraints generate duplicate-key errors. Non-unique lookup rows (like name_user_vdx where multiple users can share a name) accumulate orphans on delete-partial-failure too, but no insert collides with them. They persist as dead rows until garbage-collected out of band (Vitess doesn't document a scrubber in this post — whether orphans on non-unique Vindexes are ever reclaimed is a caveat).

Garbage-collection / scrubbing

The post doesn't describe an active scrubber. Orphans on unique lookup tables self-heal on the next colliding insert; orphans on non-unique lookup tables potentially live forever unless cleaned up by an out-of-band job. In pathological deployments with high Post-failure rate and heavy non-unique-Vindex usage, orphan count could grow unbounded — a caveat worth flagging for any production deployment.

Seen in

Last updated · 319 distilled / 1,201 read