Skip to content

PLANETSCALE 2020-10-04 Tier 3

Read original ↗

PlanetScale — Pitfalls of isolation levels in distributed databases

Summary

Pedagogical-cum-architectural PlanetScale post by Sugu Sougoumarane (Vitess co-creator, PlanetScale co-founder, originally 2020-10-04, re-fetched via the 2026-04-21 feed snapshot) establishing the canonical wiki argument for minimising isolation-level strictness in distributed databases. Canvases the four ANSI-SQL-adjacent levels (Serializable, RepeatableRead, SnapshotRead/ MVCC, ReadCommitted — plus ReadUncommitted) against two representative application transactions — a bank-withdrawal (contention-required) and an international-retail-order-with-exchange-rate (contention-free) — to motivate the load-bearing claim: stricter isolation couples events that could otherwise execute independently, and in a distributed database this coupling bounds scalability. Canonical reframing: "The more loosely coupled components are in a distributed system, the better it scales. This rule applies to distributed databases, too, and the isolation level plays a big part in this." Three canonical contributions beyond the engine-pedagogy posts already on the wiki: (1) explicit reframing of isolation levels as coupling cost, not just correctness-vs-throughput; (2) introduces the ReadPartialCommits proposal — a new isolation level weaker than ReadUncommitted purpose-built for 2PC-era applications that can tolerate reading the partial effects of an in-flight distributed commit; (3) canonicalises the on-demand-Serializable- read pattern (SELECT ... FOR SHARE / FOR UPDATE at SnapshotRead or ReadCommitted baseline) as the load-bearing lever that makes the lower levels practical. Closes with the three-part prescription: prefer ReadCommitted as the baseline, avoid multi-statement transactions when possible, and avoid distributed transactions ( 2PC) by co-locating related rows on the same shard.

Key takeaways

  1. Isolation level is a distributed-systems coupling lever, not just a concurrency correctness knob. Sougou opens with the framing that reframes the whole post: "The more loosely coupled components are in a distributed system, the better it scales. This rule applies to distributed databases, too, and the isolation level plays a big part in this." Stricter levels hold locks longer, which both (a) serialises concurrent transactions that didn't need to be serialised and (b) in a cross-shard context forces the system to coordinate timestamps or locks across nodes. Canonical new concept concepts/distributed-isolation-coupling-cost. (Source: article §intro.)

  2. Serializable is theoretically pure but practically unusable, even on a single node. Sougou gives the canonical reason verbatim: "Unfortunately, Serializable is generally considered to be impractical, even for a non-distributed database. It is not a coincidence that all the existing popular databases like Postgres and MySQL recommend against it." Two failure modes: (a) excess locking — a Serializable order-creation transaction holds a read lock on the exchange_rate row for its entire lifetime (possibly many seconds, spanning inventory checks and credit checks), blocking the unrelated exchange-rate- updater process; (b) frequent deadlocks — two concurrent withdrawals each SELECT a balance (shared read lock) then try to UPDATE (upgrade to exclusive), each blocked by the other's read lock. Canonical wiki framing of Serializable's two structural failure modes. (Source: article §"Serializable".)

  3. RepeatableRead inherits Serializable's problems — and then some. "The RepeatableRead setting is an ambiguous one. This is because it differentiates point selects from searches, and defines different behaviors for each. This is not black and white, and has led to many different implementations." Sougou explicitly declines to deep-dive RepeatableRead because "as far as our use cases are concerned, RepeatableRead offers the same guarantees as Serializable and consequently inherits the same problems." (Source: article §"RepeatableRead".) Per the existing wiki coverage at concepts/mysql-transaction-isolation-levels, MySQL's RepeatableRead is specifically stronger than the SQL-standard definition via gap locks.

  4. SnapshotRead (MVCC) is excellent for read-only workloads but "no better than ReadCommitted for write workloads." Canonical framing verbatim: "For transactions that perform writes, the snapshot feature is not that useful. What you mainly want to control is whether to allow a value to change after the last read. If you want to allow the value to change, then it is going to be stale as soon as you read it because someone else can update it later. So, it doesn't matter if you read from a snapshot or get the latest value. If you do not want it to change, you want the latest value, and the row must be locked to prevent changes." Read-only use cases where SnapshotRead pays for itself: consistent multi-table reads for analytics, and Change Data Capture"You can also use the snapshot feature to read multiple tables as of a certain time, and then later observe the changes that have occurred since that snapshot. This functionality is convenient for Change Data Capture tools that want to stream changes out to an analytics database." MySQL's REPEATABLE READ is clarified as actually being SnapshotRead: "MySQL supports the SnapshotRead isolation level by default, but misleadingly calls it REPEATABLE_READ." (Source: article §"SnapshotRead".)

  5. Distributed SnapshotRead is especially costly. "Although a single database has many ways of implementing Repeatable Reads efficiently, the problem becomes more complex in the case of distributed databases. This is because transactions can span multiple shards. If so, a strict ordering guarantee must be provided by the system. Such ordering either requires the system to use a centralized concurrency control mechanism or a globally consistent clock. Both these approaches essentially attempt to tightly couple events that could have otherwise executed independent of each other." Canonical datum under concepts/distributed-isolation-coupling-cost: a distributed SnapshotRead requires either a centralized serialization point or a TrueTime-like global clock — both of which defeat the structural premise of a distributed database. (Source: article §"SnapshotRead".)

  6. ReadCommitted is the practical sweet spot because it is contention-free by default and [[patterns/ on-demand-serializable-read|upgradeable to Serializable on demand]] via locking reads. "The ReadCommitted isolation is less ambiguous than SnapshotRead because it continuously returns the latest view of the database. This is also the least contentious of the isolation levels. At this level, you may get a different value every time you read a row. … The ReadCommitted setting also allows you to upgrade your read by issuing a read or write lock, effectively providing you with the ability to perform on-demand Serializable reads. As explained previously, this approach gives you the best of both worlds for application transactions that intend to modify data. The default isolation level supported by Postgres is ReadCommitted." Canonical new pattern patterns/on-demand-serializable-read — the SELECT ... FOR UPDATE / LOCK IN SHARE MODE lever canonicalised elsewhere on the wiki as locking reads is the structural reason lower isolation levels work in practice: the application pays the locking cost only on the rows where it matters, not globally. (Source: article §"ReadCommitted".)

  7. SELECT ... FOR UPDATE (pessimistic write-lock) is preferable to LOCK IN SHARE MODE (pessimistic read-lock) for the bank-withdrawal racing case. Sougou canonicalises a subtle-but-load-bearing distinction: "This lock prevents another transaction from obtaining any kind of lock on this row. This approach of pessimistic locking sounds worse at first, but will allow two racing transactions to successfully complete without encountering a deadlock. The second transaction will wait for the first transaction to complete, at which point it will read and lock the row as of the new value." The read-lock version re-creates the Serializable deadlock shape (both transactions acquire the read lock, then race to upgrade to write). The write-lock version serialises the transactions cleanly without deadlock risk — counter-intuitively, the stronger lock is the safer choice. (Source: article §"ReadCommitted".)

  8. *ReadUncommitted is "generally considered unsafe and is not recommended for distributed or non-distributed settings. This is because you may read data that might have later been rolled back (or never existed in the first place)." Canonical wiki justification for why even the lowest ANSI-SQL level is rarely chosen in practice. (Source: article §"ReadUncommitted".)

  9. Distributed transactions ( 2PC) interact with isolation levels in ways most applications don't need. Sougou enumerates the cost: "In a distributed system, if two rows are in different shards or databases, and you want to atomically modify them in a single transaction, you incur the overhead of a two-phase commit (2PC). This requires substantially more work: Metadata about the distributed transaction is created and saved to durable storage. A prepare is issued to all individual transactions. A decision to commit is saved to the metadata. A commit is issued to the prepared transactions." Then: "A distributed transaction also interacts with the isolation level. For example, let us assume that only the first commit of a 2PC transaction has succeeded and the second commit is delayed. If the application has read the effects of the first commit, then the database must prevent the application from reading the rows of the second commit until completion. Flipping this around, if the application has read a row before the second commit, then it must not see the effects of the first commit." Canonical framing of how 2PC forces extra visibility-coordination work beyond the commit itself. Availability consequence: "excessive use of 2PC reduces the overall availability and latency of a system. This is because your effective availability will be dictated by the worst performing shard." (Source: article §"Distributed transactions".)

  10. Proposal: a new isolation level, ReadPartialCommits. Canonical contribution — Sougou argues for a new named level, weaker than ReadUncommitted, that lets 2PC-era applications opt out of the in-flight- distributed-commit-visibility machinery: "What if the application could tolerate these partial commits? Then we are doing unnecessary work that the application doesn't care about. It may be worth introducing a new isolation level like ReadPartialCommits. Note that this is different from ReadUncommitted where you may read data that may eventually be rolled back." The distinction from ReadUncommitted is load-bearing: ReadPartialCommits only exposes committed rows, but does not wait for the other participants of a distributed transaction to commit before exposing the locally-committed rows. Partial rather than uncommitted. (Source: article §"Distributed transactions".)

  11. The prescription (canonical): "To be scalable, an application should avoid relying on any advanced isolation features of a database. It should instead try to use as few of the guarantees as it can. If you can write an application to work with ReadCommitted isolation level, then moving to SnapshotRead should be discouraged. Serializable or RepeatableRead are almost always a bad idea. It is also better to avoid multi-statement transactions if possible. … If using a sharded database, avoid distributed transactions. This can be achieved by keeping related rows within the same shard." Canonicalised as patterns/lowest-isolation-plus-explicit-locks. Connects to the already-canonical ordered-commit- without-2PC primitive (Vitess Consistent Lookup Vindex, 2026-04-21 same-day ingest) as the shard-topology lever. (Source: article §"Conclusion".)

  12. Meta-principle: "this is something that one must do from the beginning, because it is very hard to refactor a non-concurrent program to be concurrent." Sougou explicitly flags this as an exception to the "no premature optimisation" rule. Concurrency constraints propagate through the application at design time and are not localised refactorings. (Source: article §"Conclusion".)

Use cases worked in the article

Bank withdrawal (contention-required)

BEGIN TRANSACTION
  balance := SELECT balance FROM user WHERE id = ?
  INSERT INTO activity (...) VALUES (?, ?)
  UPDATE user SET balance = balance - amount WHERE id = ?
COMMIT

The balance must not change between the read and the write. Per-isolation behaviour:

  • Serializable: Correct. Deadlock-prone under concurrent withdrawals on the same account.
  • SnapshotRead: Unsafe without an explicit lock — the snapshot may be stale. Fix: SELECT ... LOCK IN SHARE MODE (MySQL) or SELECT ... FOR UPDATE. Sougou recommends FOR UPDATE to avoid the Serializable deadlock shape.
  • ReadCommitted: Same — unsafe without lock; correct with SELECT ... FOR UPDATE.

International retail order (contention-free)

BEGIN TRANSACTION
  rate := SELECT rate FROM exchange_rate WHERE currency = ?
  -- possibly long-running: inventory check, credit check
  INSERT INTO order (...) VALUES (?, ?, ?, ?)
COMMIT

A separate process continuously updates exchange_rate. The order transaction does not need the rate to stay stable. Per-isolation behaviour:

  • Serializable: The exchange-rate updater is blocked for the entire lifetime of the order transaction because of the lock held on the exchange_rate row. Unnecessary coupling — the application doesn't care about the rate's stability post-read.
  • SnapshotRead: No contention. Read returns the rate as of the snapshot; concurrent updates to exchange_rate proceed freely.
  • ReadCommitted: No contention. Read returns the latest committed rate; concurrent updates proceed freely.

Canonical framing: the Serializable overhead on the retail case is pure tax — the correctness property it provides is one the application does not require, and the cost is blocking an unrelated concurrent process.

Operational numbers / architectural datums

The post is theoretical and contains no production telemetry. Named architectural constants:

Caveats

  • Pedagogy-plus-proposal voice, not production retrospective — no Vitess-specific performance numbers, no customer workload before/after data, no measured deadlock rates on production MySQL under the four levels. The canonical framings are qualitative.
  • RepeatableRead is explicitly not deep-dived ("not black and white, and has led to many different implementations"). Sougou collapses it into "basically Serializable from the application's point of view," which is accurate for MySQL-InnoDB but misleading for the SQL-standard definition (SQL-standard RepeatableRead permits phantoms, Serializable does not — see the already-canonical concepts/phantom-read on the wiki).
  • SnapshotRead's write-skew anomaly is not named. SI is presented as "like ReadCommitted for writes" — accurate for the two worked use cases, but SI also permits write-skew anomalies that neither ReadCommitted nor Serializable exhibits. Applications moving from Serializable to SI should audit for write-skew separately.
  • ReadPartialCommits is a proposal, not a shipped level. No database-vendor has shipped this level as of the 2026-04-23 ingest date. Canonical wiki framing: this is a design-space pointer for distributed-database operators facing 2PC's visibility-coordination tax, not a usable isolation mode.
  • Lost-update anomaly not explicitly named. The withdrawal worked example is a classic lost-update shape. Sougou frames the FOR UPDATE fix correctly without using the Berenson-et-al terminology.
  • Post predates the same-day 2026-04-21 ingest sources/2026-04-21-planetscale-achieving-data-consistency-with-the-consistent-lookup-vindex by ~6 years but aligns architecturally: the consistent- lookup-vindex post canonicalises the ordered-commit- without-2PC shard-topology mechanism that Sougou's §"Distributed transactions" recommends at the application altitude ("keep related rows within the same shard").
  • 2020-10-04 original publication re-surfaced via the 2026-04-21 feed snapshot. The prescription ("avoid distributed transactions") was consistent with Vitess's 2PC support being explicitly experimental at the time; the 2026-04-21 Vitess 21 release notes sources/2026-04-21-planetscale-announcing-vitess-21 re-ship atomic distributed transactions with a revamped design that is still experimental — so the prescription remains current six years later.

Source

Last updated · 378 distilled / 1,213 read