Skip to content

PLANETSCALE 2024-01-08 Tier 3

Read original ↗

PlanetScale — MySQL isolation levels and how they work

Summary

Pedagogical PlanetScale post by Brian Morrison II (2024-01-08, re-surfaced via the 2026-04-21 feed snapshot) canonicalising the MySQL transaction-isolation-level model end-to-end: the four levels (READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, SERIALIZABLE); the three classic read phenomena / violations they each permit or forbid (dirty read, non-repeatable read, phantom read); the two underlying lock primitives (shared + exclusive) and the InnoDB-specific gap-locking mechanism used under REPEATABLE READ to prevent phantom reads; and locking reads (SELECT … FOR SHARE and SELECT … FOR UPDATE) as the developer-facing levers for upgrading an otherwise-non-locking read into a shared or exclusive lock on the fly. Frames the whole discussion under the I of the ACID acronym (Reuter & Härder, 1983) and motivates each level via a running CD-store worked example. Canonical datums: REPEATABLE READ is the MySQL default; InnoDB uses MVCC to give REPEATABLE READ a pinned-transaction-ID snapshot whereas READ COMMITTED creates a fresh snapshot per SELECT; gap locks use WHERE predicates to lock the space around matching rows, preventing concurrent inserts that would alter a re-run's results; SERIALIZABLE is REPEATABLE READ + an implicit shared lock on every SELECT, with attendant deadlock risk. Closes with the three mechanisms for setting isolation (SET TRANSACTION ISOLATION LEVEL … per-session, SET GLOBAL TRANSACTION ISOLATION LEVEL … cluster-wide for new sessions with CONNECTION_ADMIN, and --transaction-isolation=… / transaction-isolation = … at server startup).

Key takeaways

  1. MySQL's default isolation level is REPEATABLE READ, not READ COMMITTED. This diverges from PostgreSQL and SQL Server, which both default to READ COMMITTED. "Repeatable read is the default MySQL isolation level used for all connections unless configured otherwise." The practical consequence: on MySQL, every transaction implicitly gets a pinned snapshot and the stronger non-repeatable-read + phantom-read protection, even for code that didn't ask for them. Moving an application between engines without re-auditing transaction semantics is a known production footgun. (Source: article §"Repeatable read".)
  2. Three violations define the hierarchy. Isolation levels are characterised by which read phenomena they permit: dirty reads (reading uncommitted data from another transaction that may be rolled back), non-repeatable reads (the same SELECT returning different values within one transaction because another transaction committed a modification in between), and phantom reads (the same SELECT returning a different number of rows because another transaction inserted matching rows in between). The four isolation levels form a monotonic ladder of strictness — each higher level forbids everything the lower levels forbid plus one more class.
Level Dirty Non-repeatable Phantom
READ UNCOMMITTED
READ COMMITTED
REPEATABLE READ ✗*
SERIALIZABLE

*MySQL's REPEATABLE READ prevents phantom reads via gap locking on locking reads — this is a MySQL-specific strengthening of the SQL-standard definition, where REPEATABLE READ only forbids the first two. 3. InnoDB's REPEATABLE READ is MVCC-based with a pinned transaction ID. "With InnoDB's MVCC (Multi Version Concurrency Control), a row may have multiple versions at any given time, depending on open transactions. In repeatable read level, queries will use a consistent snapshot of the data, pinning the reads to a single transaction ID throughout the transaction." Compared to READ COMMITTED"where reads always pick the latest committed version of any row, and where subsequent reads may return different results" — the snapshot is taken once and reused for the transaction's lifetime. This is the same foundational substrate that snapshot isolation in Postgres + other databases rides on, at the REPEATABLE READ name. 4. InnoDB uses two lock types + gap locks on top. Shared (S) locks let other transactions read the locked row but not modify it; multiple shared locks can coexist on the same row. Exclusive (X) locks let the holding transaction read and write, and block all other access. Gap locks are a third kind, specific to InnoDB: "Gap locking is a special type of lock that is used by MySQL to prevent phantom reads. Gap locking will use criteria in where clauses to lock the space around the read data, preventing rows from being inserted that may alter the query if its run a second time." Worked example verbatim: a transaction running SELECT * FROM products WHERE release_year = 1999 FOR UPDATE; causes a subsequent INSERT INTO products (... release_year, ...) VALUES (..., 1999, ...) in another transaction to "wait until the previous one had completed before proceeding." 5. FOR SHARE and FOR UPDATE turn a non-locking read into a locking read. By default, SELECT does not acquire row-level locks on any isolation level below SERIALIZABLE. Appending FOR SHARE to a SELECT creates a shared lock on each returned row; FOR UPDATE creates an exclusive lock. Canonical use: "Here is a modified version of the 'dirty read' example from above that uses a locking read. This approach would prevent another transaction from reading the data before it was updated: START TRANSACTION; SELECT * FROM products WHERE id = 20 FOR UPDATE; UPDATE products SET cost = 800 WHERE id = 20; COMMIT;" See locking reads. 6. REPEATABLE READ on a locking read: index hit = narrow row locks, no index hit = table scan + row locks on every scanned row + gap locks. Non-obvious corner of InnoDB semantics canonicalised in the post: "If an index can be used based on the where condition, MySQL will only lock the necessary rows that match the query. If an index is NOT used and the table is scanned, MySQL will lock all of the rows it reads regardless if they match the where, as well as perform gap locking to prevent inserts that may alter the data if the query is run multiple times." Canonical operational implication: missing indexes inflate lock scope under REPEATABLE READ, not just query latency. 7. SERIALIZABLE is REPEATABLE READ + implicit shared lock on every SELECT — performance cost + elevated deadlock risk. "This mode works exactly like Repeatable read, but implicitly creates a shared lock on all select statements whether you use for share or not. Due to the excessive locks used, there is a greater risk of deadlocks occuring." The level is mathematically complete (no permitted violations) but pays the price via serialised access and worse concurrency under write pressure. Canonical guidance from the post: use READ UNCOMMITTED when "performance takes priority over data consistency" (social-media-likes count example); default to REPEATABLE READ; reach for SERIALIZABLE only when correctness requirements explicitly demand forbidding all three violations. 8. Three mechanisms for setting isolation, with different scopes. (1) SET TRANSACTION ISOLATION LEVEL … affects the next transaction in the current session (the SESSION keyword is optional but errors inside an open transaction); (2) SET GLOBAL TRANSACTION ISOLATION LEVEL … requires CONNECTION_ADMIN privilege and affects future sessions only, not current ones, and each session remains free to override; (3) --transaction-isolation=… CLI flag or [mysqld] transaction-isolation = … config-file directive sets the server default before MySQL starts. Canonical operational datum: isolation level is a session-scoped knob by design — application code can (and commonly does) change it per-transaction for specific workloads.

Systems / concepts / patterns extracted

  • Systems: MySQL (the relational database whose isolation-level semantics the post canonicalises), InnoDB (MySQL's default storage engine and the actual implementer of MVCC + snapshot semantics + shared/exclusive/gap locks), PlanetScale (the publishing vendor; no PlanetScale-specific machinery is disclosed).
  • Concepts: MySQL transaction isolation levels (the four-level taxonomy with their violation-permission matrix, setting mechanisms, and performance-vs-consistency trade-offs), ACID properties (Atomicity, Consistency, Isolation, Durability — the Reuter & Härder 1983 framing MySQL's transaction semantics fulfil), dirty read (the violation class where one transaction reads another's uncommitted writes), non-repeatable read (same SELECT returns different row values within one transaction), phantom read (same SELECT returns a different number of rows within one transaction due to concurrent inserts), shared lock vs exclusive lock (InnoDB's two foundational row-lock kinds — S is reader-compatible, X is exclusive), gap locking (InnoDB's mechanism for preventing phantom reads under REPEATABLE READ on locking reads — locks the space around matching rows, not just the rows themselves), locking read (the developer- facing FOR SHARE / FOR UPDATE clause that upgrades a plain SELECT into a shared / exclusive lock).
  • Patterns: None. The post is pedagogical reference material; the substrate it canonicalises is engine-level mechanism, not reusable cross-system architecture.

Operational numbers

  • 4 isolation levels: READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ (MySQL default), SERIALIZABLE.
  • 3 read violations: dirty read, non-repeatable read, phantom read.
  • 3 lock types in InnoDB named by the post: shared (S), exclusive (X), gap lock. (The full InnoDB lock taxonomy also includes next-key locks = record + gap, insert-intention locks, and table-level metadata locks, but the post stays at the three-type pedagogical altitude.)
  • 2 locking-read clauses: FOR SHARE (shared lock), FOR UPDATE (exclusive lock).
  • 3 isolation-setting mechanisms: SET [SESSION] TRANSACTION ISOLATION LEVEL … (next-transaction scope), SET GLOBAL TRANSACTION ISOLATION LEVEL … (future-sessions scope, requires CONNECTION_ADMIN), --transaction-isolation=… CLI flag or [mysqld] transaction-isolation = … config-file directive (server-startup default).
  • Default innodb_lock_wait_timeout: not in this post but referenced on row-level lock contention: 50 seconds before Lock wait timeout exceeded fails outright.

Caveats

  • Pedagogy voice, not production retrospective — Brian Morrison II's post is a syllabus-style teaching piece using a toy CD-store schema (products, unit_status, customer_transactions, customers) with 2-unit worked examples. No production numbers, no incident narrative, no PlanetScale-specific isolation-level defaults or customisations disclosed.
  • SQL-standard vs MySQL-specific phantom-read protection elided. The post states REPEATABLE READ prevents phantom reads in MySQL, which is true but non-standard: the SQL standard's REPEATABLE READ permits phantom reads (see Berenson et al. 1995, "A critique of ANSI SQL isolation levels"). Only MySQL's InnoDB-specific gap-locking strengthening makes MySQL's REPEATABLE READ prevent them. Databases like PostgreSQL implement REPEATABLE READ as formal snapshot isolation, which also prevents phantoms but via MVCC rather than gap locks — same outcome, different mechanism. The post doesn't flag this engine-divergence.
  • SERIALIZABLE implementation not deep-dived. Asserts SERIALIZABLE is REPEATABLE READ + implicit shared locks, but doesn't walk through the actual SSI-vs-S2PL tradeoffs other engines make (PostgreSQL's SERIALIZABLE is SSI with write-skew detection; SQL Server's is based on S2PL; MySQL's InnoDB is S2PL with next-key locks). Consequence: the deadlock risk warning is correct in shape but the specific mechanism differs across engines.
  • Write-skew anomaly not mentioned. A transaction anomaly that's permitted by snapshot isolation / REPEATABLE READ but forbidden by SERIALIZABLE, where two concurrent transactions each read-then-write disjoint rows based on an invariant that holds for neither committed state. The post's three-violation framing (dirty / non-repeatable / phantom) is incomplete without write-skew; users moving from SERIALIZABLE to REPEATABLE READ for performance may silently introduce write-skew bugs.
  • Lost update anomaly not mentioned. Similar to write-skew: two transactions each read a row, modify, and write it; one transaction's write silently overwrites the other's. Forbidden by SERIALIZABLE (and by REPEATABLE READ under locking reads); permitted by READ COMMITTED and READ UNCOMMITTED without care.
  • Pedagogy-mode concurrency examples are 2-row schemas. Real contention on hot rows (the row-level-lock-contention / hot-row problem case) is not demonstrated. See the slotted counter pattern for the production incident framing.
  • MVCC mechanism hand-waved. Says InnoDB uses MVCC without walking through row versioning, rollback-segment storage, or the undo log. The InnoDB system page covers this substrate; the article treats it as a black box.
  • No mention of innodb_lock_wait_timeout or LOCK WAIT / deadlock-detection machinery. SERIALIZABLE's "greater risk of deadlocks" is flagged but the operator-visible diagnostics (SHOW ENGINE INNODB STATUS, LOCK WAIT, auto- rollback of deadlocks) are absent.
  • No cross-shard / cross-replica implications. In Vitess-sharded environments, isolation levels are per-tablet (i.e., per-shard-of-a-single-MySQL- instance); cross-shard transactions have their own consistency story ( Consistent Lookup Vindex + ordered commit without 2PC) that doesn't compose cleanly with the single-node isolation taxonomy.
  • Phantom-read examples mix unit_status + products tables. The phantom-read worked example claims "the summation of the quantity available would be different between the two select statements" because new rows were inserted into unit_status, but the example shows the insert on products — a minor inconsistency in the post's schema mapping.
  • MySQL 8 utf8mb4_0900_ai_ci / charset-collation dimension elided. The string-comparison isolation knob (collation) is orthogonal to transaction isolation but worth a pointer; not given.

Source

Last updated · 378 distilled / 1,213 read