Skip to content

PLANETSCALE 2026-02-22 Tier 3

Read original ↗

PlanetScale — What is a database transaction?

Summary

Ben Dicken's illustrated explainer defines a database transaction as a sequence of SQL statements executed as a single atomic unit, bracketed by begin; and commit; (or rollback;) in MySQL and Postgres. The article walks through transactions from first principles — the commit/rollback semantics, the need for transaction-level isolation between concurrent sessions, and the two radically different MVCC implementations that MySQL and Postgres use to deliver REPEATABLE READ behaviour: Postgres keeps multiple row versions (stamped with xmin/xmax transaction IDs, reclaimed later by VACUUM FULL) while MySQL overwrites in place and reconstructs old versions from an undo log. It then covers the four standard SQL isolation levels (Serializable → Repeatable Read → Read Committed → Read Uncommitted), framed by which of dirty reads, non-repeatable reads, and phantom reads each level forbids. The article closes on write-write conflict resolution, again showing two divergent approaches: MySQL uses exclusive row locks (and kills a transaction on deadlock), while Postgres uses predicate locks plus optimistic conflict detection (no blocking, but kills transactions that violate SERIALIZABLE guarantees at commit time). The piece is pedagogical — explicitly pitched at newcomers — but serves as a first-party PlanetScale canon mapping the MySQL↔Postgres MVCC/locking split that PlanetScale must support in both of its managed products.

Key takeaways

  • A transaction is an all-or-nothing sequence of SQL statements bracketed by begin and commit. Trillions execute across SQL databases daily. commit atomically applies all changes the transaction made; rollback discards them as if the transaction never began (Source: sources/2026-02-22-planetscale-what-is-a-database-transaction).
  • Databases handle unexpected rollback causes (hard-drive failure, power outage) via disaster-recovery mechanisms — Postgres specifically via the write-ahead log (WAL) — while rollback covers intentional undo when midway a transaction encounters missing data or a client cancellation (Source: sources/2026-02-22-planetscale-what-is-a-database-transaction).
  • Before commit, a transaction's changes are isolated from every other session. Session B running a SELECT on the same row does not see Session A's update from benjoe until Session A commits. If Session A rolls back, Session B never sees the change (Source: sources/2026-02-22-planetscale-what-is-a-database-transaction).
  • Both MySQL and Postgres support consistent reads at REPEATABLE READ and stricter modes — but via completely different mechanisms. Postgres keeps multiple versions of each row (multi-versioning); MySQL overwrites and reconstructs old versions from an undo log (Source: sources/2026-02-22-planetscale-what-is-a-database-transaction).
  • Postgres tags every row version with xmin (creating transaction ID) and xmax (superseding transaction ID). When transaction ID 10 updates user 4 from "liz" to "aly", the old row's xmax is set to 10 and a new row with xmin=10 is inserted. Which version each transaction sees is a function of its own ID vs. the xmin/xmax values (Source: sources/2026-02-22-planetscale-what-is-a-database-transaction).
  • VACUUM FULL is Postgres's compaction mechanism for duplicated row versions. Over time, updates accumulate row versions no running transaction can still see; VACUUM FULL purges them and compacts the table, reclaiming unused space (Source: sources/2026-02-22-planetscale-what-is-a-database-transaction).
  • MySQL rows carry two metadata columns: xid (ID of the transaction that most-recently updated the row) and ptr (reference to the most recent undo-log modification). When transaction A clobbers a version transaction B needs, B reconstructs the previous value(s) from the undo log; multiple undo-log records can exist for the same row simultaneously, and MySQL selects the correct version by transaction identifier (Source: sources/2026-02-22-planetscale-what-is-a-database-transaction).
  • SQL's four isolation levels, strongest to weakest: Serializable, Repeatable Read, Read Committed, Read Uncommitted. Each is characterised by which undesirable phenomena it forbids vs. allows. Stronger isolation = more correctness, worse performance in some scenarios (Source: sources/2026-02-22-planetscale-what-is-a-database-transaction).
  • The three classic read anomalies: A phantom read is when a repeated SELECT returns a different row-set because another transaction inserted matching rows. A non-repeatable read is when a re-read finds changes by another committed transaction — dangerous because assumptions were already made. A dirty read is when a transaction can see uncommitted data from another simultaneous transaction (Source: sources/2026-02-22-planetscale-what-is-a-database-transaction).
  • Postgres REPEATABLE READ forbids phantom reads even though the SQL standard permits them at that level — an implementation strictness above-spec (Source: sources/2026-02-22-planetscale-what-is-a-database-transaction).
  • Two kinds of row locks matter: shared (S) and exclusive (X). Multiple S locks can coexist on a row (multiple concurrent readers). Only one X lock can be held at a time, and no other lock of any kind can coexist with it (needed for writes) (Source: sources/2026-02-22-planetscale-what-is-a-database-transaction).
  • Write-conflict resolution diverges sharply between MySQL and Postgres at SERIALIZABLE. MySQL always takes X locks on update — deadlock is possible, and MySQL detects it and kills one transaction. Postgres uses predicate locks (non-blocking markers on the row-sets a transaction touches, e.g. WHERE id BETWEEN 10 AND 20) plus optimistic conflict detection — it never blocks waiting for a lock, but kills a transaction at commit time if a concurrent transaction violated SERIALIZABLE guarantees (Source: sources/2026-02-22-planetscale-what-is-a-database-transaction).
  • Both approaches require application-side retry logic — in MySQL-SERIALIZABLE mode because of deadlock kills, in Postgres-SERIALIZABLE mode because of optimistic-commit aborts. Important transactions must have retry built in (Source: sources/2026-02-22-planetscale-what-is-a-database-transaction).

Systems surfaced

  • MySQL — exemplar of undo-log-based MVCC and lock-based (pessimistic) write-conflict resolution, with deadlock detection as the escape hatch.
  • InnoDB — the MySQL storage engine where xid/ptr row metadata, shared/exclusive row locks, and the undo log actually live.
  • PostgreSQL — exemplar of multi-version MVCC with xmin/xmax tuple metadata and predicate-lock-based optimistic conflict resolution. VACUUM FULL is the compaction mechanism for accumulated row versions. Postgres's REPEATABLE READ famously forbids phantom reads against SQL-standard spec.

Concepts surfaced

Contradictions / caveats

  • The article hand-waves Serializable as accomplished by "complex locking and waiting" — true for MySQL, but as the later Postgres section reveals, Postgres's SERIALIZABLE (SSI) does not block and does not wait at all. These are very different implementations under the same isolation name.
  • Phantom-read prevention at REPEATABLE READ in Postgres is noted as an implementation quirk — worth flagging when comparing anomaly matrices across engines. The SQL standard allows phantoms at this level.
  • The article is pedagogical and high-level; it does not quantify performance trade-offs across isolation levels, nor does it cover write skew (the classic SSI counterexample that distinguishes Snapshot Isolation from true Serializability — see concepts/snapshot-isolation).
  • No operational numbers (latencies, throughput, vacuum frequencies) are given; the piece is a conceptual introduction, not a tuning guide.

Source

Last updated · 517 distilled / 1,221 read