CONCEPT

Phantom read¶

A phantom read is a read-anomaly class that occurs when a transaction issues the same SELECT twice and gets a different number of rows between the two reads — because another transaction inserted new rows matching the WHERE predicate in between. Unlike non-repeatable reads (where existing rows' values change), phantom reads are about new rows appearing in a range previously believed to be stable.

Canonical worked example¶

From PlanetScale's Brian Morrison II (Source: ):

A store owner wants to count products. Their transaction runs two identical SELECT COUNT(*):

START TRANSACTION;

-- Returns 100
SELECT COUNT(*) FROM products;

-- Now returns 101
SELECT COUNT(*) FROM products;

COMMIT;

Between the two SELECTs, a concurrent transaction inserts a new product:

START TRANSACTION;
INSERT INTO products (album, artist, release_year, cost)
  VALUES ('The Battle Of Los Angeles', 'Rage Against The Machine', 1999, 1500);
COMMIT;

The owner's two counts disagree not because any existing row changed, but because a new row appeared — a phantom.

Which isolation levels permit it¶

Level	Phantom read
`READ UNCOMMITTED`	allowed
`READ COMMITTED`	allowed
`REPEATABLE READ`	prevented*
`SERIALIZABLE`	prevented

*MySQL-specific: Phantom-read prevention at REPEATABLE READ level is a MySQL-specific strengthening of the SQL standard. Per Berenson et al. 1995 ("A critique of ANSI SQL isolation levels"), the SQL-standard REPEATABLE READ permits phantom reads; only SERIALIZABLE formally forbids them. InnoDB closes the gap at REPEATABLE READ via gap locking on locking reads.

Mechanism: InnoDB's gap-locking¶

Phantom reads are tricky to prevent because they concern rows that don't yet exist at the time of the first read. Locking the rows the reader observed doesn't help — the phantom is a row not-yet-inserted into the space between observed rows.

InnoDB's solution is gap locking: when a transaction does a locking read with SELECT … FOR UPDATE (or SELECT … FOR SHARE) under REPEATABLE READ, InnoDB locks not only the matching rows but also the gaps around them, specifically determined by the WHERE predicate's range. An INSERT into a locked gap is blocked until the locking transaction commits.

Canonical demonstration from the article:

-- Transaction A
START TRANSACTION;
SELECT * FROM products WHERE release_year = 1999 FOR UPDATE;
UPDATE products SET cost = 800 WHERE release_year = 1999;
COMMIT;

-- Transaction B, concurrent
INSERT INTO products (album, artist, release_year, cost)
  VALUES ('The Battle Of Los Angeles', 'Rage Against The Machine', 1999, 1500);

With gap locking in effect, Transaction B's INSERT is blocked until Transaction A commits — because the new row's release_year = 1999 would fall into the gap Transaction A has locked. Verbatim from the post: "MySQL would make that transaction wait until the previous one had completed before proceeding."

Non-locking reads under `REPEATABLE READ`¶

Importantly, a plain SELECT (no FOR UPDATE / FOR SHARE) under REPEATABLE READ also doesn't observe phantoms — but for a different reason: the transaction is pinned to a snapshot taken at first read (see isolation levels), and any row inserted after that snapshot-time is simply invisible to reads in the pinned transaction, even after the inserter commits.

So there are two mechanisms at play under MySQL's REPEATABLE READ:

Plain SELECT: MVCC snapshot hides new rows from the reader's view.
Locking SELECT … FOR UPDATE/SHARE: gap locks block concurrent inserters from creating phantoms.

Both produce phantom-free reads; they differ in whether concurrent writers are blocked (locking reads) or allowed to proceed (non-locking).

When phantom reads matter¶

Anywhere a transaction depends on the absence of rows:

Uniqueness checks done as check-then-insert — "is this username taken? no → insert it" — classic race condition permitting phantom reads, fixed by unique indexes (database-level enforcement).
Range-based pagination with stable counts — "how many comments total on this post?" expected constant between fetches within one transaction.
Range constraints ("does this booking overlap any existing booking in the range?") — check-then-insert bookings can silently double-book under phantom-permitting isolation.
Reports that aggregate subsets — "sum of orders placed between dates X and Y" must include the same set of orders across all SELECTs in the report-generation transaction.

Dirty read — observing uncommitted data. Phantom reads are both reads of committed data.
Non-repeatable read — existing row's value changed. Phantom reads are about row existence (specifically, new rows appearing).
Write-skew — two transactions read-then-write disjoint rows based on a shared invariant that breaks after both commit. Not a phantom read; not in the classical three- violation framing; permitted by snapshot isolation.
Snapshot isolation (Postgres-style REPEATABLE READ) prevents phantom reads via MVCC (snapshot hides new rows). Same outcome as MySQL's gap-locking approach for reads; different mechanism.

Seen in¶

— Brian Morrison II's PlanetScale pedagogical post canonicalises phantom reads as the third of the three violations, with the CD-store product-count worked example and the InnoDB-specific gap-locking mechanism as the prevention path on locking reads under REPEATABLE READ. Verbatim framing: "Phantom reads occur when the actual rows returned between select statements in a transaction differ because another transaction has inserted rows before the first can complete." The MySQL-specific strengthening of REPEATABLE READ to prevent phantoms (vs the SQL- standard version that permits them) is canonicalised via the FOR UPDATE + concurrent-insert worked example.