CONCEPT

Check-then-act race¶

A check-then-act race is the concurrency hazard that arises when a single logical operation is implemented as two or more non-atomic steps (check whether some condition holds, then act on the result of that check) with no lock or other serialisation primitive held across both steps. Two concurrent executors can both complete the check before either has completed the act: both see the same pre-action state, both proceed to act, and one or both of their acts conflict when they are applied. Also known as TOCTOU (time-of-check to time-of-use) in the Unix security literature.

Canonical shape¶

Thread A                  Thread B
--------                  --------
SELECT WHERE key=k  →     SELECT WHERE key=k
  (empty)                   (empty)
INSERT (k, ...)     →
                          INSERT (k, ...)
                          ← DUPLICATE KEY ERROR

The pattern generalises: check can be a SELECT, a filesystem stat, an ACL lookup, a lock-file existence check, a cache-probe; act can be an INSERT, a filesystem rename, a privileged syscall, a cache-fill. What makes it a race is that neither operation is serialised against concurrent mutations between the two steps.

Canonical production instance¶

PlanetScale's Rafer Hazen disclosed a genuine production check-then-act race in the Insights error-tracking launch post. The DatabaseBranchPassword Rails model carries a Rails-idiomatic uniqueness validation:

validates :display_name, uniqueness: { scope: [:database_branch_id] }

ActiveRecord's uniqueness validator is implemented as a SELECT against the target column(s), followed (if the SELECT returned nothing) by the INSERT that persists the row. No row-level lock is held on the not-yet-existing key between the SELECT and the INSERT. When two requests to the DatabaseBranchPasswords#create controller action arrived "at nearly the same time", both SELECTs found the display_name free, both requests proceeded to INSERT, and MySQL's unique index ["database_branch_id", "display_name"] rejected one of the two INSERTs with AlreadyExists.

Error volume was "a few 10s to a few hundred per day" and "occurrences of these errors come in small batches with nearly identical timestamps". That sub-second temporal clustering was the first-order signal that the bug was one of concurrency, not volume.
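The clustering signal can be checked mechanically. The following is a minimal sketch, not anything taken from the post: the `bursts` helper, the one-second gap threshold, and the sample timestamps are all hypothetical, chosen only to show what "small batches with nearly identical timestamps" looks like once grouped.

```ruby
# Group event timestamps (seconds, as Floats) into bursts. A new burst starts
# whenever the gap to the previous event exceeds max_gap.
# Both the helper name and the 1.0 s default are illustrative assumptions.
def bursts(timestamps, max_gap: 1.0)
  timestamps.sort.slice_when { |earlier, later| later - earlier > max_gap }.to_a
end

# Hypothetical error timestamps: two tight batches and one isolated event.
sample = [100.00, 100.02, 100.05, 4000.10, 4000.11, 9000.00]

p bursts(sample).map(&:size)  # => [3, 2, 1]
```

Bursts of size two or more that sit far apart from each other point towards racing requests rather than a steady stream of independent failures.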

Why advisory validation can't solve this¶

The application-layer uniqueness check is structurally advisory: it returns a friendly error on the common uncontested path, but it cannot guarantee uniqueness under concurrency because nothing in the SELECT + INSERT sequence holds a lock on the nonexistent key. The database-layer unique index is authoritative: the storage engine serialises conflicting writes to the same index entry, so at most one of the two racing INSERTs can succeed.

This is why, as PlanetScale's post puts it, "the database, as the final arbiter of uniqueness, would then only allow one of these queries to succeed and the other would receive the error." Application-layer uniqueness validation is a UX affordance layered on top of the database's authoritative guarantee, not a replacement for it. See patterns/database-as-final-arbiter-of-uniqueness for the architectural corollary.

Why dev-environment tests miss it¶

The race only manifests under concurrent request arrival. Single-threaded dev-environment tests run check → act → check → act strictly in sequence, so the second check observes the row written by the first act and the ActiveRecord validation catches the duplicate. Exposing the race in a test requires explicit concurrency injection: run two threads that both call create, or use a database-level tool that delays one transaction's commit while another completes. The post describes exactly this experience: "we verified in a development environment that the uniqueness validation seemed to be working correctly: when we try to create two identical DatabaseBranchPassword rows, we get an ActiveRecord error showing that the name has already been taken."

The dev-environment success is not a validation bug — it is a coverage gap. Unit / integration tests without explicit concurrency injection cannot reproduce any check-then-act race, regardless of how well the single-threaded behaviour is tested.
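The difference between the sequential test and a concurrency-injected one can be sketched in plain Ruby. This is an illustration under stated assumptions, not the Rails code path: `UniqueTable` is a hypothetical in-memory stand-in for a table with a unique index, `create` mimics a validate-then-insert flow, and the `after_check` hook is an invented test seam used to hold both threads between their check and their act.

```ruby
require "set"

# Hypothetical in-memory stand-in for a table with a unique index.
class UniqueTable
  class DuplicateKey < StandardError; end

  def initialize
    @keys = Set.new
    @lock = Mutex.new
  end

  # "Check": a plain read. Nothing stays locked after it returns.
  def exists?(key)
    @lock.synchronize { @keys.include?(key) }
  end

  # "Act": the index is atomic and rejects a second insert of the same key.
  def insert(key)
    @lock.synchronize do
      raise DuplicateKey, key.to_s if @keys.include?(key)
      @keys.add(key)
    end
  end
end

# Validation-style create: check, then act, with an optional hook in between.
def create(table, key, after_check: -> {})
  return :name_taken if table.exists?(key)
  after_check.call
  table.insert(key)
  :created
rescue UniqueTable::DuplicateKey
  :duplicate_key_error
end

# What a single-threaded test sees: the second check observes the first insert.
def sequential_outcomes
  table = UniqueTable.new
  [create(table, "k"), create(table, "k")]
end

# Concurrency injection: park both threads after their check, then release them.
def interleaved_outcomes
  table = UniqueTable.new
  checked = Queue.new
  release = Queue.new
  pause = -> { checked << true; release.pop }
  threads = 2.times.map { Thread.new { create(table, "k", after_check: pause) } }
  2.times { checked.pop }      # both checks have found the key free
  2.times { release << true }  # now let both inserts proceed
  threads.map(&:value).sort
end

p sequential_outcomes   # => [:created, :name_taken]
p interleaved_outcomes  # => [:created, :duplicate_key_error]
```

The queues force the interleaving deterministically, so the sketch does not depend on thread scheduling. Against a real database the same idea needs two connections and a way to hold one of them between its SELECT and its INSERT.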

Resolution options¶

Once the race is understood, the fix is one of:

1. Serialise the caller: if the racing requests come from a single machine-client (an internal tool, a cron job, a batch processor), make the caller issue requests sequentially. This removes the race at the source but does nothing for a multi-machine caller. PlanetScale's chosen fix: "an internal tool was issuing multiple password create requests in parallel. Our solution was simply to modify the script to avoid that behavior."
2. Hold a lock between check and act: wrap the operation in a transaction with SELECT ... FOR UPDATE on a sentinel row keyed on the uniqueness scope (e.g. the parent database_branch_id), or on a dedicated lock table. This serialises concurrent access at the DB layer but adds lock-wait latency and requires the sentinel row to exist (a new-scope create still has the race unless there's a higher-level sentinel).
3. Use an atomic conditional-insert primitive: INSERT ... ON DUPLICATE KEY UPDATE (MySQL), INSERT ... ON CONFLICT DO NOTHING (Postgres), INSERT ... SELECT ... WHERE NOT EXISTS ..., or rescue the database's unique-violation error (ActiveRecord::RecordNotUnique) and return the friendly error the validation would have returned. This treats the database error as the authoritative signal, with validation remaining as an advisory fast path for the common uncontested case.
4. Idempotency token on the request: see concepts/idempotency-token. If the operation is re-playable and the client can generate a stable idempotency key, the server can deduplicate racing replays using a dedicated idempotency table.

The correct load-bearing fix is (2), (3), or (4); option (1) is pragmatic but fragile to future callers that re-introduce parallelism.
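The shape of the lock-based and rescue-based fixes can be sketched in plain Ruby. This is an in-memory analogue under stated assumptions, not database code: `UniqueTable`, `SCOPE_LOCK`, `locked_create`, `rescuing_create`, and `race` are hypothetical names. `SCOPE_LOCK` plays the role of the sentinel-row lock, and `UniqueTable::DuplicateKey` plays the role that ActiveRecord::RecordNotUnique would play in a Rails application.

```ruby
require "set"

# Hypothetical in-memory stand-in for a table with a unique index.
class UniqueTable
  class DuplicateKey < StandardError; end

  def initialize
    @keys = Set.new
    @lock = Mutex.new
  end

  def exists?(key)
    @lock.synchronize { @keys.include?(key) }
  end

  def insert(key)
    @lock.synchronize do
      raise DuplicateKey, key.to_s if @keys.include?(key)
      @keys.add(key)
    end
  end
end

# Lock-based analogue: one lock spans both the check and the act,
# so no other caller can run its check in between.
SCOPE_LOCK = Mutex.new

def locked_create(table, key)
  SCOPE_LOCK.synchronize do
    return :name_taken if table.exists?(key)
    table.insert(key)
    :created
  end
end

# Rescue-based analogue: keep the advisory check as a fast path, but treat
# the index error as authoritative and map it to the same friendly result.
def rescuing_create(table, key)
  return :name_taken if table.exists?(key)
  Thread.pass  # invite a context switch inside the race window
  table.insert(key)
  :created
rescue UniqueTable::DuplicateKey
  :name_taken
end

# Run the given operation from n threads against one fresh table.
def race(n, &operation)
  table = UniqueTable.new
  n.times.map { Thread.new { operation.call(table) } }.map(&:value)
end

p race(8) { |table| locked_create(table, "k") }.tally
p race(8) { |table| rescuing_create(table, "k") }.tally
```

Whatever the thread scheduling, each tally contains exactly one `:created` and seven `:name_taken`. The two approaches differ in cost rather than outcome: the lock serialises every caller in the scope, while the rescue path only pays when a collision actually happens.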

concepts/row-level-lock-contention: check-then-act is a race (two operations race to commit); row-level lock contention is serialisation (two operations take turns). The former produces uniqueness violations or lost-update anomalies; the latter produces tail latency.
concepts/deadlock-vs-lock-contention: deadlock is a circular wait on locks; check-then-act holds no locks to deadlock on. Introducing a lock to fix a check-then-act race can introduce deadlock if the lock order is wrong.
Lost-update race: conceptually adjacent. Two readers read, both write back derived values, and one write clobbers the other. Same check → act shape, different act: write vs insert.
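The lost-update variant can be sketched the same way. This is an illustration only: `Account`, `deposit`, and the `after_read` hook are hypothetical, and the queues exist purely to force both reads to happen before either write.

```ruby
# Hypothetical shared record. Each read and each write is atomic on its own,
# but nothing ties a read to the write that follows it.
class Account
  def initialize(balance)
    @balance = balance
    @lock = Mutex.new
  end

  def read
    @lock.synchronize { @balance }
  end

  def write(value)
    @lock.synchronize { @balance = value }
  end
end

# Read-modify-write with a hook between the read and the write.
def deposit(account, amount, after_read: -> {})
  seen = account.read
  after_read.call
  account.write(seen + amount)
end

# Two deposits of 10 against a balance of 100, with both reads forced first.
def lost_update_balance
  account = Account.new(100)
  read_done = Queue.new
  release = Queue.new
  pause = -> { read_done << true; release.pop }
  threads = 2.times.map { Thread.new { deposit(account, 10, after_read: pause) } }
  2.times { read_done.pop }    # both threads have read 100
  2.times { release << true }  # both now write 110
  threads.each(&:join)
  account.read
end

p lost_update_balance  # => 110
```

Unlike the insert case, no unique index catches this: both writes succeed, and one deposit is silently lost.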

Seen in¶

PlanetScale's Insights error-tracking launch post: canonical disclosure. A Rails uniqueness validation and a MySQL unique index on the database_branch_password table, racing on concurrent DatabaseBranchPasswords#create calls from a single internal tool. "Two separate application threads could both query the database at the same time, before either thread had created the record, and then proceed as if there was no issue."