CONCEPT Cited by 1 source

Lock-state self-synchronizing update¶

A self-synchronizing atomic update is the optimisation of clearing specific bits of a packed state word not by load-modify-CAS but by atomically adding a specific delta such that — if the word is in the expected state — the add zeroes the target bits in one instruction.

The trick¶

Given a word W that you expect to have bits S set, you want to clear the bits M ⊆ S. The straightforward implementation:

loop {
    let old = W.load();
    let new = old & ~M;
    if W.compare_and_swap(old, new).is_ok() { break; }
}

— needs a CAS retry loop. The self-synchronizing version is:

let state = W.fetch_add(prev_value.wrapping_sub(M), Ordering::Relaxed);

which is equivalent to W & ~M if W == prev_value. The fetch_add is a single atomic instruction on most architectures; no retry loop; no CAS failure branch.

Why it works¶

If prev_value has exactly M set (and nothing else), prev_value.wrapping_sub(M) = 0 and fetch_add(0) is a no-op → wrong. If prev_value has M set (and possibly other bits): prev_value.wrapping_sub(M) is the two's-complement inverse of M minus the other bits; adding it to prev_value clears M and leaves the rest. Because two's-complement arithmetic wraps, the same add clears M for any value of W that equals prev_value.

The invariant is that W is exactly prev_value at the moment of the add. If it's anything else, you've added a very large value to the word.

Why it's used¶

Single-instruction atomics (e.g. LOCK XADD on x86) are substantially faster than CAS-retry loops under contention. parking_lot's RwLock uses this trick extensively in its wake-up paths.

Why it's dangerous¶

From Fly.io's 2025-05-28 post:

"This pattern is self-synchronizing, but it relies on an invariant: you'd better be right about the original state of the word you're altering. Because if you're wrong, you're adding a very large value to an uncontrolled value." (Source: sources/2025-05-28-flyio-parking-lot-ffffffffffffffff)

The bug class it enables is bitwise double-free — two code paths each think they're the one clearing a bit; the first's clear succeeds; the second's add doesn't cancel because the state has already advanced; the word overflows.

When to use¶

When profiling shows a CAS-retry loop is the contention hot spot and the number of distinct state transitions is small enough to enumerate.
When every code path that clears a given bit provably agrees on which transition it's implementing (validated by formal methods, code review, or rigorous test coverage of the happy paths and race windows).

When not to¶

Any time the "who clears this bit" answer is distributed across multiple wake-up paths that can execute in either order — e.g. parking_lot's pre-PR-#466 try_write_for timeout path + reader-release-unpark path both claiming to clear WRITER_PARKED.

Seen in¶

sources/2025-05-28-flyio-parking-lot-ffffffffffffffff — Canonical wiki instance. The optimisation is parking_lot's, not Fly.io's; Fly.io's contribution is the bug report + the PR that restructures the wake-up paths so the invariant holds.

systems/parking-lot-rust — The library that canonicalised the optimisation.
concepts/bitwise-double-free — The bug class the optimisation enables when the invariant is violated.
patterns/upstream-the-fix — How parking_lot PR #466 restructured the wake-up paths without abandoning the optimisation.
companies/flyio — Fly.io.