CONCEPT Cited by 1 source
Lock-state self-synchronizing update¶
A self-synchronizing atomic update is the optimisation of clearing specific bits of a packed state word not by load-modify-CAS but by atomically adding a specific delta such that — if the word is in the expected state — the add zeroes the target bits in one instruction.
The trick¶
Given a word W that you expect to have bits S set, you want
to clear the bits M ⊆ S. The straightforward implementation:
— needs a CAS retry loop. The self-synchronizing version is:
which is equivalent to W & ~M if W == prev_value. The
fetch_add is a single atomic instruction on most architectures;
no retry loop; no CAS failure branch.
Why it works¶
If prev_value has exactly M set (and nothing else),
prev_value.wrapping_sub(M) = 0 and fetch_add(0) is a no-op
→ wrong. If prev_value has M set (and possibly other bits):
prev_value.wrapping_sub(M) is the two's-complement inverse of
M minus the other bits; adding it to prev_value clears M
and leaves the rest. Because two's-complement arithmetic wraps,
the same add clears M for any value of W that equals
prev_value.
The invariant is that W is exactly prev_value at the
moment of the add. If it's anything else, you've added a very
large value to the word.
Why it's used¶
Single-instruction atomics (e.g. LOCK XADD on x86) are
substantially faster than CAS-retry loops under contention.
parking_lot's RwLock uses this
trick extensively in its wake-up paths.
Why it's dangerous¶
From Fly.io's 2025-05-28 post:
"This pattern is self-synchronizing, but it relies on an invariant: you'd better be right about the original state of the word you're altering. Because if you're wrong, you're adding a very large value to an uncontrolled value." (Source: sources/2025-05-28-flyio-parking-lot-ffffffffffffffff)
The bug class it enables is bitwise double-free — two code paths each think they're the one clearing a bit; the first's clear succeeds; the second's add doesn't cancel because the state has already advanced; the word overflows.
When to use¶
- When profiling shows a CAS-retry loop is the contention hot spot and the number of distinct state transitions is small enough to enumerate.
- When every code path that clears a given bit provably agrees on which transition it's implementing (validated by formal methods, code review, or rigorous test coverage of the happy paths and race windows).
When not to¶
Any time the "who clears this bit" answer is distributed
across multiple wake-up paths that can execute in either
order — e.g.
parking_lot's pre-PR-#466
try_write_for timeout path + reader-release-unpark path
both claiming to clear WRITER_PARKED.
Seen in¶
- sources/2025-05-28-flyio-parking-lot-ffffffffffffffff —
Canonical wiki instance. The optimisation is
parking_lot's, not Fly.io's; Fly.io's contribution is the bug report + the PR that restructures the wake-up paths so the invariant holds.
Related¶
- systems/parking-lot-rust — The library that canonicalised the optimisation.
- concepts/bitwise-double-free — The bug class the optimisation enables when the invariant is violated.
- patterns/upstream-the-fix — How
parking_lotPR #466 restructured the wake-up paths without abandoning the optimisation. - companies/flyio — Fly.io.