PATTERN Cited by 1 source
Lock-free leader election¶
Pattern¶
Resolve races between multiple electors by making the elector with the newest (highest) proposal number / term number the winner, regardless of which elector started first. Followers remember the highest proposal number they have seen for the revoke/establish protocol and reject any message with a lower number. The protocol has three moving parts:
- Pick a new proposal number strictly greater than anything observed.
- Send revoke-with-proposal-number to all potential candidates (not just the known leader).
- Send establish-with-proposal-number to the chosen candidate.
Lock-free is one of the two race-resolution families Sugu Sougoumarane names in Part 5; the other is lock-based ("first elector wins").
Canonical framing¶
Sugu's taxonomic move in Part 5:
"In the case of a lock-free approach, the newest elector must win over an older one. This requires the algorithm to assign a time-based order to the electors that are racing with each other. In Paxos, these are referred to as proposal numbers. In Raft, these are known as term numbers. To facilitate reasoning, we can view these numbers as timestamps."
"The core of the lock-free approach is that the followers that accept an establishment or revocation request must remember the timestamp of the elector that issued the request and should reject requests with older timestamps." (Source: sources/2026-04-21-planetscale-consensus-algorithms-at-scale-part-5-handling-races)
Proposal numbers and term numbers are unified into a single proposal number concept for the taxonomic argument.
Two convergence branches¶
Sugu's two-branch convergence analysis under a race between an old-timestamp elector and a new-timestamp elector:
- Old completes first → the new one revokes the old's leadership and establishes its own. "The elector with the older timestamp completes its election before the one with the newer timestamp. Following this, the one with the newer timestamp will end up revoking that leadership and establishing its own."
- New completes first → the old one's messages are rejected on arrival. "The elector with the newer timestamps completes first. Then the one with the older timestamp will fail at its attempt, and the leadership established by the newer timestamp will prevail."
Either way, the newer-timestamp elector's leadership prevails. This is the source of forward progress — no timeout needed.
The revoke-all-potential-candidates safety rule¶
Proposal numbers alone are not sufficient to close the race between a slow old elector and a new one. Sugu names the matching safety invariant:
"To cover the above scenarios, every elector must assume that there may be another elector with an older timestamp attempting a leadership change. It must therefore attempt to revoke leadership from all potential candidates, not just the current known leader. The completion of this process ensures that all possible leaderships (present and future) with an older timestamp are invalidated. This addresses the case where an old elector is slow at performing its actions. This also adds safety against clock skews: a new leader with an incorrect older timestamp will just fail at completing a leadership change, as if it was an older leader."
The rule absorbs two failure modes under a single invariant: - Slow predecessor — an old elector that hasn't finished yet. - Clock-skew backward — a "new" elector with a mistakenly-low timestamp.
Both fail harmlessly: the new elector revokes the old's potential leadership; the skewed-backward elector just fails its own election as if it were the older leader.
Primary advantage: free forward progress¶
Forward progress is intrinsic to the protocol:
"The main advantage of a lock-free approach is that it naturally supports forward progress. If an existing elector fails, a different elector can initiate a new round without knowledge of the state or age of an older elector. For this reason, there is no need to depend on timeouts."
A crashed elector never blocks a later one — a fresh elector just picks a higher proposal number and proceeds; followers reject the old one's stale messages on arrival. No time-component required; no clock assumptions required at the protocol layer.
Primary disadvantage: no stable leader → quorum reads¶
Sugu's structural cost argument:
"The disadvantage of a lock-free approach is that there is no certainty of a stable leader. This is because it is possible for a leadership to end between the time you discover it and give it a request. The absence of a stable leader makes consistent reads complicated. We essentially have to resort to quorum reads."
At scale, this is the dominant cost factor: thousands of reads per second per cluster × extra RTT per read → structural overhead that lock-based systems avoid via the leader lease. See patterns/lock-based-over-lock-free-at-scale for the full four-advantage trade-off that tips the verdict.
Canonical instances¶
- Paxos (classic / single-decree) — the canonical lock-free algorithm; proposal numbers are the race-resolution primitive. Used in narrow high-value domains (Cassandra's lightweight transactions, Zookeeper's internal leader-election inside a larger system).
- Raft — lock-free at the protocol layer (term numbers), but operationally used as a lock-based substrate because elections are rare and leaders are long-lived. See patterns/lock-based-leader-election for the argument that Raft is lock-based in substance.
- Paxos variants (Multi-Paxos, Fast Paxos) — same race-resolution family; same stable-leader issue, usually mitigated by layering a lease on top.
When lock-free is the right choice¶
- Very small clusters where lease management is more overhead than the protocol it saves.
- Genuinely one-shot decisions (DNS update distribution, SSL certificate rotation) with no stable-leader value to preserve.
- Environments without reliable clocks or external coordinators where the lock-based family's time-component is structurally unsafe.
- Correctness under arbitrary scheduling required with no elegance-compromise acceptable — Sugu names this honestly: "a lock-free approach (like what Paxos uses) has elegance from the fact that it does not have a time component."
Relationship to proposal numbers at the request layer¶
Sugu's Part 5 forward-reference, closed in Part 7:
"If you are using a lock-based approach, there is no need to use proposal numbers or leadership terms for the purpose of electing a leader. But we may still need to assign such a number to facilitate the propagation of requests. We will discuss this in a subsequent blog."
Proposal numbers are the election-layer mechanism in lock-free systems; they reappear in lock-based systems at the request-propagation layer as request versioning. Same mechanism, two different layers. The factorisation is the whole point of the revoke-and-establish split.
Seen in¶
- sources/2026-04-21-planetscale-consensus-algorithms-at-scale-part-5-handling-races — canonical wiki introduction of the lock-based vs lock-free taxonomy; unifies Paxos proposal numbers with Raft term numbers; names the revoke-all-potential-candidates safety rule; the no-stable-leader-at-scale argument that motivates patterns/lock-based-over-lock-free-at-scale.