PATTERN Cited by 1 source

Graceful propagation before demotion¶

Intent¶

On a planned leadership change in a lock-based consensus system, have the current leader first ensure that its outstanding requests have reached the new leader's required follower set, then demote itself — eliminating the propagation race entirely.

Motivation¶

The seven propagation failure modes Sugu Sougoumarane enumerates in Part 7 all arise from the failure-driven propagation regime — the elector fencing unreachable followers to indirectly revoke leadership, then discovering completed requests with incomplete knowledge. Planned changes don't have to run that gauntlet: the current leader is reachable by construction, so it can satisfy the durability invariant before the hand-off:

"For lock-based systems, and for planned changes, we have the opportunity to request the current leader to demote itself. In this situation, the current leader could ensure that its requests have reached all the necessary followers before demoting itself. Once this is done, the elector performs the leadership change and the system can resume." (Source: sources/2026-04-21-planetscale-consensus-algorithms-at-scale-part-7-propagating-requests)

Mechanism¶

1. The elector decides a planned leadership change is needed (software rollout, capacity rebalance, operator-driven).
2. The elector signals the current leader to enter a demotion-prep state.
3. The current leader waits for all outstanding requests to reach the new leader's durability threshold; that is, it waits until every in-flight request has been replicated to enough followers that the new leader will find it during its establishment scan.
4. The current leader signals ready-to-demote.
5. The elector runs the leadership change against a quiescent request pipeline.
6. The new leader takes over; because all previously completed requests were made durable under the old leader's watch, the new leader's durability-discovery pass finds them trivially.
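The steps above can be sketched as a small in-memory simulation. This is a minimal illustration under stated assumptions, not Vitess's implementation: `Follower`, `Leader`, `prepare_demotion`, and `planned_leadership_change` are hypothetical names, replication is modelled as a synchronous call, and the durability threshold is assumed to be a simple count of required followers holding the request.

```python
from dataclasses import dataclass, field


@dataclass
class Follower:
    name: str
    reachable: bool = True
    log: set = field(default_factory=set)  # ids of requests stored on this follower

    def append(self, request_id: int) -> bool:
        """Store a request; returns False if the follower cannot be reached."""
        if not self.reachable:
            return False
        self.log.add(request_id)  # idempotent: re-sending is harmless
        return True


@dataclass
class Leader:
    followers: list
    outstanding: set = field(default_factory=set)
    accepting: bool = True

    def submit(self, request_id: int, reach: int) -> None:
        """Accept a request but replicate it to only the first `reach` followers,
        which simulates replication lag at the moment the change is requested."""
        if not self.accepting:
            raise RuntimeError("demotion-prep: request must be buffered upstream")
        self.outstanding.add(request_id)
        for follower in self.followers[:reach]:
            follower.append(request_id)

    def prepare_demotion(self, required: list, threshold: int) -> bool:
        """Stop taking new work, finish propagation, report ready-to-demote."""
        self.accepting = False
        for request_id in self.outstanding:
            for follower in required:
                follower.append(request_id)
        # Ready only if every outstanding request meets the assumed threshold.
        return all(
            sum(request_id in follower.log for follower in required) >= threshold
            for request_id in self.outstanding
        )


def planned_leadership_change(leader: Leader, required: list, threshold: int) -> str:
    """Elector side: promote only after the leader reports ready-to-demote."""
    if not leader.prepare_demotion(required, threshold):
        return "fallback"  # hypothetical: hand over to the failure-driven path
    return "promoted"
```

In this sketch the elector never has to discover incomplete requests: by the time it acts, the leader has either confirmed that every request sits on enough of the new leader's followers or reported that it cannot, in which case the change falls back to the failure-driven path.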

Why this works¶

The failure-driven propagation pathway's complexity comes from the elector's incomplete knowledge: it may or may not find every incomplete request, and it cannot tell a durable request from a tentative one. Graceful propagation removes that ambiguity by having the authoritative actor (the current leader) close out the durability work before the hand-off. By step 4, every outstanding request is either: (a) fully durable on the new leader's follower set, or (b) explicitly cancelled by the current leader. No request can be left in the ambiguous discovered-but-incomplete state.
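The two-way split at step 4 can be expressed as a small check. This is a sketch under assumptions: `Status`, `classify`, and `ready_to_demote` are hypothetical names, durability is assumed to mean "acknowledged by at least `threshold` members of the new leader's required follower set", and an explicit cancellation by the leader is assumed to take precedence over any acknowledgements.

```python
from enum import Enum


class Status(Enum):
    DURABLE = "durable"
    CANCELLED = "cancelled"
    AMBIGUOUS = "ambiguous"  # the state graceful propagation rules out


def classify(acked_by: set, cancelled: bool, required: set, threshold: int) -> Status:
    """Classify one outstanding request from the current leader's point of view."""
    if cancelled:
        return Status.CANCELLED
    if len(acked_by & required) >= threshold:
        return Status.DURABLE
    return Status.AMBIGUOUS


def ready_to_demote(requests: dict, required: set, threshold: int) -> bool:
    """`requests` maps request id -> (set of acking followers, cancelled flag).
    The leader may signal ready-to-demote only when nothing is ambiguous."""
    return all(
        classify(acked_by, cancelled, required, threshold) is not Status.AMBIGUOUS
        for acked_by, cancelled in requests.values()
    )
```

A leader holding one fully replicated request and one cancelled request may signal ready; a request acknowledged by too few required followers keeps it in demotion-prep.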

Composition with prior pattern¶

Extends patterns/graceful-leader-demotion with the pre-demotion-propagation step. Graceful demotion (Part 4) was about draining the current leader's in-flight transactions through lameduck mode and query buffering at the proxy tier; graceful propagation (Part 7) is about ensuring the durability invariant is already met across the intersection of the old and new leaders' follower sets before the hand-off. Both are planned-only optimisations; both break down under unreachable-leader conditions (crashes, partitions).

Prerequisites¶

Current leader is reachable. If it has crashed or is partitioned, the system falls back to the failure-driven path with per-request versioning.
Lock-based election. Lock-free systems don't have a stable current leader to negotiate with; the newer-proposal-number elector can arrive at any moment and the graceful negotiation window doesn't exist as a concept.
Client or proxy-tier buffering for the demotion-prep window, so the application doesn't see request rejections during the quiescence step.
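The prerequisites above amount to a path-selection rule, sketched below. All names (`ClusterState`, `choose_path`, and the returned labels) are hypothetical, and the treatment of missing buffering is an assumption: the sketch treats it as degrading the graceful path (rejections become visible to the application) rather than disqualifying it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterState:
    leader_reachable: bool
    lock_based_election: bool
    proxy_buffering: bool


def choose_path(state: ClusterState, planned: bool) -> str:
    """Pick the leadership-change path from the prerequisites."""
    # Unplanned changes, unreachable leaders, and lock-free election all
    # lack a stable current leader to negotiate with.
    if not planned or not state.leader_reachable or not state.lock_based_election:
        return "failure-driven"
    # Assumption: without buffering the protocol still completes, but the
    # application may see rejections during the quiescence step.
    if not state.proxy_buffering:
        return "graceful-with-visible-rejections"
    return "graceful"
```

Keeping the decision in one place makes the fallback explicit: any planned change that loses reachability midway can be re-evaluated and routed to the failure-driven path.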

Canonical instance: Vitess `PlannedReparentShard`¶

Vitess's `PlannedReparentShard` (PRS) operation is the canonical production instance: lock-based (a lock held in the topology service, e.g. etcd) + current-leader-reachable + lameduck-drain + vtgate-level query buffering + propagation-completion invariant before hand-off. See patterns/graceful-leader-demotion for the Part-4 framing of the drain+buffer mechanism; this pattern adds the Part-7 propagation-completion invariant on top.

Trade-off vs failure-driven path¶

Graceful: zero propagation races, no application-visible errors as long as buffering covers the demotion-prep window, requires reachability + lock-based substrate.
Failure-driven: handles unreachable leaders, requires per-request versioning or anti-flapping to stay correct.

A production system runs both paths: graceful for planned changes (frequent, e.g. daily software rollouts), failure-driven for unplanned ones (rarer, e.g. monthly crashes). Sugu's Part-4 "optimise for the common case" framing applies here directly: planned changes dominate the leadership-change workload, so designing graceful-propagation-first pays off structurally.

Seen in¶

sources/2026-04-21-planetscale-consensus-algorithms-at-scale-part-7-propagating-requests — canonical wiki introduction of graceful propagation as the planned-change complement to the failure-driven propagation path; extends Part 4's graceful-demotion pattern with the propagation-completion invariant.