CONCEPT Cited by 1 source
Postgres failover slot
Definition
A Postgres failover slot is the Postgres 17 mechanism (documented as logical replication failover) that serialises logical replication slot state into the WAL so that standbys can mirror the slot and, if they become eligible, carry it across a primary promotion.
Before Postgres 17, slot state lived exclusively on the primary, where it is exposed through the pg_replication_slots view, and was unavailable to standbys. Promoting a standby meant the slot was gone, and any CDC subscriber using the slot had to re-snapshot from scratch on the new primary. Failover slots mirror the slot metadata to promotion candidates so that, in principle, the subscriber can resume from the same logical position after promotion.
The three eligibility conditions on a standby
Failover-readiness is determined by three conditions on the standby, quoted verbatim from the canonical source:
- "The slot is synchronized on the standby, synced = true."
- "The slot's position in the WAL is consistent with the position of the standby, not too far behind or too far ahead."
- "The slot is persistent and not invalidated, temporary = false AND invalidation_reason IS NULL."
The second condition is the load-bearing gate: a standby is ineligible until the subscriber has been observed advancing the slot at least once while that standby was following.
"A standby only becomes eligible to carry the slot after the subscriber has actually advanced the slot at least once while that standby is receiving the slot metadata. This guard exists to prevent promoting a node that has never observed real slot progress and would present an inconsistent stream to the subscriber." (Source: sources/2026-04-21-planetscale-postgres-high-availability-with-cdc)
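The column-level parts of these conditions can be checked mechanically. The following is a minimal sketch, assuming the rows come from pg_replication_slots on a Postgres 17 standby; the SlotRow type, the is_failover_ready helper, and the query string are illustrative names, not part of the source. The second condition has no column of its own in this sketch; per the quiet-period scenario described on this page, a slot that has not yet reached a consistent position is assumed to show up as temporary.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical query to run on the standby with any Postgres driver.
# It is shown for context only and is not executed by this sketch.
SLOT_QUERY = (
    "SELECT slot_name, synced, temporary, invalidation_reason "
    "FROM pg_replication_slots"
)


@dataclass(frozen=True)
class SlotRow:
    """One row of pg_replication_slots, reduced to the columns the conditions name."""
    slot_name: str
    synced: bool
    temporary: bool
    invalidation_reason: Optional[str]  # None models SQL NULL


def is_failover_ready(row: SlotRow) -> bool:
    """Return True when the column-level eligibility conditions all hold."""
    return row.synced and not row.temporary and row.invalidation_reason is None


# Example rows (made-up values for illustration).
ready = SlotRow("cdc_slot", synced=True, temporary=False, invalidation_reason=None)
quiet = SlotRow("cdc_slot", synced=True, temporary=True, invalidation_reason=None)
print(is_failover_ready(ready), is_failover_ready(quiet))
```

An operator script could run this predicate against every promotion candidate before a planned switchover and refuse to proceed while any candidate reports False.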
Why the eligibility gate exists
Failover slots trade HA flexibility for exactly-once CDC semantics. Without the gate, a freshly synchronised standby could be promoted and start advertising slot positions the subscriber never actually saw, which would surface as duplicate or missing events. The gate ensures that any standby carrying the slot after promotion has seen the subscriber move through it.
"This preserves exactly-once CDC semantics at the expense of HA flexibility." (Source: sources/2026-04-21-planetscale-postgres-high-availability-with-cdc)
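The gate's behaviour can be illustrated with a toy state machine. This is a simplified model under stated assumptions, not Postgres's implementation: the StandbyGate class and its observe method are hypothetical, and slot positions are modelled as plain integers rather than real LSNs. The standby becomes eligible only once it sees the slot position move forward while it is following.

```python
from typing import Optional


class StandbyGate:
    """Toy model: eligibility requires one observed slot advance while following."""

    def __init__(self) -> None:
        self.last_seen_position: Optional[int] = None
        self.eligible = False

    def observe(self, slot_position: int) -> None:
        """Record the slot position mirrored from the primary."""
        if (
            self.last_seen_position is not None
            and slot_position > self.last_seen_position
        ):
            # The subscriber advanced the slot while this standby was watching.
            self.eligible = True
        self.last_seen_position = slot_position


gate = StandbyGate()
gate.observe(100)  # first sync: position is known, but no progress has been seen
gate.observe(100)  # quiet period: the subscriber has not advanced
still_waiting = gate.eligible
gate.observe(150)  # the subscriber advanced the slot
print(still_waiting, gate.eligible)
```

The model makes the trade-off visible: a standby that has only ever seen a static position stays ineligible no matter how long it has been following.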
Failure scenarios the gate induces
- Quiet-period failover: if the subscriber has not advanced recently, slots on standbys remain temporary and are not failover-ready; forced failover breaks CDC.
- Replica replacement: new standbys from pg_basebackup start at a conservative point and remain ineligible until the next subscriber poll. If polling is every 6 hours, all new replicas are ineligible for that window.
- Stalled switchover: operators must wait for CDC to advance or promote anyway and accept slot drop. Write availability is coupled to CDC consumer behaviour that is often outside operator control.
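The replica-replacement window is simple arithmetic. The sketch below assumes, for illustration only, that the subscriber advances the slot exactly once per poll on a strictly periodic schedule; the remaining_ineligible_window function is a hypothetical helper, not an API from Postgres or the source.

```python
from datetime import timedelta


def remaining_ineligible_window(
    poll_interval: timedelta, since_last_advance: timedelta
) -> timedelta:
    """Time a newly joined standby waits before the next slot advance.

    Assumes one slot advance per poll and a strictly periodic subscriber.
    """
    remaining = poll_interval - since_last_advance
    # An overdue poll means the advance could arrive at any moment.
    return max(remaining, timedelta(0))


six_hours = timedelta(hours=6)
# Standby joins immediately after a poll: it waits the full interval.
print(remaining_ineligible_window(six_hours, timedelta(0)))
# Standby joins two hours after the last poll: four hours remain.
print(remaining_ineligible_window(six_hours, timedelta(hours=2)))
```

With the 6-hour polling interval from the example above, a replica that joins just after a poll stays ineligible for the full 6 hours, which is the worst case under these assumptions.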
See concepts/ha-cdc-coupling for the coupling this produces.
Seen in
- sources/2026-04-21-planetscale-postgres-high-availability-with-cdc: canonical wiki introduction of Postgres 17 failover slots, the three eligibility conditions, and the subscriber-must-advance-while-standby-follows gate. Sam Lambert (PlanetScale CEO, 2025-09-12) discloses that failover slots solve the mirroring problem (serialise slot metadata into WAL) but preserve the eligibility gate by design, keeping exactly-once CDC semantics at the expense of HA flexibility. The post names three concrete scenarios where the gate blocks otherwise-safe primary promotion.