PATTERN Cited by 1 source
Action log vs state log replication¶
The design-space split¶
Every database engine that supports downstream CDC + HA must answer two orthogonal design questions:
- Is the replication log an action log or a state log?
  - Action log: each entry is a transaction identity + change payload, independent of the physical storage layout (MySQL binary log).
  - State log: each entry is a physical-redo record that describes modifications to on-disk pages (Postgres WAL at `wal_level=replica`; `wal_level=logical` enriches it with enough column data for logical decoding, but the substrate is still a physical redo log).
- Where does downstream-consumer-progress metadata live?
  - Consumer-local: the consumer persists its own cursor and servers hold no per-consumer state (MySQL GTID).
  - Primary-local catalog: the server tracks each consumer's position in its local catalog, exposed through the Postgres `pg_replication_slots` view.
  - Replicated to standbys: Postgres 17 failover slots partially move here, preserving an eligibility gate.
The canonical combinations in production:
| Substrate | Action vs state log | Consumer progress | HA-CDC coupling |
|---|---|---|---|
| MySQL binlog | Action log | Consumer-local (GTID) | None |
| Postgres WAL (pre-17) | State log | Primary-local catalog (slot) | Strong — slot is lost on failover |
| Postgres 17 failover slot | State log | Replicated with eligibility gate | Medium — gated on subscriber advance |
See concepts/ha-cdc-coupling for the operational consequences.
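The table can be restated as a small data model. The sketch below is illustrative only: the class and function names (`Substrate`, `LogKind`, `ProgressLocation`, `ha_cdc_coupling`) are hypothetical, and the coupling level is simply read off the table's last column rather than derived from any engine behaviour.

```python
from dataclasses import dataclass
from enum import Enum


class LogKind(Enum):
    ACTION = "action log"
    STATE = "state log"


class ProgressLocation(Enum):
    CONSUMER_LOCAL = "consumer-local"
    PRIMARY_CATALOG = "primary-local catalog"
    REPLICATED_GATED = "replicated with eligibility gate"


@dataclass(frozen=True)
class Substrate:
    name: str
    log_kind: LogKind
    progress: ProgressLocation


# Coupling levels copied from the table above (assumption: the level is
# determined by where consumer progress lives, which holds for these rows).
_COUPLING = {
    ProgressLocation.CONSUMER_LOCAL: "none",
    ProgressLocation.PRIMARY_CATALOG: "strong",
    ProgressLocation.REPLICATED_GATED: "medium",
}


def ha_cdc_coupling(substrate: Substrate) -> str:
    """Return the HA-CDC coupling level listed in the table."""
    return _COUPLING[substrate.progress]


CATALOGUE = [
    Substrate("MySQL binlog", LogKind.ACTION, ProgressLocation.CONSUMER_LOCAL),
    Substrate("Postgres WAL (pre-17)", LogKind.STATE, ProgressLocation.PRIMARY_CATALOG),
    Substrate("Postgres 17 failover slot", LogKind.STATE, ProgressLocation.REPLICATED_GATED),
]

if __name__ == "__main__":
    for s in CATALOGUE:
        print(f"{s.name}: {s.log_kind.value}, {s.progress.value}, coupling={ha_cdc_coupling(s)}")
```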
The two properties that decouple HA from CDC¶
When the log is an action log and consumer progress is consumer-local, three consequences follow:
- Every replica is a valid CDC resume point. A GTID-aware consumer can point at any replica that has matching binlog retention and resume.
- HA actions don't touch the CDC contract. Promote a replica, point the consumer at the new primary (or any other replica), resume.
- CDC-subscriber behaviour is irrelevant to HA scheduling. The operator's HA actions proceed on the operator's schedule; a lagging or offline consumer can't block them.
Resuming from any replica depends on `log_replica_updates=ON` in MySQL: every replica re-emits applied transactions into its own binlog, preserving GTID continuity across the full cluster.
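A toy model may help show why a consumer-held cursor makes every replica a resume point. This is a minimal sketch, not MySQL's replication protocol: `Server`, `Consumer`, `stream_missing`, and `resume_from` are hypothetical names, a GTID is reduced to a `(uuid, sequence)` tuple, and the retention check is only loosely modelled on the idea that a server cannot serve transactions it has already purged.

```python
from dataclasses import dataclass, field
from typing import Optional

Gtid = tuple[str, int]  # (originating server UUID, transaction sequence number)


@dataclass
class Server:
    name: str
    binlog: list[tuple[Gtid, str]] = field(default_factory=list)
    purged: set[Gtid] = field(default_factory=set)

    def write(self, gtid: Gtid, payload: str) -> None:
        # On a replica this stands in for re-emitting an applied transaction
        # into its own binlog under the original GTID.
        self.binlog.append((gtid, payload))

    def purge_oldest(self, count: int) -> None:
        self.purged.update(g for g, _ in self.binlog[:count])
        del self.binlog[:count]

    def stream_missing(self, consumed: set[Gtid]) -> list[tuple[Gtid, str]]:
        # The server holds no per-consumer state: the consumer presents its set.
        if not self.purged <= consumed:
            raise LookupError(f"{self.name} purged transactions the consumer still needs")
        return [(g, p) for g, p in self.binlog if g not in consumed]


@dataclass
class Consumer:
    consumed: set[Gtid] = field(default_factory=set)  # the consumer-local cursor
    payloads: list[str] = field(default_factory=list)

    def resume_from(self, server: Server, limit: Optional[int] = None) -> None:
        for gtid, payload in server.stream_missing(self.consumed)[:limit]:
            self.payloads.append(payload)
            self.consumed.add(gtid)


primary, replica = Server("primary"), Server("replica")
for seq, payload in enumerate(["insert a", "update b", "delete c"], start=1):
    gtid = ("uuid-primary", seq)
    primary.write(gtid, payload)
    replica.write(gtid, payload)  # same GTID on the replica

consumer = Consumer()
consumer.resume_from(primary, limit=2)  # reads two transactions, then the primary fails
consumer.resume_from(replica)           # reconnect to the promoted replica and continue
```

In this model the failover needs no coordination with the consumer: the replica only has to retain the transactions the consumer has not yet seen.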
The two properties that couple HA to CDC¶
When the log is a state log and consumer progress lives in a primary-local catalog (Postgres logical-replication slot), the inverse holds:
- Only the primary can advance the slot. Consumer-progress metadata is primary-attached; on failover it's gone unless mirrored.
- Mirroring creates its own gate. To preserve exactly-once CDC semantics, Postgres 17 failover slots require the subscriber to have advanced the slot while the standby was following. This eligibility gate means a quiet subscriber blocks HA.
- Operational coupling. "Slot progress is a single-node concern that must be coordinated across the cluster at failover time, and eligibility depends on subscriber behavior outside your control."
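The coupling described above can be sketched with a deliberately simplified model. It is not how Postgres implements slot synchronisation: `Slot`, `Node`, `sync_failover_slots`, and `slots_after_promotion` are hypothetical names, LSNs are plain integers, and the eligibility gate is reduced to the single condition stated in the text (the subscriber must have advanced the slot after the standby started following).

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    name: str
    confirmed_lsn: int      # how far the subscriber has confirmed
    failover: bool = False  # whether the slot is marked for mirroring


@dataclass
class Node:
    name: str
    slots: dict[str, Slot] = field(default_factory=dict)


def sync_failover_slots(primary: Node, standby: Node, standby_follow_start_lsn: int) -> None:
    """Mirror eligible slots from primary to standby (simplified gate)."""
    for slot in primary.slots.values():
        if not slot.failover:
            continue  # primary-local only: never leaves the primary
        if slot.confirmed_lsn < standby_follow_start_lsn:
            continue  # quiet subscriber: slot not yet eligible on the standby
        standby.slots[slot.name] = Slot(slot.name, slot.confirmed_lsn, failover=True)


def slots_after_promotion(standby: Node) -> list[str]:
    """Slots a consumer could resume from once the standby is promoted."""
    return sorted(standby.slots)


primary, standby = Node("primary"), Node("standby")
primary.slots["legacy"] = Slot("legacy", confirmed_lsn=300)               # no failover flag
primary.slots["quiet"] = Slot("quiet", confirmed_lsn=100, failover=True)  # subscriber idle
primary.slots["active"] = Slot("active", confirmed_lsn=250, failover=True)

sync_failover_slots(primary, standby, standby_follow_start_lsn=200)
before = slots_after_promotion(standby)

primary.slots["quiet"].confirmed_lsn = 220  # the subscriber finally advances
sync_failover_slots(primary, standby, standby_follow_start_lsn=200)
after = slots_after_promotion(standby)
```

In this sketch only `active` survives a failover at the first sync point; `quiet` becomes eligible only once its subscriber moves, which is the behaviour outside the operator's control that the text refers to.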
When to pick which¶
- Pick action-log + consumer-local when CDC consumers are independent of operator control (third-party batch CDC, external analytics pipelines, Debezium fleets), when HA action latency must not couple to consumer freshness, and when you can tolerate paying the `log_replica_updates=ON` disk + CPU overhead on every replica to maintain full-cluster re-emission.
- Pick state-log + primary-catalog when CDC consumers are first-party + well-behaved (managed CDC platform internally operated), when exactly-once CDC semantics are load-bearing, and when failover is rare enough that the operational coupling is an acceptable trade-off for the simpler primary-side progress tracking.
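The two bullets above can be read as a checklist. The following sketch encodes them as a function; the `Workload` fields and the `suggest` helper are hypothetical names, and the all-conditions-must-hold rule is an assumption made for illustration, not a rule stated by the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    consumers_operator_controlled: bool   # first-party, well-behaved consumers
    ha_must_not_wait_on_consumers: bool   # HA latency decoupled from consumer freshness
    can_afford_replica_reemission: bool   # disk + CPU for log_replica_updates=ON
    exactly_once_is_load_bearing: bool
    failover_is_rare: bool


def suggest(w: Workload) -> str:
    """Map the criteria in the text to one of the two combinations."""
    if (not w.consumers_operator_controlled
            and w.ha_must_not_wait_on_consumers
            and w.can_afford_replica_reemission):
        return "action-log + consumer-local"
    if (w.consumers_operator_controlled
            and w.exactly_once_is_load_bearing
            and w.failover_is_rare):
        return "state-log + primary-catalog"
    return "no clear fit; weigh the trade-offs case by case"


third_party_fleet = Workload(
    consumers_operator_controlled=False,
    ha_must_not_wait_on_consumers=True,
    can_afford_replica_reemission=True,
    exactly_once_is_load_bearing=False,
    failover_is_rare=False,
)
managed_platform = Workload(
    consumers_operator_controlled=True,
    ha_must_not_wait_on_consumers=False,
    can_afford_replica_reemission=False,
    exactly_once_is_load_bearing=True,
    failover_is_rare=True,
)
```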
Beyond databases¶
The same design-space split appears in log-based distributed-systems substrates:
- Apache Kafka consumers that persist offsets client-side are action-log + consumer-local: an offset is a logical position in the partition log, so any broker replica becomes a valid resume point.
- Vitess VStream at the VTGate level is explicitly action-log shaped — the VGTID is consumer-local and carried across shards (see concepts/unified-change-stream-across-shards).
- Systems that store consumer cursors server-side (broker offset commits to ZooKeeper in old Kafka, server-side pointers in some message queues) re-introduce HA coupling.
Seen in¶
- sources/2026-04-21-planetscale-postgres-high-availability-with-cdc: canonical wiki statement of the design-space split. Sam Lambert (PlanetScale CEO, 2025-09-12) frames the two databases side by side: MySQL's action-log-with-consumer-local-GTID vs Postgres's state-log-with-primary-local-slot (enriched to replicated-with-eligibility-gate in Postgres 17). Canonical closing framing: "the brittle edge in Postgres high availability with logical consumers: slot progress is a single-node concern that must be coordinated across the cluster at failover time, and eligibility depends on subscriber behavior outside your control."