PATTERN
Cross-DC semi-sync for durability¶
Shape of the pattern¶
Place semi-sync replicas in a different datacenter / availability zone from the primary, so that an ack from a semi-sync replica implies the transaction is persisted outside the primary's failure domain. A commit acknowledged to the user is therefore durable even against a total loss of the primary's DC.
Shlomi Noach's canonical framing:
"If we only run semi-sync replicas in a different datacenter than the primary's, we first pay with increased write latency, as each commit runs a roundtrip outside the primary's datacenter and back. With multiple semi-sync replicas it's the time it takes for the fastest replica to respond. When the primary goes down, we have the data durable outside its datacenter." (Source: sources/2026-04-21-planetscale-mysql-semi-sync-replication-durability-consistency-and-split-brains)
The pattern is the natural answer to the weakness Noach identifies when semi-sync replicas live in the same DC as the primary: a same-DC semi-sync ack confirms durability to a replica that the primary's DC outage can take out simultaneously. Cross-DC placement closes that window.
Mechanism¶
The pattern is topological, not configurational — rpl_semi_sync_master_wait_for_slave_count=k is untouched. What changes is the physical placement of the replicas that carry the semi-sync plugin:
- Tag the semi-sync replica role by DC. Only replicas in DCs other than the primary's participate.
- Size the cross-DC replica count to match wait_for_slave_count plus a survival margin. For k=1, at least 2 cross-DC replicas is typical (so a single replica failure doesn't trigger timeout fallback).
- Deploy local non-semi-sync replicas in the primary's DC for fast read traffic and failover seeding (they simply don't ack).
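The placement rule above can be sketched as a small assignment routine. This is an illustrative model, not a real orchestrator API: the `Replica` structure, DC tags, and function names are all hypothetical.

```python
# Hypothetical sketch: choose which replicas carry the semi-sync ack role,
# based on datacenter tags. Only replicas outside the primary's DC ack.
from dataclasses import dataclass

@dataclass
class Replica:
    name: str
    dc: str
    semi_sync: bool = False

def assign_semi_sync(replicas, primary_dc, k=1, survival_margin=1):
    """Enable semi-sync only on replicas outside the primary's DC.

    Requires at least k + survival_margin cross-DC replicas so a single
    replica failure does not trip the semi-sync timeout fallback.
    """
    cross_dc = [r for r in replicas if r.dc != primary_dc]
    if len(cross_dc) < k + survival_margin:
        raise ValueError(
            f"need >= {k + survival_margin} cross-DC replicas, "
            f"have {len(cross_dc)}"
        )
    for r in replicas:
        # Local replicas stay async: they serve reads but never ack.
        r.semi_sync = r.dc != primary_dc
    return [r.name for r in replicas if r.semi_sync]
```

With k=1 and a margin of 1, a fleet with two replicas in a second DC passes the check; the primary-DC replicas keep `semi_sync=False` and remain read/seed capacity.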
When the primary commits, it waits for an ack from the fastest cross-DC replica (with wait_for_slave_count=1). Commit latency = max(local write, fastest cross-DC RTT + replica disk flush).
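That latency formula can be made concrete with a toy model. The numbers below are illustrative, not measurements:

```python
# Toy model of the commit-latency claim: the primary waits for the fastest
# of its cross-DC semi-sync replicas, so the commit path is bounded by
# max(local write, fastest RTT + remote flush).

def commit_latency_ms(local_write_ms, cross_dc_rtt_ms, remote_flush_ms):
    """cross_dc_rtt_ms: one RTT estimate per semi-sync replica (wait count k=1)."""
    fastest_rtt = min(cross_dc_rtt_ms)
    return max(local_write_ms, fastest_rtt + remote_flush_ms)

# Two cross-DC replicas at 12 ms and 48 ms RTT: the 12 ms replica sets the pace.
print(commit_latency_ms(1.0, [12.0, 48.0], 2.0))  # 14.0
```

This is also why deploying multiple cross-DC replicas mitigates the latency tax: the `min` over RTTs means commit time tracks the best replica, not the worst.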
Trade-offs¶
| Axis | Same-DC semi-sync | Cross-DC semi-sync |
|---|---|---|
| Write latency | ~1ms (intra-DC RTT + disk) | 5–50ms (inter-DC RTT + disk) |
| Durability vs DC outage | ✗ same DC fails together | ✓ data persisted outside failure domain |
| Failover complexity | Low — promote local replica | Higher — cross-DC promotion has application impact |
| Operational cost | Lower (no cross-DC bandwidth) | Higher — semi-sync traffic is committed data + metadata |
The latency tax is the price paid. Noach observes: "With multiple semi-sync replicas it's the time it takes for the fastest replica to respond" — mitigated by deploying multiple cross-DC replicas so the tail isn't worst-of-all.
The failover procedure¶
On primary-DC outage:
- Verify the primary is truly gone. Cross-DC semi-sync replicas' relay logs contain all acknowledged writes — but so do non-semi-sync replicas in the primary DC, which may be more up-to-date. Noach notes: "we can then also compare with non semi-sync replicas in the primary's datacenter: they may yet have all the transactions, too."
- Decide: promote in primary's DC or remote. Promoting in the primary's DC (seeded from a cross-DC replica if needed) is usually less disruptive to applications; promoting remotely means the replication cluster's primary-DC is effectively reassigned.
- Reassign the semi-sync replica set if promoting remotely — the new primary's DC must not contain semi-sync replicas, so some reconfiguration is required. "We then must reassign semi-sync replicas, and ensure none run from within the new primary's datacenter."
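The first and third steps above can be sketched as two small helpers. This is an assumption-laden simplification: real tooling compares GTID sets, while here replication progress is reduced to a single comparable integer, and all names are hypothetical.

```python
# Illustrative failover helpers. The candidate pool deliberately includes
# non-semi-sync replicas from the failed DC: as Noach notes, they may be
# further ahead than the cross-DC ackers.

def pick_promotion_candidate(positions):
    """positions: {replica_name: executed_position}; most advanced wins.

    Stand-in for a real GTID-set comparison across all surviving replicas.
    """
    return max(positions, key=positions.get)

def reassign_semi_sync(replicas_by_dc, new_primary_dc):
    """After a remote promotion, no semi-sync acker may share the new
    primary's DC; return the replicas eligible to carry the ack role."""
    return sorted(
        name
        for dc, names in replicas_by_dc.items()
        if dc != new_primary_dc
        for name in names
    )
```

For example, if a primary-DC async replica sits at position 105 while the cross-DC ackers are at 103 and 104, the local replica is the better seed, even though it never acked.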
What this pattern does NOT buy¶
Consistency under partition. If the primary's DC is network-isolated rather than down — still alive, just unreachable — the pattern does not prevent the primary from continuing to commit to any still-reachable cross-DC semi-sync replica. The pattern buys durability; it does not buy the consistency of a consensus protocol. See concepts/durability-vs-consistency-guarantee.
Split-brain immunity. The 1-n split-brain topology still applies: the primary plus one cross-DC replica forms a minority quorum that can commit in isolation. See concepts/minority-quorum-writeability. Operational mitigations (fencing the old primary, anti-flapping on reparenting) remain necessary.
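The 1-n hazard can be shown in a few lines. A minimal sketch, assuming a five-node fleet and wait count k=1; the node names are illustrative:

```python
# Sketch of the 1-n split-brain hazard: with wait count k=1, the primary
# plus any single reachable semi-sync replica satisfies the ack quorum,
# even when that pair is a minority of the fleet.

def can_commit(reachable_ackers, k=1):
    """A semi-sync commit succeeds once >= k ack replicas are reachable."""
    return len(reachable_ackers) >= k

fleet = {"primary", "r1", "r2", "r3", "r4"}      # five nodes total
isolated_side = {"primary", "r2"}                # minority partition
reachable_ackers = isolated_side - {"primary"}   # acks the primary still sees
print(can_commit(reachable_ackers))  # True: the minority side keeps committing
```

Nothing in the semi-sync ack rule counts the partition's size, which is why fencing and anti-flapping remain operational requirements rather than optional hardening.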
When to use¶
- Regulatory / business requirement that acknowledged writes cannot be lost to a single DC.
- Architecture tolerates cross-DC write latency on the hot path — typically OLTP systems where p50 commits are fine but regulatory durability is mandatory.
- Multi-DC deployment is already in place for read scalability or DR; adding semi-sync cross-DC is a topology adjustment, not a new capability.
When not to use¶
- Latency-sensitive workload where cross-DC RTT dominates commit time. The pattern is incompatible with sub-millisecond commit SLOs.
- Small deployments (2-DC or single-region). The marginal cost is high and the semantic benefit is small — a full consensus protocol or a cross-region replica in a third site is often the better choice.
Seen in¶
- sources/2026-04-21-planetscale-mysql-semi-sync-replication-durability-consistency-and-split-brains — canonical wiki introduction. Noach walks through the deployment-shape decision explicitly, contrasting same-DC, cross-DC, and mixed postures; cross-DC-for-durability is the posture that actually satisfies the semi-sync contract against DC-outage failure modes.
Related¶
- concepts/mysql-semi-sync-replication — the mechanism this pattern deploys.
- concepts/durability-vs-consistency-guarantee — the axis this pattern moves (durability), not the other (consistency).
- concepts/minority-quorum-writeability — the structural property this pattern does not fix.
- concepts/split-brain — the residual hazard operators must still address.
- concepts/semi-sync-timeout-fallback — ensuring fallback does not silently undo the durability gain.
- patterns/infinite-semi-sync-timeout — the tuning companion that prevents silent fallback.
- patterns/pluggable-durability-rules — the architectural response that would make "at least one ack from ≥2 DCs" a first-class constraint.
- systems/mysql
- systems/vitess