PATTERN
Mixed sync + async replication topology¶
Pattern¶
In a multi-replica MySQL cluster, configure exactly one replica with semi-synchronous replication and the remaining replicas with asynchronous replication. The primary waits for the semi-sync replica to persist each transaction in its relay log before acknowledging the commit, so that replica is guaranteed to have every committed write. The async replicas receive the same writes from the same binlog but without the wait-for-ack overhead — they lag the primary by a bounded but variable amount.
The load-bearing property: the semi-sync-flagged replica is the known-good failover candidate by construction — you do not need to query all replicas' GTID positions at failover time to find the furthest-ahead, because the semi-sync replica is provably at least as far ahead as the primary's acknowledged transactions.
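A minimal configuration sketch of this topology, using the stock MySQL semi-sync plugin and variable names (MySQL 8.0.26+ also accepts the newer `source`/`replica` spellings; which host plays the candidate role is this sketch's assumption, not prescribed by the source):

```sql
-- On the primary: load and enable the semi-sync source-side plugin.
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = ON;
-- An ack from exactly one replica is enough (this is the default).
SET GLOBAL rpl_semi_sync_master_wait_for_slave_count = 1;

-- On the designated failover candidate ONLY: enable the replica side.
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = ON;
STOP REPLICA IO_THREAD;
START REPLICA IO_THREAD;  -- restart so the replica registers as semi-sync

-- The async replicas get no semi-sync configuration at all:
-- they consume the same binlog stream without participating in acks.
```

The IO-thread restart matters: the replica only announces its semi-sync capability to the primary when its replication connection is (re)established.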
The canonical framing — Morrison on PlanetScale¶
Brian Morrison II's 2023-11-15 best-practices post canonicalises this composite:
"It's also worth mentioning that you can mix and match these two modes. If you want to guarantee that one specific server always contains an up-to-date copy of your database, but also want additional replicas for more resiliency, you could configure one replica with semi-sync and one without. This means when data is written to the source, it will always make sure that the one server with semi-sync enabled has received that transaction before responding, and the other replicas in the cluster will catch up when they can. In a disaster scenario (discussed further down this article), this can help you easily identify the best candidate to recover from." (Source: sources/2026-04-21-planetscale-mysql-replication-best-practices-and-considerations)
Structure¶
┌──────────────────────┐
│ Primary (writer) │
└──────────┬───────────┘
│ binlog stream
┌──────────────────────────┼──────────────────────────┐
│ │ │
▼ ▼ ▼
┌─────────────────────┐ ┌─────────────────────┐ ┌─────────────────────┐
│ Semi-sync replica │ │ Async replica 1 │ │ Async replica 2 │
│ (failover │ │ (read capacity) │ │ (read capacity) │
│ candidate) │ │ │ │ │
└─────────────────────┘ └─────────────────────┘ └─────────────────────┘
MUST ack every commit may lag indefinitely may lag indefinitely
before primary responds
The primary's commit path:
1. Apply the transaction locally.
2. Write to the binlog.
3. Stream to all replicas (including the semi-sync one).
4. Block until the semi-sync replica acknowledges receipt into its relay log.
5. Do not wait for async replicas.
6. Return to the client.
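Whether the commit path is actually running in semi-sync mode is observable through the plugin's status counters — a quick operational sanity check, not something the source prescribes:

```sql
-- On the primary:
SHOW STATUS LIKE 'Rpl_semi_sync_master_status';   -- ON = commits wait for an ack
SHOW STATUS LIKE 'Rpl_semi_sync_master_clients';  -- 1 in this topology
SHOW STATUS LIKE 'Rpl_semi_sync_master_yes_tx';   -- commits acked by the replica
SHOW STATUS LIKE 'Rpl_semi_sync_master_no_tx';    -- commits that fell back to async
```

A rising `no_tx` counter means the primary has been timing out and acknowledging commits without the replica's ack — the durability contract is degrading silently.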
Why the composite works¶
Write latency: the primary pays the semi-sync cost once per transaction — a single round-trip to the designated semi-sync replica (typically same-AZ, single-digit ms). Async replicas do not contribute to write latency at all.
Durability: at commit-ack time, the transaction exists on at least two servers (primary + semi-sync replica). This is strictly stronger than pure async (where the primary's commit-ack means nothing about replica state) and only marginally weaker than synchronous-to-all (which would require every replica to ack, exposing the primary to the slowest replica's tail latency).
Read scale: add as many async replicas as needed for read capacity, each independently provisioned, with no multiplier on write latency.
Failover simplicity: at primary-crash time, the semi-sync replica is by construction the candidate with the most recent data. No GTID-comparison dance across replicas; no "which replica is furthest ahead?" ambiguity. Promote the semi-sync replica, re-point the rest, done. See concepts/unplanned-failover-playbook step 2.
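The promotion itself can be sketched as follows, assuming GTID-based replication and MySQL 8.0.23+ syntax; the hostname is hypothetical, and fencing the old primary and updating service discovery are outside this sketch:

```sql
-- On the semi-sync replica, once the old primary is confirmed dead/fenced:
STOP REPLICA;
SET GLOBAL read_only = OFF;  -- this host is now the writer

-- On each async replica: re-point at the promoted host.
STOP REPLICA;
CHANGE REPLICATION SOURCE TO
  SOURCE_HOST = 'promoted-replica.internal',  -- hypothetical hostname
  SOURCE_AUTO_POSITION = 1;                   -- GTIDs resolve the resume point
START REPLICA;
```

`SOURCE_AUTO_POSITION = 1` is what makes the re-pointing mechanical: each async replica negotiates its own resume position from its executed GTID set, with no manual binlog-coordinate arithmetic.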
Trade-offs¶
- The semi-sync replica is a single point of durability — if it is partitioned or crashes, the primary's semi-sync contract degrades. Two failure modes:
  - With a finite `rpl_semi_sync_master_timeout` (default 10s), the primary falls back to async after the timeout. Commits resume, but the durability contract is silently broken until the semi-sync replica returns.
  - With `rpl_semi_sync_master_timeout` set extremely high (the PlanetScale posture — see concepts/mysql-semi-sync-replication), the primary blocks waiting for the semi-sync replica. Write latency degrades catastrophically if the semi-sync replica is slow or gone.
- Write-unavailability risk under mixed-mode + high timeout: the composite of "extremely high semi-sync timeout" + "exactly one semi-sync replica" means a single replica failure can stall writes. Operators who want both high durability and high availability must provision more than one semi-sync replica (and set `rpl_semi_sync_master_wait_for_slave_count` accordingly) or accept the blocking risk.
- Async replicas can be arbitrarily behind — if a read goes to an async replica, the application must tolerate replication lag. Read-your-writes workflows need sticky routing to the primary or the semi-sync replica.
- Not a substitute for cross-region durability — all three replica classes described here are assumed to be within a single region. For cross-region durability, see patterns/async-replication-for-cross-region — semi-sync's latency overhead is prohibitive across regional RTT.
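For the quorum posture raised in the trade-offs, the relevant knobs look roughly like this (values are illustrative, not the source's; see concepts/mysql-semi-sync-replication for the timeout discussion):

```sql
-- On the primary: block (effectively) forever instead of silently
-- degrading to async. The value is in milliseconds.
SET GLOBAL rpl_semi_sync_master_timeout = 4294967295;

-- Require acks from two semi-sync replicas. With three semi-sync-enabled
-- replicas, a single replica failure then no longer stalls writes.
SET GLOBAL rpl_semi_sync_master_wait_for_slave_count = 2;
```

The general arithmetic: with N semi-sync replicas and a wait count of W, writes survive up to N − W semi-sync replica failures without stalling.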
When to apply¶
- You want single-replica-guaranteed durability at low cost — you don't want to pay semi-sync cost for every replica, but you want at least one replica guaranteed to have every committed write.
- You need more read capacity than durability replicas — e.g. 1 semi-sync + 4 async = 1 durability slot + 5 total read slots.
- You want deterministic failover-candidate selection — skip the GTID-position comparison dance by construction.
- Your failure model assumes individual replica failures are rare relative to primary failures — if replica failures are common, the single-semi-sync architecture is brittle; prefer N-way semi-sync with quorum.
When not to apply¶
- Multi-region clusters — use async replication across regions; keep the semi-sync replica(s) within the primary's region (see patterns/async-replication-for-cross-region).
- Workloads requiring multi-replica durability — use `rpl_semi_sync_master_wait_for_slave_count` ≥ 2 and more than one semi-sync replica.
- Latency-critical workloads on unreliable replica network paths — an extremely high timeout + a single semi-sync replica can tip into write unavailability on minor replica hiccups.
Seen in¶
- sources/2026-04-21-planetscale-mysql-replication-best-practices-and-considerations — canonical wiki disclosure of the mixed-mode composite with the semi-sync-replica-as-deterministic-failover-candidate rationale.
- Implicit in PlanetScale's production posture ("PlanetScale actually uses semi-synchronous replication for our databases within a given region") though the exact replica-count split is not disclosed.
Related¶
- concepts/mysql-semi-sync-replication — the mechanism the semi-sync replica uses.
- concepts/asynchronous-replication — the mode the other replicas use.
- concepts/active-passive-replication — the enclosing topology.
- concepts/unplanned-failover-playbook — the failover procedure that benefits from this pattern's candidate-selection determinism.
- concepts/gtid-position — the primitive that makes candidate-selection possible at all; mixed-mode makes it deterministic.
- concepts/primary-vs-replica-as-replication-source — related choice (which replica to use as a backup/migration source).
- patterns/read-replicas-for-read-scaling — the async replicas serve this purpose.
- patterns/graceful-leader-demotion — the planned-failover dual of unplanned-failover-playbook.
- patterns/async-replication-for-cross-region — the cross-region caveat.
- systems/mysql + companies/planetscale — substrate and first-party voice.