PATTERN Cited by 1 source

Async replication for cross-region, semi-sync within region

Pattern

Configure semi-synchronous replication only between replicas within the same region (where cross-AZ latency is single-digit milliseconds) and asynchronous replication for cross-region replicas (where RTT is 60ms+). The rule is derived directly from network physics: semi-sync's durability contract requires the primary to wait for a replica ack on every write, so the per-transaction latency cost is pinned to the replica's round-trip time. Paying a 60ms+ RTT on every write for a cross-region replica's ack is almost always unacceptable; an async cross-region replica, whose replication lag is usually small but non-zero and not strictly bounded, is the canonical alternative.

The canonical framing — Morrison on PlanetScale

Brian Morrison II's 2023-11-15 best-practices post canonicalises the rule:

"When building in the cloud, you'll have the ability to deploy services to almost anywhere in the world, and this includes databases too. Each cloud provider is made up of a number of geographical regions. Within those regions are multiple data centers that are close enough to be considered in the same region, but far enough to survive many disasters. These data centers are known as availability zones or AZs for short. If possible, replicating your database to other physical locations is definitely a best practice, but it comes with some additional considerations."

"The time it takes to send data over the wire is one such consideration. Since the network traffic needs to travel over a farther distance, replicating between locations does introduce additional latency between replicas. Luckily, the latency between AZs is often not very high. In fact, AWS claims that they have single-digit millisecond latency between availability zones in the same region."

"Regions are much farther from one another and often have significant latency. At the time of this writing, cloudping.co reported that the latency between us-east-1 and us-west-1 is over 60ms. Replication in itself has a bit of a delta between the time that data is written to the source and the time it is written to a replica, known as replication lag. This is exacerbated when replicating across longer distances."

"As such, replicating across regions should be done in asynchronous mode so as to not cause unnecessary delay for the application making requests." (Source: sources/2026-04-21-planetscale-mysql-replication-best-practices-and-considerations)

The network-latency denominators

| Topology | Latency (rough order) | Semi-sync viability |
|---|---|---|
| Same-host | sub-millisecond | ✅ trivially fine |
| Same-AZ | sub-millisecond to ~1 ms | ✅ fine |
| Cross-AZ within region | single-digit milliseconds (AWS-cited) | ✅ acceptable (PlanetScale's canonical posture) |
| Cross-region (same continent) | 20-80 ms (cloudping.co) | ❌ too expensive per-write |
| Cross-region (transoceanic) | 100-200+ ms | ❌ prohibitive |

The 60ms+ figure for us-east-1 to us-west-1 is the number Morrison anchors the rule on, and it serves as this wiki's canonical denominator. Intercontinental round-trips are typically 100-200ms+; semi-sync at that cost adds 100-200+ ms to every write, which breaks OLTP SLAs for most applications.
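
The table reads as a threshold rule on measured round-trip time. A minimal Python sketch, assuming a hypothetical 10 ms cutoff (the upper edge of "single-digit milliseconds"); the function name and the cutoff are illustrative and do not come from the source:

```python
def replication_mode(rtt_ms: float, semi_sync_max_rtt_ms: float = 10.0) -> str:
    """Pick a replication mode from a measured replica round-trip time.

    semi_sync_max_rtt_ms is an assumed cutoff, not a value from the source.
    """
    if rtt_ms < 0:
        raise ValueError("rtt_ms must be non-negative")
    return "semi-sync" if rtt_ms < semi_sync_max_rtt_ms else "async"


# Illustrative RTTs, roughly matching rows of the table above.
for label, rtt in [
    ("cross-AZ", 2.0),
    ("us-east-1 to us-west-1", 60.0),
    ("transoceanic", 150.0),
]:
    print(f"{label}: {replication_mode(rtt)}")
```

In practice the cutoff would be derived from the application's write SLA rather than fixed at a single value.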

Why the rule holds

Semi-sync's per-write cost = replica RTT. The primary blocks on every commit until the semi-sync replica's relay-log ack returns. That's one round-trip, every write.

Async's per-write cost = zero replica wait. The primary commits and responds as soon as its own binlog flush completes. The replica catches up when it can; the lag is the cross-region latency plus any queue depth in the replication stream, but it's paid on a separate timescale from the user request.

The rule is therefore not a configuration tip; it's a physics-mandated trade-off. As a rule of thumb, if your application's write SLA is less than 2× the cross-region RTT, cross-region semi-sync is infeasible. For typical OLTP workloads (write SLA < 100ms), any cross-region topology with > 50ms RTT forces async.
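
The per-write cost model and the 2× check can be written out as arithmetic. This is a sketch under stated assumptions: the 1 ms local flush cost is made up for illustration, and both function names are hypothetical.

```python
def commit_latency_ms(local_flush_ms: float, replica_rtt_ms: float, semi_sync: bool) -> float:
    """Commit latency seen by the client: local flush, plus one replica round trip if semi-sync."""
    return local_flush_ms + (replica_rtt_ms if semi_sync else 0.0)


def semi_sync_feasible(write_sla_ms: float, replica_rtt_ms: float) -> bool:
    """Apply the 2x rule: infeasible when the write SLA is less than twice the replica RTT."""
    return write_sla_ms >= 2 * replica_rtt_ms


local_flush_ms = 1.0  # assumed local binlog flush cost, for illustration only
for rtt in (2.0, 60.0, 150.0):
    semi = commit_latency_ms(local_flush_ms, rtt, semi_sync=True)
    async_ = commit_latency_ms(local_flush_ms, rtt, semi_sync=False)
    ok = semi_sync_feasible(write_sla_ms=100.0, replica_rtt_ms=rtt)
    print(f"rtt={rtt}ms semi-sync={semi}ms async={async_}ms feasible@100ms={ok}")
```

The async figure stays constant across all three RTTs because the replica wait never enters the commit path.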

Composition with mixed-mode within region

This pattern composes cleanly with mixed sync + async within region:

Region A (primary's region)
├─ Primary
├─ Semi-sync replica (failover candidate, same-AZ or cross-AZ ~1-10ms)
└─ Async replicas (read capacity within region)

Region B (DR / cross-region)
└─ Async replicas (read capacity + DR; ~60ms+ lag)

Region C (farther DR)
└─ Async replicas (DR; ~150ms+ lag)

The primary pays semi-sync latency once — to its in-region semi-sync replica. Cross-region replicas contribute zero to write latency. The in-region semi-sync replica is the deterministic failover candidate; cross-region replicas are not suitable failover candidates (their lag-at-crash-time is unbounded) but are adequate for DR promotion if the entire primary region fails.
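
The layout above can be modelled as data to check which replicas affect write latency and which qualify for failover. A minimal sketch: the `Replica` type, the replica names, and the RTT values are hypothetical, and it assumes the primary waits for a single semi-sync ack per commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    name: str
    region: str
    rtt_ms: float      # round trip from the primary (illustrative values)
    semi_sync: bool


def write_wait_ms(replicas: list[Replica]) -> float:
    """Replica wait added to each commit, assuming one ack is enough.

    With a single-ack requirement the fastest semi-sync replica sets the cost;
    async replicas contribute nothing.
    """
    rtts = [r.rtt_ms for r in replicas if r.semi_sync]
    return min(rtts) if rtts else 0.0


def failover_candidates(replicas: list[Replica], primary_region: str) -> list[str]:
    """In-region semi-sync replicas are the deterministic failover candidates."""
    return [r.name for r in replicas if r.semi_sync and r.region == primary_region]


def cross_region_semi_sync(replicas: list[Replica], primary_region: str) -> list[str]:
    """Replicas that violate the pattern: semi-sync outside the primary's region."""
    return [r.name for r in replicas if r.semi_sync and r.region != primary_region]


topology = [
    Replica("a-semi", "region-a", 2.0, semi_sync=True),
    Replica("a-read", "region-a", 1.5, semi_sync=False),
    Replica("b-dr", "region-b", 60.0, semi_sync=False),
    Replica("c-dr", "region-c", 150.0, semi_sync=False),
]

print(write_wait_ms(topology))
print(failover_candidates(topology, "region-a"))
print(cross_region_semi_sync(topology, "region-a"))
```

A check like `cross_region_semi_sync` could run against a topology definition before deployment to catch a misconfigured cross-region link.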

Consequences

  • Read-your-writes from cross-region replicas is hard — async lag plus cross-region RTT means the window during which a just-committed write is invisible on a cross-region replica is > 60ms plus lag. patterns/session-cookie-for-read-your-writes and patterns/per-region-read-replica-routing mitigate by routing RYW-critical reads to the primary or to the in-region semi-sync replica.
  • DR failover loses writes — if the primary region fails entirely, promoting a cross-region async replica means losing every write that hadn't yet propagated. RPO is bounded by the async replication lag at failure time.
  • Application write SLA determines acceptable semi-sync RTT — if your write SLA is 50ms, your semi-sync replica must be < 40ms away. If your writes target < 10ms, the semi-sync replica must be same-AZ.
  • Managed-substrate implementations push this rule into product defaults: PlanetScale Portals (systems/planetscale-portals) exposes cross-region replicas as async-only; Vitess replication similarly defaults cross-region links to async; AWS RDS Multi-AZ replicates within a region (synchronously for Multi-AZ DB instances, semi-synchronously for Multi-AZ DB clusters), while cross-region read replicas are a separate feature and asynchronous by default.
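
The first two consequences lend themselves to a small sketch. The code below is illustrative only: it assumes a session records the time of its last write, uses a hypothetical fixed window for read-your-writes routing, and estimates writes at risk as write rate times lag. None of these names or values come from the referenced patterns.

```python
from typing import Optional


def choose_read_target(last_write_ts: Optional[float], now: float, ryw_window_s: float = 1.0) -> str:
    """Route a read to the primary if this session wrote within the window.

    ryw_window_s is an assumed value; it should exceed the expected replication lag.
    """
    if last_write_ts is not None and now - last_write_ts < ryw_window_s:
        return "primary"
    return "replica"


def writes_at_risk(write_rate_per_s: float, lag_s: float) -> float:
    """Rough count of committed writes not yet on an async replica at failure time."""
    return write_rate_per_s * lag_s


print(choose_read_target(last_write_ts=100.0, now=100.5))  # recent write
print(choose_read_target(last_write_ts=100.0, now=102.0))  # window has passed
print(writes_at_risk(write_rate_per_s=200.0, lag_s=0.5))
```

A fixed window is the simplest option; comparing replication positions instead of wall-clock time would be more precise, at the cost of tracking those positions per session.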

Seen in
