Synchronous Block Replication¶
Synchronous block replication is a storage-layer replication semantic where every write to the primary block device is replicated to a secondary device and acknowledged by the secondary before the primary acknowledges the write back up the stack. The filesystem, application, and user never see a write that isn't on both nodes.
Trade-off shape¶
The defining trade-off is write latency vs. durability guarantee:
- Every write pays a network round-trip to the peer before the ack, so the peer-to-peer RTT sets the floor on write latency.
- In exchange: no acknowledged write can be lost on primary failure. The standby can be promoted with zero data loss.
This is the right trade for HA where the business cost of losing an acknowledged write exceeds the added latency: file stores, databases, metadata services. It's the wrong trade for bulk data pipelines where throughput matters more than per-write durability.
Contrast with asynchronous replication, where writes ack locally and replication happens in the background: lower latency, but failover can lose in-flight writes.
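A toy sketch of the ack-ordering difference, not DRBD's actual code; disk and peer are hypothetical interfaces standing in for the local device and the replication link:

```go
package replication

// disk and peer are hypothetical stand-ins for the local block device
// and the replication link to the secondary.
type disk interface {
	Write(block []byte) error
}

type peer interface {
	Replicate(block []byte) error // returns once the secondary has the write
}

// syncWrite acks only after BOTH the local disk and the peer have the
// block: every ack costs a peer round-trip, but no acked write can be
// lost if the primary dies.
func syncWrite(d disk, p peer, block []byte) error {
	if err := d.Write(block); err != nil {
		return err
	}
	return p.Replicate(block) // blocks for one peer RTT before the ack
}

// asyncWrite acks as soon as the local disk has the block; replication
// happens in the background, so a failover before the goroutine
// finishes loses an already-acknowledged write.
func asyncWrite(d disk, p peer, block []byte) error {
	if err := d.Write(block); err != nil {
		return err
	}
	go p.Replicate(block) // still in flight at failover time: data loss
	return nil
}
```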
Mechanism¶
The canonical Linux primitive is DRBD (Distributed Replicated Block Device). DRBD sits below the filesystem as a virtual block device (/dev/drbdN) and:
- Accepts a write from the filesystem.
- Writes it to the local disk and, in parallel, sends it to the secondary over the network.
- Waits for the secondary's ack (protocol C, DRBD's fully synchronous mode).
- Acks the write back up to the filesystem.
Filesystems run unmodified on top (ext4, XFS, etc.); neither the filesystem nor the application is aware of replication.
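A minimal sketch of that completion rule, reusing the hypothetical disk/peer interfaces from the sketch above. Because the local write and the network send are issued concurrently, the write-latency floor is roughly max(local disk latency, peer RTT):

```go
package replication

import "golang.org/x/sync/errgroup"

// protocolCWrite mirrors the steps above: accept the write, persist it
// locally and send it to the secondary in parallel, and ack back up the
// stack only once both have completed.
func protocolCWrite(d disk, p peer, block []byte) error {
	var g errgroup.Group
	g.Go(func() error { return d.Write(block) })     // local persistence
	g.Go(func() error { return p.Replicate(block) }) // wait for peer ack
	return g.Wait() // only now may the filesystem see the completion
}
```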
Split-brain concerns¶
If the network between active and standby partitions, both nodes may believe they are primary and diverge. DRBD and similar systems offer split-brain detection (generation counters) and recovery policies (discard changes on one side, manual merge). Production deployments pair this with a cluster manager that fences one side on partition detection.
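A simplified sketch of generation-based divergence detection, loosely modeled on DRBD's generation UUIDs; the real scheme keeps a current UUID, a bitmap UUID, and a short history chain, while this version keeps just a history slice with plain integers standing in for UUIDs:

```go
package replication

// genState is one node's view of data lineage: the ID of its current
// data generation plus the generations it is known to descend from.
// A node starts a new generation when it becomes primary while
// disconnected from its peer.
type genState struct {
	Current uint64
	History []uint64
}

func (g genState) descendsFrom(id uint64) bool {
	if g.Current == id {
		return true
	}
	for _, h := range g.History {
		if h == id {
			return true
		}
	}
	return false
}

// onReconnect classifies the pair after a partition heals. If either
// side descends from the other, the stale side can simply resync; if
// neither does, both took writes while partitioned: split brain.
func onReconnect(local, remote genState) string {
	switch {
	case local.Current == remote.Current:
		return "in sync"
	case remote.descendsFrom(local.Current):
		return "local is stale: resync from remote"
	case local.descendsFrom(remote.Current):
		return "remote is stale: resync from local"
	default:
		return "split brain: fence or merge manually"
	}
}
```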
Seen in¶
- sources/2025-09-02-github-rearchitecting-github-pages — GitHub Pages uses DRBD to synchronously replicate user-site data across 8 partitions between the active + standby fileservers in each R720 pair. The trade-off is accepted because the origin tier serves static files where write latency is dominated by the push pipeline, not serve-time; losing an acknowledged publish would be a correctness bug.
Related¶
- systems/drbd — canonical Linux implementation.
- concepts/active-standby-replication — HA shape DRBD enables.
- systems/github-pages — canonical wiki instance.