CONCEPT Cited by 1 source
Copy-on-write storage fork
Definition
A copy-on-write storage fork is a storage-cloning
mechanism that creates a second logical copy of a
dataset without initially duplicating the underlying
pages — the clone and the original share the same
physical pages on disk until one side writes, at which
point the modified page is copied and the two logical
copies diverge at that page only. The clone is
effectively instantaneous (no data-copy work at fork
time) and incremental in ongoing storage cost (pay
only for divergent pages). This is the storage-tier
analog of Unix fork() with copy-on-write memory
pages.
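The mechanism described above can be sketched as a toy in-memory model. This is a minimal illustration under stated assumptions, not how Aurora or any real storage engine implements it: the names `PagePool`, `Volume`, `fork`, and the reference-counting scheme are all hypothetical. The point it shows is that a fork copies only the page table (metadata), and a page is physically duplicated only on the first write after the fork.

```python
from __future__ import annotations


class PagePool:
    """Toy physical storage shared by all volumes (hypothetical)."""

    def __init__(self) -> None:
        self.pages: dict[int, bytes] = {}      # physical page id -> contents
        self.refcount: dict[int, int] = {}     # physical page id -> number of volumes using it
        self._next_id = 0

    def allocate(self, data: bytes) -> int:
        pid = self._next_id
        self._next_id += 1
        self.pages[pid] = data
        self.refcount[pid] = 1
        return pid


class Volume:
    """A logical dataset: a page table mapping logical page -> physical page."""

    def __init__(self, pool: PagePool, page_table: dict[int, int] | None = None) -> None:
        self.pool = pool
        self.page_table: dict[int, int] = page_table if page_table is not None else {}

    def read(self, logical: int) -> bytes:
        return self.pool.pages[self.page_table[logical]]

    def write(self, logical: int, data: bytes) -> None:
        pid = self.page_table.get(logical)
        if pid is not None and self.pool.refcount[pid] == 1:
            # Page is exclusively owned: overwrite in place, no new copy.
            self.pool.pages[pid] = data
            return
        if pid is not None:
            # Page is shared: drop our reference and diverge at this page only.
            self.pool.refcount[pid] -= 1
        self.page_table[logical] = self.pool.allocate(data)

    def fork(self) -> Volume:
        # Only metadata is copied; every physical page becomes shared.
        for pid in self.page_table.values():
            self.pool.refcount[pid] += 1
        return Volume(self.pool, dict(self.page_table))


pool = PagePool()
blue = Volume(pool)
for i in range(4):
    blue.write(i, f"page-{i}".encode())

green = blue.fork()
physical_before = len(pool.pages)   # fork added no physical pages

green.write(2, b"green-change")     # first write after fork copies one page
physical_after = len(pool.pages)
```

In this sketch the two volumes still share three of the four physical pages after the write; only logical page 2 has diverged, which is the "pay only for divergent pages" property.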
On Amazon Aurora, copy-on-write storage forks are the substrate under blue/green deployments: the green environment is created as a storage fork of the blue environment's cluster volume, and binlog replication then keeps the two environments in sync by carrying committed transactions across.
Trade-offs
Copy-on-write forks solve the fast-clone cost problem (no duplicate storage up-front, instantaneous clone) but introduce new failure modes:
- Divergent-page cost accrues silently: the first write to a shared page on either side creates a divergent copy, so the two-environment storage bill grows with the number of distinct pages written since the fork, not with fork age.
- Concurrent writes on both sides = unreconcilable state — if both blue and green accept writes to the same row, the two sides now have divergent physical pages for the same logical data. The underlying storage layer has no knowledge of which version is "correct" — conflict resolution is deferred to the operator.
- Binlog replication doesn't round-trip copy-on-write: the clone is made at the storage layer, but cross-environment sync happens at the transaction layer via binlog replication, which has its own schema-change envelope (see [[concepts/binlog-replication]]).
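The first two trade-offs can be made concrete with a small accounting sketch. It assumes each environment can report the set of logical pages it has written since the fork; the function name `divergence_report` and the 16 KiB page size are illustrative assumptions, not Aurora parameters. Note that a page written on both sides is only a coarse signal: it does not prove the same row changed, and deciding which version is correct still falls to the operator.

```python
def divergence_report(
    blue_written: set[int],
    green_written: set[int],
    page_size_bytes: int = 16 * 1024,  # assumed page size, for illustration only
) -> dict[str, object]:
    """Estimate fork divergence from the pages each side wrote since the fork."""
    # A page diverges once, on its first write; repeated writes add nothing,
    # which is why sets (distinct pages) are the right unit here.
    divergent = blue_written | green_written
    # Pages written on both sides are where the two copies may disagree
    # about the same logical data.
    both_sides = blue_written & green_written
    return {
        "divergent_pages": len(divergent),
        "extra_storage_bytes": len(divergent) * page_size_bytes,
        "pages_written_on_both_sides": sorted(both_sides),
    }


report = divergence_report(blue_written={1, 2, 3}, green_written={3, 7})
```

Here four distinct pages have diverged, and page 3 was written in both environments, so it is the one an operator would need to inspect by hand.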
Seen in
- sources/2026-04-21-planetscale-planetscale-branching-vs-amazon-aurora-bluegreen-deployments — Brian Morrison II (PlanetScale, 2024-02-02). Canonical wiki disclosure of Aurora's blue/green copy-on-write storage clone + the two-side-writeable data-consistency risk. Verbatim: "Amazon's blue/green deployment initially duplicates only compute resources and clones data storage using a copy-on-write mechanism. This can help with storage costs when running parallel environments but introduces potential data inconsistencies across environments. Since writes are allowed in the green environment, the same data can technically be changed in both environments. If this happens, Amazon has no easy or automated way to reconcile which version is correct. Resolving conflicts is challenging, and the responsibility for data consistency falls on you."
Related
- concepts/blue-green-deployment — the deployment strategy on top of the copy-on-write clone.
- concepts/binlog-replication — the ongoing-sync mechanism between the two forked environments.
- systems/amazon-aurora — canonical implementation.
- patterns/blue-green-database-deployment — the Aurora-family pattern built on the clone.