PATTERN Cited by 1 source
Operator-scheduled cutover¶
Problem¶
A staged-then-sealed migration (concepts/staged-then-sealed-migration) completes its copy phase whenever the shadow tables finish backfilling — which is a system-driven moment that depends on table size, workload, and shard count. For a small change that completes in 20 minutes during business hours, this is fine. For a long-running change that may complete at 2am on a weekend, the "completion moment" is terrible for the operator:
- The 30-minute revert window begins at completion. If an issue surfaces at 2:15am and the operator isn't online, the window closes unattended.
- Customer questions canonicalised in the 2022 Gated Deployments launch post:
"'why 30 minutes?', 'What happens if the deployment completes at 2:00am over the weekend, and I can't access my laptop in time?', 'Can we have better control over the timings?'"
- The operator has invested hours of wall-clock time in a deployment that now demands their presence at an arbitrary moment.
Extending the 30-minute revert window would be easy to ask for, but costly: [[patterns/instant-schema-revert-via-inverse-replication|inverse replication streams]] run once per staged migration for the full length of the window, so longer windows mean larger resource bills and a longer queue of post-cutover work. The more valuable move is to let the operator decide when the 30-minute clock starts.
Solution¶
Introduce a UI-level Auto-apply toggle on the deploy-request ceremony. When the toggle is on (default), the deploy controller seals the migration as soon as every change reports ready-to-seal. When the toggle is off, the deploy controller waits for the operator to click an Apply changes button. Until that click, the controller holds every migration in the staged (not yet sealed) phase, tailing the binlog continuously, for hours or days if necessary, with no natural time-out.
Canonical verbatim framing:
"By default, deployments auto-complete when ready, and this is great for most cases, and clears up the deployment queue. However, if the user so chooses, they may uncheck the 'Auto-apply' box. The deployment now stages all changes and runs all long-running tasks. When all changes are ready, the deployment awaits the user to hit the 'Apply changes' button. With no input from the user, the deployment will just keep on running in the background, always keeping up to date with data changes."
The staged-then-sealed mechanism is already the continuous binlog-tail shape; "operator clicks a button" is simply the trigger event for the seal step. The mechanism doesn't change — only the causality of when seal happens moves from system-time to operator-time.
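The gating rule described above can be sketched in a few lines. This is a minimal illustration, not PlanetScale's actual controller: the `Deployment` class, its field names, and `should_seal` are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Deployment:
    """Hypothetical model of a gated deployment (names are assumptions)."""
    auto_apply: bool = True        # the Auto-apply checkbox, on by default
    apply_requested: bool = False  # set when the operator clicks "Apply changes"
    ready: dict = field(default_factory=dict)  # change name -> ready-to-seal flag

    def all_ready(self) -> bool:
        # Every change in the deploy unit must report ready-to-seal.
        return bool(self.ready) and all(self.ready.values())

    def should_seal(self) -> bool:
        # The seal step is the same in both modes; only the trigger differs.
        if not self.all_ready():
            return False
        return self.auto_apply or self.apply_requested
```

With `auto_apply=False`, `should_seal()` stays false after every change is ready and flips to true only once `apply_requested` is set, which mirrors the move from system-time to operator-time.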
Canonical worked example¶
The 2022 Gated Deployments launch post's weekday-return scenario:
"A deployment with three `ALTER` statements over large tables may take a day to run. It may be 2:00am on the weekend when it finally completes the hard work of copying the dataset. But it won't apply the changes: the deployment will just keep on syncing and responding to ongoing changes like any `INSERT`, `DELETE` or `UPDATE` on the relevant tables. Come Monday morning, when the developer is at their desk and fully prepared to begin their work week, they may click the 'Apply changes' button. The deployment then completes, and the 30 minute window for schema reverts starts ticking, all while the developer is in control of the situation."
The specific operational properties being optimised for:
- The application-visible cutover event happens when the operator is available to monitor it.
- The 30-minute revert window's clock starts when the operator clicks, not when the system finishes.
- The operator inherits the staging cost for the Saturday-night-to-Monday-morning interval (binlog tail runs for ~60 hours) in exchange for control over the cutover moment.
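The timing trade in the list above is simple date arithmetic. The sketch below uses illustrative dates (a Saturday 2:00am completion and a Monday 9:00am click); the dates and the `revert_deadline` helper are assumptions, and only the 30-minute window comes from the source.

```python
from datetime import datetime, timedelta

REVERT_WINDOW = timedelta(minutes=30)  # revert window length stated in the source


def revert_deadline(sealed_at: datetime) -> datetime:
    """Return the moment the revert window closes for a given seal time."""
    return sealed_at + REVERT_WINDOW


# Illustrative dates only, chosen to fall on a Saturday and the next Monday.
ready_at = datetime(2022, 9, 3, 2, 0)    # Saturday 02:00, backfill finishes
clicked_at = datetime(2022, 9, 5, 9, 0)  # Monday 09:00, operator clicks

auto_deadline = revert_deadline(ready_at)      # window closes Saturday 02:30
manual_deadline = revert_deadline(clicked_at)  # window closes Monday 09:30
hold = clicked_at - ready_at                   # extra staging time paid for control

print(auto_deadline, manual_deadline, hold)
```

With these particular dates the hold comes to 55 hours, the same order of magnitude as the roughly 60 hours cited above; the exact figure depends on when the backfill finishes and when the operator clicks.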
When to use¶
Use operator-scheduled cutover when:
- The migration is long-running enough that system-driven completion may happen outside the operator's working hours.
- The 30-minute revert window is load-bearing for rollback confidence — the operator wants the window to begin on demand, not on clock.
- The app-visible change needs to align with other business events (a release announcement, a feature launch, a marketing push) that the deploy controller cannot see.
- The migration is a rehearsal: the operator wants to stage the change to gain confidence, then schedule the real cutover for a specific maintenance window.
Do NOT use it when:
- The migration is small enough that system-driven completion will happen during business hours anyway — the auto-apply default is operationally simpler.
- Resources are constrained — every hour of deferred cutover holds shadow-table storage and binlog-tail machinery that could be used for other staged deploys.
- The operator cannot commit to clicking the button within a reasonable window — an indefinitely-held staged migration occupies resources and may block other deploy-requests.
- The change is urgent (incident-driven, security fix) — holding the gate trades cutover-timing control for slower incident response.
Trade-offs¶
- Indefinite staging cost. Holding a migration in staged-then-sealed state for days multiplies storage, binlog-retention, and CPU cost by the hold duration. Turning Auto-apply off does not change what each hour of staging costs, but it lengthens the hold period over which that cost accrues.
- Gate-queue depth grows. If multiple deployments choose auto-apply-off, the deploy controller must manage many concurrent staged migrations, each with its own resource footprint. Ceiling is enforced by schemadiff complexity limits + raw resource bounds.
- Operator cognitive load shifts. Auto-apply-off moves the "when to cut over" decision from the system (implicit, on ready) to the operator (explicit, click). For routine deploys, this adds friction; for important deploys, it adds the control the friction is buying.
- Revert-window timing is now operator-chosen. The 30-minute clock starts on click. An operator who clicks at Monday-morning 9am has until 9:30am before the revert primitive expires — this aligns nicely with the work-day but introduces a policy decision the operator must own.
- Cross-team coordination changes shape. Under auto-apply, cutover is an observable event the team responds to post-hoc. Under operator-scheduled cutover, cutover is a scheduled event the team can plan around: good for release managers, different for incident responders.
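The staging-cost and queue-depth trade-offs can be made concrete with a toy model. Everything here is an assumption for illustration: the `HeldDeployment` class, the flat `cost_per_hour` unit, and the sample numbers are hypothetical and do not reflect any real pricing or resource accounting.

```python
from dataclasses import dataclass


@dataclass
class HeldDeployment:
    name: str
    hold_hours: float     # time spent staged after becoming ready-to-seal
    cost_per_hour: float  # hypothetical unit cost (storage + binlog tail + CPU)


def hold_cost(d: HeldDeployment) -> float:
    # Cost of deferring cutover scales linearly with the hold duration.
    return d.hold_hours * d.cost_per_hour


def total_hold_cost(deployments: list) -> float:
    # Concurrent held deployments each carry their own footprint.
    return sum(hold_cost(d) for d in deployments)


queue = [
    HeldDeployment("auto-applied", hold_hours=0, cost_per_hour=1.0),
    HeldDeployment("weekend-alter", hold_hours=55, cost_per_hour=1.0),
]
print(total_hold_cost(queue))
```

An auto-applied deployment contributes nothing to the hold cost in this model, while each operator-held deployment adds its own term, which is why many concurrent holds push against the resource ceiling.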
Composes with¶
- concepts/staged-then-sealed-migration — the mechanism the pattern exposes a UI gesture over. The operator's click is the seal signal the deploy controller was going to issue anyway; the pattern simply moves the signal's causality.
- concepts/gated-schema-deployment — the deploy-unit-wide primitive that operator-scheduled cutover inherits. Gated deployment already bundles N changes into one seal event; operator-scheduling hands the seal-moment decision to the operator.
- concepts/cancel-before-cutover — before the click, the deployment is fully cancellable. The click is the causal boundary that converts a cancellable deploy into a sealed-plus-revert-windowed deploy.
- [[patterns/instant-schema-revert-via-inverse-replication]]: the 30-minute revert window's clock starts on click. Operator-scheduled cutover aligns the revert window with operator availability.
- concepts/near-atomic-schema-deployment — the cutover itself is still near-atomic (seconds, not hours) regardless of when it's scheduled. The pattern doesn't change the shape of the cutover, only the timing.
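The way these pieces compose can be read as a small state machine: cancellable before the click, sealed and revertable for 30 minutes after it. The sketch below is a hypothetical model (class, state, and method names are assumptions, not a real API), and it treats the seal as instantaneous for simplicity.

```python
from datetime import datetime, timedelta
from enum import Enum, auto

REVERT_WINDOW = timedelta(minutes=30)  # window length stated in the source


class State(Enum):
    STAGED = auto()     # copying or tailing the binlog, not yet cut over
    SEALED = auto()     # cutover done, revert window running or expired
    CANCELLED = auto()
    REVERTED = auto()


class GatedDeployment:
    """Hypothetical lifecycle model; names are illustrative assumptions."""

    def __init__(self) -> None:
        self.state = State.STAGED
        self.sealed_at = None

    def cancel(self) -> None:
        # Cancel is only available before the cutover boundary.
        if self.state is not State.STAGED:
            raise RuntimeError("cancel is only possible before cutover")
        self.state = State.CANCELLED

    def apply_changes(self, now: datetime) -> None:
        # The operator's click is the seal signal; the revert clock starts here.
        if self.state is not State.STAGED:
            raise RuntimeError("only a staged deployment can be sealed")
        self.state = State.SEALED
        self.sealed_at = now

    def revert(self, now: datetime) -> None:
        if self.state is not State.SEALED:
            raise RuntimeError("only a sealed deployment can be reverted")
        if now - self.sealed_at > REVERT_WINDOW:
            raise RuntimeError("revert window has expired")
        self.state = State.REVERTED
```

Calling `cancel()` after `apply_changes()` raises, and `revert()` succeeds only within 30 minutes of the click, which is the causal boundary the list above describes.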
Contrast with auto-apply default¶
| Dimension | Auto-apply on (default) | Auto-apply off (operator-scheduled) |
|---|---|---|
| Cutover trigger | System (on ready-to-seal) | Operator (click) |
| Cutover moment | Determined by backfill time | Determined by operator presence |
| Revert-window clock starts | On cutover completion | On operator click |
| Staging duration | Min (only what's needed) | Operator choice (potentially days) |
| Resource cost | Minimal (clears queue quickly) | Operator-controlled (matches hold duration) |
| Operational complexity | Low (one fewer decision) | Higher (operator owns cutover timing) |
| Typical use case | Routine small deploys | Long-running deploys + aligned business events |
Canonical implementation¶
PlanetScale's Deploy Request UI (2022) is the canonical wiki instance. The UI exposes the Auto-apply checkbox at deploy-request creation time; unchecking defers the cutover to an Apply changes button that appears once every change reports ready.
The 2022 Gated Deployments launch post shows two screenshots (referenced in the raw file but not inlined here): the Auto-apply-checkbox UI element and the completed-ready-to-merge state with the Apply changes button.
No non-PlanetScale implementation of the same gesture is disclosed on the wiki as of 2026-04-23. Atlas CLI, Skeema, and Bytebase support scheduled-cutover features but frame them as job-scheduler integrations rather than first-class staging primitives — they're extensions of single-DDL apply, not of the deploy-unit ceremony.
Seen in¶
- sources/2026-04-21-planetscale-gated-deployments-addressing-the-complexity-of-schema-deployments-at-scale: canonical wiki first disclosure. Shlomi Noach's 2022 Gated Deployments launch post introduces the Auto-apply toggle + Apply-changes-button UX as a first-class product primitive and canonicalises the weekend-at-2am motivation for the feature. The pattern is named at the UX altitude; the underlying mechanism (concepts/staged-then-sealed-migration) is canonicalised in the 2023 successor post.
Related¶
- concepts/gated-schema-deployment
- concepts/staged-then-sealed-migration
- concepts/cancel-before-cutover
- concepts/near-atomic-schema-deployment
- concepts/cutover-freeze-point
- concepts/online-ddl
- systems/vitess
- systems/vitess-vreplication
- systems/planetscale
- patterns/near-atomic-multi-change-deployment
- patterns/shadow-table-online-schema-change
- patterns/instant-schema-revert-via-inverse-replication
- patterns/staged-rollout
- companies/planetscale