PATTERN Cited by 1 source
Live WAL protocol switch via XLOG_FPW_CHANGE¶
Intent¶
Roll out a breaking change to the WAL protocol contract
between compute and storage on a live fleet without
customer restarts by piggybacking on an existing Postgres
control record (XLOG_FPW_CHANGE) that both sides already
understand.
The control record is a pre-existing Postgres concept: Postgres writes it to the WAL when the full_page_writes (FPW) setting changes on a running cluster, which can happen without a restart. The compute and storage tiers can use its appearance in the WAL stream as an in-log feature flag: once storage sees the record for a given compute, it knows to handle that compute's subsequent WAL stream under the new contract.
Canonical instance: Lakebase / Neon, late March → 2026-05-07¶
From the 2026-05-07 Databricks post, verbatim:
The change was applied to running computes via our control plane and storage system, which coordinated the transition automatically. This was achieved using the existing Postgres XLOG_FPW_CHANGE WAL record mechanism, meaning no restarts or interruptions were required for our customers.
(Source: sources/2026-05-07-databricks-how-lakebase-architecture-delivers-5x-faster-postgres-writes)
Rollout arc:
- Late March 2026: first customers switched to the new protocol (compute disables Full Page Write, storage-side image generation takes over).
- ~6-week rollout window: control plane + storage system coordinate per-compute switches across the global fleet.
- 2026-05-07: "active for all Lakebase Serverless and Neon databases globally".
- Zero customer restarts: the XLOG_FPW_CHANGE WAL record signals the change atomically within the running compute's own log stream.
Why a pre-existing control record is the right vehicle¶
Classical Postgres already uses XLOG_FPW_CHANGE to mark
changes in the full_page_writes setting within a live
cluster — so both the compute-side emitter code and the
storage-side consumer (pageserver) already know how to parse
and interpret it.
This means Lakebase could roll out the new protocol without:
- Adding a new WAL record type (which would require backward-compat parsing on storage and a flag day where storage had to be upgraded before any compute could emit the new record).
- Adding an out-of-band coordination channel (which would require a distributed flag state that compute and storage both observe consistently).
- Restarting any compute (which breaks the "serverless, can scale to zero" product contract).
- Coordinating a fleet-wide flag day (which carries operational risk on a multi-thousand-cluster global fleet).
Instead: the existing control record carries the flag in-line with the data it controls, making the switch atomic and visible to exactly the party that needs to know (the storage tier handling that compute's WAL).
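The flag-day argument above can be illustrated with a toy reader. This is a minimal sketch, not Lakebase code: the record kinds are plain strings, `first_unreadable` is a hypothetical helper, and `PROTOCOL_V2_BEGIN` is an invented record type standing in for "a new WAL record type".

```python
from typing import Iterable, Set


def first_unreadable(stream: Iterable[str], known: Set[str]) -> int:
    """Index of the first record kind the reader cannot parse, or -1 if all parse."""
    for i, kind in enumerate(stream):
        if kind not in known:
            return i
    return -1


# Record kinds an already-deployed storage reader understands (toy model).
OLD_READER = {"DATA", "FPW_CHANGE"}

# Reusing the existing control record: the old reader parses the whole stream.
reuse_existing = ["DATA", "FPW_CHANGE", "DATA"]

# Introducing a new record type (hypothetical name): the old reader stops at it,
# so storage would have to be upgraded before any compute may emit it.
new_record_type = ["DATA", "PROTOCOL_V2_BEGIN", "DATA"]
```

In this model the first stream is readable end to end by the old reader, while the second fails at index 1, which is the upgrade-ordering constraint the pattern sidesteps.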
Structure¶
Per-compute switch sequence:
1. Control plane decides compute C is ready for the switch
(criteria: healthy state, recent backup, sane workload
profile, fleet-wide rollout quota not exceeded).
2. Control plane sends an internal signal to compute C
changing its full_page_writes configuration.
3. Compute C emits an XLOG_FPW_CHANGE record into its WAL
stream at the current LSN.
4. The record flows through safekeeper to pageserver.
5. Pageserver sees the XLOG_FPW_CHANGE record and, from
that LSN onward, handles compute C's WAL stream under
the new protocol:
   - no more FPW records (so no reset-point-from-compute)
   - start generating images locally per the image-generation-pushdown threshold
6. Reverse direction is identical: control plane can send
another signal, compute emits another XLOG_FPW_CHANGE
going back to FPW-on, pageserver stops generating
images from that LSN.
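Steps 3 to 6 can be sketched from the storage side as a small per-compute state machine. This is a simplified model under stated assumptions: `WalRecord`, `ComputeStream`, and `Protocol` are hypothetical names, records are reduced to an LSN plus a kind, and the only payload modelled is the boolean FPW setting carried by the flag record. It is not the pageserver's actual implementation.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Tuple


class Protocol(Enum):
    COMPUTE_FPW = "compute_fpw"        # old contract: compute writes full-page images
    STORAGE_IMAGES = "storage_images"  # new contract: storage generates images


@dataclass(frozen=True)
class WalRecord:
    lsn: int
    kind: str            # "FPW_CHANGE" or "DATA" in this toy model
    fpw_on: bool = True  # payload, meaningful only for FPW_CHANGE


@dataclass
class ComputeStream:
    """Storage-side view of one compute's WAL stream (hypothetical)."""
    protocol: Protocol = Protocol.COMPUTE_FPW
    switch_lsn: Optional[int] = None
    handled: List[Tuple[int, Protocol]] = field(default_factory=list)

    def ingest(self, rec: WalRecord) -> None:
        if rec.kind == "FPW_CHANGE":
            # The flag record itself flips the mode for everything after it.
            self.protocol = (
                Protocol.COMPUTE_FPW if rec.fpw_on else Protocol.STORAGE_IMAGES
            )
            self.switch_lsn = rec.lsn
            return
        # Data records are handled under whichever mode is current.
        self.handled.append((rec.lsn, self.protocol))
```

Because each compute gets its own `ComputeStream`, a switch in one stream leaves every other compute's handling untouched, and the reverse switch in step 6 is the same code path with `fpw_on=True`.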
When it fits¶
- Breaking protocol changes between two tiers that both read the same log (write-ahead log / redo log / event log).
- A pre-existing control record that both tiers already parse, even if it was designed for a different (but related) purpose.
- Per-peer granularity — the record appears in one compute's log and affects only that compute's storage interaction; other computes are unaffected until they independently switch.
- Idempotent or forward-only switches — applying the record once vs twice should have the same effect; the switch shouldn't corrupt state if the record is replayed during recovery.
- Live fleets where downtime is expensive — the architectural effort of a live-switch mechanism pays off at fleet scale.
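The idempotency requirement in the list above can be made concrete with a guard keyed on LSN. This is one possible approach, sketched under the assumption that the consumer remembers the LSN of the newest flag record it has applied; `ModeState` and `apply_flag` are hypothetical names, and the source does not say how Lakebase handles replay.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeState:
    fpw_on: bool = True
    applied_lsn: int = 0  # LSN of the newest flag record applied so far


def apply_flag(state: ModeState, lsn: int, fpw_on: bool) -> ModeState:
    """Apply a flag record; replays at or below applied_lsn are no-ops."""
    if lsn <= state.applied_lsn:
        return state
    return ModeState(fpw_on=fpw_on, applied_lsn=lsn)
```

With this guard, replaying the same record during recovery, or re-delivering an older flag record after a newer one, leaves the state unchanged.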
When it doesn't fit¶
- No pre-existing control record is available — you'd have to introduce one, which is essentially the flag-day scenario this pattern avoids.
- Changes that break the log format: new records or new delta-field semantics would prevent the old reader from parsing the log at all. XLOG_FPW_CHANGE works because the field semantics of records after it are unchanged; only the presence or absence of FPW records changes.
- Cross-tier changes beyond the log: if the storage tier's behaviour change affects APIs visible to other systems (not just the compute↔storage WAL protocol), an in-log flag doesn't reach those systems.
- Changes that require an atomic fleet-wide switch: a per-compute sequential rollout won't satisfy an "all computes switch at the same moment" requirement.
- Regulatory / forensic constraints — if WAL must contain a self-describing account of state at every point, adding a protocol-flag record changes what the log means.
Failure modes¶
- Rollout stalls with fleet in a split state. Some computes on new protocol, some on old. Mitigation: the pattern is split-state-tolerant by design (pageserver handles both), but operational complexity increases; limit rollout-pause duration.
- Storage tier forgets the switch state after a restart. Pageserver needs durable memory of each compute's current mode. It can recompute this by scanning for the latest XLOG_FPW_CHANGE record in the stream, but this needs to be part of pageserver recovery logic.
- Customer workload regresses after switch. Not every workload benefits from image-generation pushdown; the control plane needs observability to detect regression and auto-switch back. Databricks does not disclose the observability criteria.
- Ordering ambiguity on concurrent writes during switch. Records in flight when the switch is emitted need to be handled correctly under both old and new protocol. The in-log nature of XLOG_FPW_CHANGE makes this unambiguous (records before the XLOG_FPW_CHANGE LSN are old-protocol; records after it are new-protocol), but the implementation on the pageserver side still has to be correct.
- Control-plane-storage-system divergence. If the control plane thinks compute C is on the new protocol but the storage system is still reading old-protocol records for C, reconciliation is required. The in-log flag is the source of truth; the control plane merely initiates.
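Two of the failure modes above, losing switch state across a restart and classifying in-flight records, both reduce to a lookup against the flag records already in the log. The sketch below assumes records are available as `(lsn, kind, fpw_on)` tuples and that FPW is on before any flag record appears; `collect_switches`, `fpw_on_at`, and `current_mode` are hypothetical helpers, not pageserver functions.

```python
from bisect import bisect_left
from typing import Iterable, List, Tuple

Switch = Tuple[int, bool]  # (lsn of the flag record, fpw_on value it carries)


def collect_switches(records: Iterable[Tuple[int, str, bool]]) -> List[Switch]:
    """Scan (lsn, kind, fpw_on) tuples and return the flag records sorted by LSN."""
    return sorted(
        (lsn, fpw_on) for lsn, kind, fpw_on in records if kind == "FPW_CHANGE"
    )


def fpw_on_at(lsn: int, switches: List[Switch], default: bool = True) -> bool:
    """Mode governing a record at `lsn`: the latest switch strictly before it."""
    lsns = [s for s, _ in switches]
    i = bisect_left(lsns, lsn)  # number of switches with switch LSN < lsn
    return switches[i - 1][1] if i else default


def current_mode(switches: List[Switch], default: bool = True) -> bool:
    """Mode to resume with after a restart: whatever the latest flag record says."""
    return switches[-1][1] if switches else default
```

Since the answer is derived only from the log, the control plane's view never has to be consulted to decide how a given record should be handled.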
Generalisation beyond Postgres¶
The pattern is an instance of a broader idea:
Feature flags in the log. When two distributed systems communicate via a stream of records (not RPC), the in-stream flag record is the cleanest way to switch protocol behaviour atomically and per-peer, because the flag and the data it controls travel on the same ordered channel.
Sibling instances on other substrates:
- Kafka + message-format upgrades: the format version travels in-stream as the magic byte on each record batch, so consumers switch decoding at the offset where the magic byte changes.
- Event-sourcing + schema evolution: aggregate emits a SchemaChangedTo(v2) event; downstream projectors switch from v1 to v2 decoding at that offset.
- MySQL binlog protocol switches: new binlog event types can be introduced gated by a control event that older replicas recognize.
- Raft log membership changes: AddMember / RemoveMember records in the Raft log are how membership transitions are made without requiring external coordination outside the log itself.
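The event-sourcing sibling is the easiest of these to show in a few lines. The sketch below is illustrative only: events are `(kind, payload)` string pairs, the v1 payload is an invented comma-separated format, the v2 payload is JSON, and `project` is a hypothetical projector.

```python
import json
from typing import Callable, Dict, Iterable, List, Tuple


def decode_v1(payload: str) -> dict:
    # Hypothetical v1 wire format: "name,amount"
    name, amount = payload.split(",")
    return {"name": name, "amount": int(amount)}


def decode_v2(payload: str) -> dict:
    # Hypothetical v2 wire format: a JSON object
    return json.loads(payload)


DECODERS: Dict[int, Callable[[str], dict]] = {1: decode_v1, 2: decode_v2}


def project(events: Iterable[Tuple[str, str]]) -> List[dict]:
    """Decode events in order, switching decoder when the flag event appears."""
    version = 1
    out: List[dict] = []
    for kind, payload in events:
        if kind == "SchemaChangedTo":
            version = int(payload)  # the flag travels on the same ordered stream
            continue
        out.append(DECODERS[version](payload))
    return out
```

As with the WAL case, the projector needs no external signal: the position of the flag event in the stream is the switch point.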
Relationship to adjacent patterns¶
- Composes with patterns/image-generation-pushdown-to-storage: this pattern is the rollout mechanism for that architectural change; both are described together in the 2026-05-07 Lakebase post.
- Sibling to patterns/progressive-configuration-rollout — both are gradual live rollouts; that pattern covers config-plane distribution (Cloudflare Snapstone), this pattern covers protocol-layer changes via in-log records.
- Contrast with patterns/global-configuration-push: that is the anti-pattern of a single fleet-wide push; this is a per-peer in-log switch.
- Sibling to concepts/feature-flag — the in-log record is essentially a feature flag, but one that travels on the same ordered channel as the data it controls.
Seen in¶
- sources/2026-05-07-databricks-how-lakebase-architecture-delivers-5x-faster-postgres-writes: canonical first-class wiki pattern page. XLOG_FPW_CHANGE used as the in-log feature-flag vehicle to roll out image-generation pushdown across the global Lakebase + Neon fleet over ~6 weeks (late March 2026 → 2026-05-07) with zero customer restarts. Control plane coordinates with storage system via the in-log record, avoiding both flag-day risk and customer-facing downtime.
Related¶
- patterns/image-generation-pushdown-to-storage — the architectural change this pattern rolls out.
- patterns/progressive-configuration-rollout — sibling progressive-deployment discipline for configs.
- patterns/global-configuration-push — the anti-pattern this pattern is careful to avoid.
- concepts/postgres-full-page-write — the primitive whose mode is being toggled.
- concepts/wal-record-granularity — WAL as structured-records, including control records.
- concepts/feature-flag — the broader concept; this is a specialisation to in-log flags on a write-ahead-log substrate.
- systems/postgresql: upstream substrate that provides the XLOG_FPW_CHANGE control record.
- systems/lakebase: canonical production instance.
- systems/pageserver-safekeeper — the storage-tier components that observe and act on the in-log flag.