CONCEPT Cited by 1 source

Delta chain replay

Definition

Delta chain replay is the read-path mechanism a compute-storage-separated database uses to materialise a page on demand: find the most recent full page image of that page in storage, then apply every accumulated WAL delta record that affects that page on top of it, in order.

On the Neon-lineage storage tier (safekeeper + pageserver) that Lakebase inherits:

When Postgres compute requests a page from storage, the pageserver (a component of the Lakebase distributed storage system) reconstructs it by finding the most recent materialized image of that page and replaying any WAL deltas on top.

(Source: sources/2026-05-07-databricks-how-lakebase-architecture-delivers-5x-faster-postgres-writes)
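The reconstruction step can be sketched as follows. This is a minimal illustration, assuming a hypothetical in-memory history of `Image` and `Delta` records for a single page, sorted by LSN. The record shapes and the names `apply_delta` and `reconstruct` are invented for this example and are not the pageserver's API; a real pageserver applies WAL records through Postgres redo logic rather than byte splicing.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Image:
    """Full page image valid as of `lsn` (hypothetical record shape)."""
    lsn: int
    data: bytes


@dataclass(frozen=True)
class Delta:
    """Small change to the page at `lsn` (hypothetical record shape)."""
    lsn: int
    offset: int
    payload: bytes


Record = Union[Image, Delta]


def apply_delta(page: bytes, delta: Delta) -> bytes:
    # Overwrite len(payload) bytes starting at delta.offset.
    end = delta.offset + len(delta.payload)
    return page[:delta.offset] + delta.payload + page[end:]


def reconstruct(history: List[Record], request_lsn: int) -> Tuple[bytes, int]:
    """Return (page bytes, number of deltas replayed) as of request_lsn.

    `history` must be sorted by LSN in ascending order.
    """
    visible = [r for r in history if r.lsn <= request_lsn]
    image_positions = [i for i, r in enumerate(visible) if isinstance(r, Image)]
    if not image_positions:
        raise LookupError("no page image at or before the requested LSN")
    base = image_positions[-1]          # most recent full image
    page = visible[base].data
    chain = visible[base + 1:]          # everything after it is a delta
    for delta in chain:                 # replay in LSN order
        page = apply_delta(page, delta)
    return page, len(chain)


history = [
    Image(10, b"aaaa"),
    Delta(11, 0, b"b"),
    Image(20, b"cccc"),
    Delta(21, 1, b"d"),
    Delta(22, 3, b"e"),
]
page, replayed = reconstruct(history, request_lsn=22)
# page == b"cdce", replayed == 2: only the deltas after the image at LSN 20 are applied.
```

The second return value is the chain length for that read, which is the quantity the rest of this page is concerned with.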

The bounded-vs-unbounded property

The load-bearing property of delta-chain replay is whether the chain length is bounded or unbounded:

Bounded chain. Periodic full page images appear in the stream, resetting the chain. Read cost is O(reset cadence) per page — roughly constant, predictable, scalable.
Unbounded chain. No periodic images → the chain grows with write volume over the lifetime of the page. Read cost is O(writes to the page since creation) per page: it has no upper limit, it is worst on hot, frequently updated pages, and it destroys read-latency SLOs.

From the 2026-05-07 Databricks post:

Without those periodic full page images in the log, the storage layer would have to replay an infinitely long chain of small deltas to reconstruct a page for a read request. What was once a bounded O(checkpoint frequency) replay becomes an unbounded chain, leading to a spike in read latency and resource consumption.
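To make the difference concrete, here is a toy model of the chain length a reader would face after each write to one page. It assumes a simplification not taken from the source: an image replaces every `image_every`-th write, whereas real FPW emits an image on the first modification after a checkpoint. The function name and parameters are hypothetical.

```python
from typing import List, Optional


def replay_lengths(num_writes: int, image_every: Optional[int]) -> List[int]:
    """Chain length seen by a read issued right after each write.

    image_every=None models a stream with no periodic images at all
    (only the image that existed when the page was created).
    """
    chain = 0
    lengths = []
    for i in range(1, num_writes + 1):
        if image_every is not None and i % image_every == 0:
            chain = 0        # a full image resets the chain
        else:
            chain += 1       # one more delta to replay on read
        lengths.append(chain)
    return lengths


bounded = replay_lengths(1000, image_every=100)
unbounded = replay_lengths(1000, image_every=None)
# max(bounded) == 99, max(unbounded) == 1000
```

In the bounded case the worst-case read cost depends only on the reset spacing; in the unbounded case it equals the total number of writes the page has ever received.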

The incidental reset-point role of Full Page Writes

Classical Postgres Full Page Writes (FPW) exist for torn-page recovery on the write path. In compute-storage-separated architectures they also play an incidental but load-bearing role on the read path: the periodic full page images they emit to WAL double as reset points in the delta chain the pageserver uses for read-time page reconstruction.

This is why classical Postgres's read path on a separated-storage architecture (Neon before Lakebase's 2026-05-07 change) worked correctly even though nobody designed FPW for read-path use: the same periodic images that enabled torn-page recovery also bounded the delta chain for reads. The link was implicit, not documented in the Postgres manual, and only became visible when the team tried to disable compute-side FPW and saw read latency regress.

The image-generation-pushdown remedy

Once FPW's incidental read-path role is understood, the right fix is to preserve the reset-point function while eliminating the write-path cost:

Keep periodic images in the storage tier.
Generate them on the storage side based on actual page-change rate, not on the unrelated compute-side checkpoint cadence.
Disable compute-side FPW so compute sends only compact deltas over the network.

The patterns/image-generation-pushdown-to-storage pattern canonicalises this architectural move. Per-page threshold (more delta records than N without an intervening image → generate one) replaces checkpoint-scoped FPW as the reset-point mechanism.
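The per-page threshold rule can be sketched as a small bookkeeping class. This is an assumption-laden illustration: `ImageScheduler`, its method names, and the synchronous reset are hypothetical, and the source describes image generation as background work shared across pageservers rather than something done inline on each delta.

```python
from collections import defaultdict


class ImageScheduler:
    """Tracks deltas per page and decides when a new image is due (hypothetical)."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.deltas_since_image = defaultdict(int)
        self.images_generated = defaultdict(int)

    def on_delta(self, page_id: str) -> bool:
        """Record one delta; return True if this delta triggers image generation."""
        self.deltas_since_image[page_id] += 1
        # "More delta records than N without an intervening image": strictly greater.
        if self.deltas_since_image[page_id] > self.threshold:
            self.deltas_since_image[page_id] = 0   # the new image absorbs the chain
            self.images_generated[page_id] += 1
            return True
        return False

    def replay_cost(self, page_id: str) -> int:
        """Deltas a read of this page would have to replay right now."""
        return self.deltas_since_image.get(page_id, 0)


sched = ImageScheduler(threshold=3)
hot = [sched.on_delta("hot") for _ in range(8)]
# hot == [False, False, False, True, False, False, False, True]
sched.on_delta("cold")
# sched.replay_cost("cold") == 1; the cold page has triggered no image
```

Because the counter is kept per page, a page that is rarely written never pays for an image, while a heavily written page gets one every `threshold + 1` deltas regardless of when checkpoints happen.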

Measured outcome on Lakebase: a 94% reduction in compute WAL volume, a 30–50% drop in p99 read latency, and a ~30% drop in p50 read latency. Both the write path and the read path improve because the new image-generation cadence is better targeted than the checkpoint-coupled FPW cadence it replaces. (Source: sources/2026-05-07-databricks-how-lakebase-architecture-delivers-5x-faster-postgres-writes)

Operational knobs

Image-generation threshold. "More delta records than a configured threshold without an intervening image" — the threshold value is the main tuning parameter. Databricks does not disclose the specific value nor whether it is fleet-global or per-workload-tunable.
Pageserver parallelism. "Image generation for a project branch is now shared across multiple pageservers in the background" — image generation is a horizontally-scalable storage-tier workload, not a single-threaded compute-side bottleneck.
Cache hit rate. Client-visible read latency depends on whether the read hits the compute-local cache (where no delta replay is needed) or has to go to the pageserver (where delta replay runs). Delta-chain-length matters only on the pageserver path.
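How these knobs interact can be expressed as a back-of-the-envelope model of mean read latency. The linear cost model, the function name, and every number below are assumptions chosen for illustration; none of them come from the source, and the model says nothing about tail latency.

```python
def expected_read_latency_us(
    hit_rate: float,
    cache_us: float,
    pageserver_base_us: float,
    per_delta_us: float,
    chain_len: int,
) -> float:
    """Mean read latency under a simple two-path model (all inputs hypothetical).

    Cache hits cost cache_us. Misses go to the pageserver and cost a fixed
    base plus a per-delta replay cost proportional to the chain length.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    miss_us = pageserver_base_us + per_delta_us * chain_len
    return hit_rate * cache_us + (1.0 - hit_rate) * miss_us


# With a 100% hit rate the chain length has no effect on the result.
all_hits = expected_read_latency_us(1.0, 5.0, 500.0, 10.0, chain_len=1000)
# With misses, a shorter chain lowers the mean.
long_chain = expected_read_latency_us(0.9, 5.0, 500.0, 10.0, chain_len=1000)
short_chain = expected_read_latency_us(0.9, 5.0, 500.0, 10.0, chain_len=100)
```

The model only restates the point above: chain length contributes to latency in proportion to the miss rate, so its effect is concentrated in reads that reach the pageserver.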

Structural properties

Always bounded or always unbounded. There is no middle ground — either the chain is reset periodically (bounded) or it isn't (unbounded).
Bounded cost is O(reset interval × page-change rate). That product is the number of deltas that accumulate between two resets, so at a fixed interval hot pages (high change rate) build longer chains and need more frequent resets; cold pages need fewer.
Decoupling from checkpoint cadence is better than coupling to it. Checkpoint-scoped reset (classical Postgres) applies the same cadence to all pages regardless of how often they actually change. Per-page-threshold reset (image-generation pushdown) ties resets to each page's own change rate, so frequently modified pages get more resets and cold ones get fewer, which is better-targeted work.
Reset-point placement affects both read latency and storage cost. More resets → shorter chains → faster reads but more pageserver work + more object-storage writes. Fewer resets → longer chains → slower reads but less storage-tier work. The threshold is a tuning parameter on this tradeoff.
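The difference between the two reset policies can be shown with a toy simulation of one hot and one cold page. Everything here is an assumption made for illustration: the workload shapes, the checkpoint interval, the threshold of 100, and both function names. The checkpoint model follows the classical rule that the first write to a page after a checkpoint carries a full image; the threshold model assumes an initial image already exists.

```python
from typing import List, Tuple


def checkpoint_scoped(writes_per_tick: List[int], checkpoint_every: int) -> Tuple[int, int]:
    """Return (max chain length, images written) under checkpoint-coupled resets."""
    chain = images = max_chain = 0
    image_due = True
    for tick, writes in enumerate(writes_per_tick):
        if tick % checkpoint_every == 0:
            image_due = True             # a checkpoint re-arms the full-page write
        for _ in range(writes):
            if image_due:
                chain, image_due = 0, False
                images += 1
            else:
                chain += 1
                max_chain = max(max_chain, chain)
    return max_chain, images


def per_page_threshold(writes_per_tick: List[int], threshold: int) -> Tuple[int, int]:
    """Return (max chain length, images written) under a per-page delta threshold."""
    chain = images = max_chain = 0
    for writes in writes_per_tick:
        for _ in range(writes):
            chain += 1
            if chain > threshold:        # more deltas than the threshold: new image
                chain = 0
                images += 1
            else:
                max_chain = max(max_chain, chain)
    return max_chain, images


hot = [50] * 20                                         # 1000 writes over 20 ticks
cold = [1 if t % 10 == 0 else 0 for t in range(20)]     # 2 writes over 20 ticks

# checkpoint_scoped(hot, 10)   == (499, 2)
# checkpoint_scoped(cold, 10)  == (0, 2)
# per_page_threshold(hot, 100) == (100, 9)
# per_page_threshold(cold, 100) == (2, 0)
```

In this toy run the checkpoint-coupled policy lets the hot page's chain reach 499 deltas while spending a full image on each of the cold page's two writes. The threshold policy caps the hot page at 100 deltas at the cost of 9 images and writes no image for the cold page, which is the read-latency versus storage-work tradeoff described above.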

Analogues in other systems

Log-structured merge trees (LSM). RocksDB / LevelDB / Cassandra serve reads by merging a mutable memtable with immutable SST files. Compaction is the LSM analogue of image generation: it merges smaller SSTs into bigger ones to bound read-path work. Image-generation pushdown is conceptually similar but operates per page rather than per key range.
Event sourcing + snapshots. An event-sourced system replays events since the last snapshot to reconstruct current state. Snapshots play the reset-point role; more frequent snapshots = bounded replay = faster rebuilds at the cost of more snapshot-storage overhead.
Aurora storage-forwarded redo-log replication. Aurora's storage tier applies redo-log records to produce current page images; the 6-copy storage quorum handles durability. Lakebase's image-generation pushdown is the direct Postgres analogue with the addition of per-page image-threshold decisions on the storage side.

Seen in

sources/2026-05-07-databricks-how-lakebase-architecture-delivers-5x-faster-postgres-writes — canonical first-class wiki framing of delta-chain replay, with the bounded-vs-unbounded property articulated and the incidental reset-point role of FPW named explicitly. Image-generation pushdown is canonicalised as the architectural mechanism for preserving bounded-chain replay after compute-side FPW is disabled. Production measurement: per-read delta-replay count "dropped significantly", p99 latency down 30–50%, p50 down ~30%.

Related

concepts/postgres-full-page-write — the classical reset-point mechanism in Postgres; has a read-path side-effect role this concept names explicitly.
concepts/postgres-checkpoint — the interval that governed FPW cadence in classical Postgres; image-generation pushdown decouples delta-chain-reset cadence from this.
concepts/compute-storage-separation — the architectural context where delta-chain replay becomes a central read-path mechanism.
concepts/wal-record-granularity — WAL structure that enables individual delta-record granularity in chain replay.
systems/pageserver-safekeeper — the storage-tier component that executes delta-chain replay to materialise pages for compute reads.
systems/lakebase — canonical instance of bounded delta-chain replay via image-generation pushdown.
systems/postgresql — upstream substrate whose WAL stream becomes the input to delta-chain replay on separated storage.
patterns/image-generation-pushdown-to-storage — the architectural pattern that keeps delta chains bounded after FPW is disabled on compute.