SYSTEM Cited by 3 sources
Pageserver + Safekeeper (Lakebase / Neon storage tier)¶
Pageserver and Safekeeper are the two server components of the Neon-lineage Postgres storage tier — the durable, long-lived storage layer that sits underneath ephemeral Postgres compute VMs in a separated-compute-and-storage managed-Postgres architecture. systems/lakebase inherits them from Neon (acquired by Databricks in 2025).
They are the "storage side" half of the concepts/compute-storage-separation split that makes scale-to-zero Postgres compute feasible.
What each does¶
- Pageserver. Owns page-level durable state. Answers "give me the contents of page P of relation R at LSN L" against its own object-storage-backed data + local caches. Compacts and organises WAL into a page-addressable form so that newly-booted compute instances can materialise any page on demand without replaying the whole log. Effectively the durable page store + read-side cache tier.
- Safekeeper. Owns the durable WAL (Write-Ahead Log). Postgres compute writes WAL segments to the Safekeeper; the Safekeeper is responsible for persisting them before acknowledging the commit. Typically run as a replicated group to tolerate node failures at commit time. Segments eventually flow downstream to the Pageserver for compaction into page-addressable form, and to long-term object storage.
Together they externalise both the page heap and the WAL out of the Postgres compute VM, so the compute VM holds only buffer pool, connection state, query plans, and other scratch — and can therefore scale to zero.
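The "page P of relation R at LSN L" request shape can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: `ToyPageStore`, `put_version`, and `get_page` are hypothetical names, not the Pageserver's actual API, and a real store would sit on object storage plus local caches rather than a Python dict.

```python
import bisect


class ToyPageStore:
    """Hypothetical in-memory stand-in for a versioned page store."""

    def __init__(self):
        # (relation, page_no) -> (sorted list of LSNs, parallel list of contents)
        self._versions = {}

    def put_version(self, relation, page_no, lsn, content):
        """Record the contents of a page as of a given LSN."""
        lsns, contents = self._versions.setdefault((relation, page_no), ([], []))
        idx = bisect.bisect_right(lsns, lsn)
        lsns.insert(idx, lsn)
        contents.insert(idx, content)

    def get_page(self, relation, page_no, lsn):
        """Return the newest stored version whose LSN is <= the requested LSN."""
        lsns, contents = self._versions.get((relation, page_no), ([], []))
        idx = bisect.bisect_right(lsns, lsn)
        if idx == 0:
            raise KeyError(f"no version of {relation}/{page_no} at or before LSN {lsn}")
        return contents[idx - 1]


store = ToyPageStore()
store.put_version("accounts", 7, 100, b"v1")
store.put_version("accounts", 7, 250, b"v2")
page_at_180 = store.get_page("accounts", 7, 180)  # the version written at LSN 100
```

Because the read is keyed by LSN, a freshly booted compute instance in this toy model only needs to know which LSN it wants to read at; it holds no page history of its own.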
Ingested-source references¶
The first source in the corpus to name both components is the Lakebase CMK launch:
The storage layer (Pageserver and Safekeeper) maintains long-lived, persistent data in object storage and local caches, while the compute layer runs independent Postgres instances that scale up, down, or to zero based on demand.
(Source: sources/2026-04-20-databricks-take-control-customer-managed-keys-for-lakebase-postgres)
That post also positions both components as being under the same concepts/envelope-encryption hierarchy — all data segments managed by Lakebase, including WAL segments stored by Safekeeper and data files stored by Pageserver, are encrypted with DEKs wrapped by KEKs under the customer's CMK.
Why the split matters¶
- Compute scale-to-zero. Durable state does not live on the compute VM; a VM can be terminated and re-created without data loss. Cold start becomes a matter of re-attaching to Pageserver / Safekeeper, not rebuilding state from scratch. See concepts/stateless-compute.
- Independent scaling axes. Storage (capacity + durability) and compute (CPU + memory) scale on separate curves, like Snowflake and Aurora DSQL at different layers. See concepts/compute-storage-separation.
- Replication / HA reshaped. Safekeeper handles commit-time durability; read HA is absorbed into Pageserver replicas + object storage. Compute HA is "just start another VM", because there's no compute-local durable state to replicate.
- Encryption boundary is clean. The storage tier encrypts everything under DEKs wrapped by the envelope hierarchy; the compute tier deals separately with scratch plaintext via patterns/per-boot-ephemeral-key.
Image-generation pushdown on the pageserver (2026-05-07)¶
Second canonical Lakebase source disclosing a specific pageserver responsibility beyond name-level framing:
When Postgres compute requests a page from storage, the pageserver (a component of the Lakebase distributed storage system) reconstructs it by finding the most recent materialized image of that page and replaying any WAL deltas on top. … The pageserver now generates full page images when a page has accumulated more delta records than a configured threshold without an intervening image.
Two distinct pageserver roles canonicalised:
- Page-reconstruction on read — materialise a page by finding the most recent image and replaying accumulated WAL deltas. This is the delta-chain replay mechanism and it runs on the request path.
- Image-generation on background threshold — periodically generate a new full page image when the delta chain exceeds a configured threshold. This runs out-of-band from the read path as background storage-tier work, horizontally shared across multiple pageservers per project branch. This role is the load-bearing change in the 2026-05-07 post; before this change, image-generation was implicitly outsourced to the compute's FPW stream, which generated images on an unrelated (checkpoint-scoped) cadence.
See patterns/image-generation-pushdown-to-storage for the generalised architectural pattern.
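The two roles can be sketched together in a toy model. This is a minimal illustration under stated assumptions, not the pageserver's implementation: pages are modelled as `offset -> value` dicts, deltas as `(lsn, offset, value)` tuples, and `PageHistory`, `IMAGE_THRESHOLD`, `reconstruct`, and `maybe_generate_image` are hypothetical names. The toy also keeps only the latest image, so reads at LSNs older than that image are rejected rather than served.

```python
from dataclasses import dataclass, field

IMAGE_THRESHOLD = 3  # hypothetical value; the source only says "a configured threshold"


@dataclass
class PageHistory:
    image: dict                                  # most recent materialised image
    image_lsn: int                               # LSN at which that image is valid
    deltas: list = field(default_factory=list)   # (lsn, offset, value), in LSN order


def reconstruct(history, lsn):
    """Read path: start from the latest image and replay deltas up to `lsn`."""
    if lsn < history.image_lsn:
        raise ValueError("toy model keeps only the latest image")
    page = dict(history.image)
    for delta_lsn, offset, value in history.deltas:
        if delta_lsn > lsn:
            break
        page[offset] = value
    return page


def ingest_delta(history, delta):
    """Append one WAL delta to the page's chain."""
    history.deltas.append(delta)


def maybe_generate_image(history):
    """Background path: collapse the chain once it exceeds the threshold."""
    if len(history.deltas) <= IMAGE_THRESHOLD:
        return False
    last_lsn = history.deltas[-1][0]
    history.image = reconstruct(history, last_lsn)
    history.image_lsn = last_lsn
    history.deltas.clear()
    return True
```

The read path's cost grows with the length of the delta chain, which is why resetting the chain with a fresh image in the background shortens subsequent reads in this model.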
Safekeeper durability: Paxos-based quorum (2026-05-07)¶
[Compute] streams WAL to a Paxos-based quorum of safekeepers.
First canonical wiki disclosure of the safekeeper's underlying replication primitive: Paxos-based quorum write-ack for incoming WAL. This is what makes the safekeeper group tolerant to individual node failures at commit time without the compute side having to re-stream WAL. It is also the durability substrate that allows compute to be stateless (no local data directory needed) and makes torn pages a failure mode that structurally does not exist for compute.
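The quorum write-ack idea can be illustrated with a deliberately simplified sketch. It is not Paxos: it shows only the majority-acknowledgement rule and omits proposers, terms, leader election, and recovery. `ToySafekeeper`, `quorum_size`, and `commit` are hypothetical names, and the three-node group size is an assumption, since the source does not state a replication factor.

```python
def quorum_size(n):
    """Smallest majority of an n-node group."""
    return n // 2 + 1


class ToySafekeeper:
    """Hypothetical stand-in for one safekeeper node."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.log = []  # persisted (lsn, record) pairs

    def append(self, lsn, record):
        """Persist a WAL record; return True only if it was stored."""
        if not self.healthy:
            return False
        self.log.append((lsn, record))
        return True


def commit(safekeepers, lsn, record):
    """Acknowledge the commit only if a majority of nodes persisted the record."""
    acks = sum(1 for sk in safekeepers if sk.append(lsn, record))
    return acks >= quorum_size(len(safekeepers))


group = [ToySafekeeper("sk1"), ToySafekeeper("sk2"), ToySafekeeper("sk3", healthy=False)]
committed = commit(group, lsn=100, record=b"insert ...")  # 2 of 3 acks meets the majority
```

In this model one failed node out of three does not block the commit, which is the node-failure tolerance property described above.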
Rolled-out-live via Postgres XLOG_FPW_CHANGE record¶
The pageserver's protocol contract with compute changed (pre-2026-05: receive FPW records from compute; post-2026-05: receive only compact deltas + generate images locally). Rollout across the global fleet used the pre-existing Postgres XLOG_FPW_CHANGE control record as an in-log feature flag (see patterns/live-wal-protocol-switch-via-xlog-fpw-change). ~6-week rollout window (late March → 2026-05-07), zero customer restarts.
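Why an in-log control record works as a feature flag can be sketched as follows. This is a toy model under stated assumptions: the record kinds `"FPW_CHANGE"` and `"DELTA"`, the contract labels, and `tag_contract` are hypothetical, and the real XLOG_FPW_CHANGE record and the pageserver's handling of it are not reproduced here.

```python
def tag_contract(records, fpw_initial=True):
    """Label each data record with the protocol contract in force at its log position.

    `records` is an ordered list of (kind, payload) tuples. An "FPW_CHANGE"
    record carries the new full-page-writes setting and produces no output.
    """
    fpw_enabled = fpw_initial
    tagged = []
    for kind, payload in records:
        if kind == "FPW_CHANGE":
            # The switch takes effect exactly here in log order.
            fpw_enabled = bool(payload)
            continue
        contract = "images-from-compute" if fpw_enabled else "images-from-storage"
        tagged.append((payload, contract))
    return tagged


wal = [
    ("DELTA", "a"),
    ("FPW_CHANGE", False),
    ("DELTA", "b"),
]
tagged = tag_contract(wal)
```

Because the flag travels in the same ordered stream as the data it governs, every consumer of the log sees the switch at the same position, so in this model no out-of-band coordination or restart is needed.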
Caveats¶
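- Ingested-source coverage is still thin: the sources above establish that these components exist inside Lakebase, that they sit under the CMK envelope, and (from the 2026-05-07 and 2026-05-27 posts) how page reconstruction, quorum WAL durability, and zone-redundancy work at a high level, but we don't have a Databricks or Neon engineering deep-dive ingested yet. Commit latency, failover, compaction cadence, and replication-factor configurability are not described here.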
- Lineage note: the naming "Pageserver / Safekeeper" comes from Neon's open-source architecture and is adopted by Lakebase; the Neon acquisition itself was previously logged as a skipped PR announcement (2026-04-21 log entry).
Seen in¶
- sources/2026-05-07-databricks-how-lakebase-architecture-delivers-5x-faster-postgres-writes — Second canonical Lakebase source; first mechanism-level disclosure of the pageserver's internals. Canonicalises two distinct pageserver responsibilities: (a) page-reconstruction on read via delta-chain replay over the most recent materialised image, (b) image-generation on background threshold — the pageserver generates a new full page image when a page has accumulated more delta records than a configured threshold without an intervening image. This background role replaces the compute's FPW as the reset-point mechanism for delta chains after compute-side FPW is disabled. Work is horizontally shared across multiple pageservers per project branch. First wiki disclosure of the Paxos-based quorum durability primitive on the safekeeper side, and of the Postgres XLOG_FPW_CHANGE WAL record as the in-log feature flag used to switch the compute-storage protocol contract atomically per-compute across the global fleet (~6-week rollout, zero customer restarts). Measured outcomes on HammerDB TPROC-C: 94% compute WAL volume reduction; 5× write throughput at 32 vCPU; p99 read latency −30% to −50% via better-targeted image-generation cadence. See patterns/image-generation-pushdown-to-storage for the generalised architectural pattern and patterns/live-wal-protocol-switch-via-xlog-fpw-change for the rollout mechanism.
- sources/2026-04-20-databricks-take-control-customer-managed-keys-for-lakebase-postgres — Pageserver and Safekeeper named as Lakebase's durable-storage layer; both are brought under the CMK envelope-encryption hierarchy.
- sources/2026-05-27-databricks-how-the-lakebase-architecture-stays-resilient-to-cloud-failures — Third canonical Lakebase source; first wiki disclosure that the Pageserver+Safekeeper substrate is structurally zone-redundant for every Lakebase / Neon database, regardless of tier. Verbatim: "Monolithic Postgres setups are usually backed by local block devices that are rarely zone-redundant. This necessitates physical replication and costly hot standby replicas across multiple availability zones. In Lakebase and Neon, all databases, regardless of tier and configuration, are backed by distributed, zone-redundant, highly available storage. Data is stored in highly durable, zone-redundant object storage, and performance is accelerated by NVMe SSD caches across multiple availability zones at no additional cost to you." Two new storage-tier disclosures: (a) the NVMe SSD cache layer is itself multi-AZ — performance acceleration doesn't sacrifice zone-redundancy on the read side; (b) the default-on-for-all-tiers property — single-compute Postgres customers get the same zone-redundant durability as HA-tier multi-compute customers, with HA-tier adding compute redundancy on top. The architectural payoff: "a single-compute Postgres instance in Lakebase has significantly improved availability compared to a single stateful Postgres instance, without the cost of an additional hot standby compute instance" — eliminates the hot-standby-tax. Crash-recovery elimination disclosure: the stateless-Postgres-on-zone-redundant-storage shape replaces both hot-standby replication and WAL-replay-from-checkpoint crash recovery ("can take 10s of minutes, depending on configuration"). See concepts/zone-redundant-storage for the dedicated concept and concepts/stateless-compute for the partner property.
Related¶
- systems/lakebase — the managed-Postgres service that consumes these components.
- systems/postgresql — the upstream DB engine they externalise storage for.
- systems/aurora-dsql — a different structural answer to the same "separate Postgres state from compute" problem (Aurora DSQL swaps individual Postgres subsystems via extensions; Neon-lineage Pageserver+Safekeeper pulls the whole page/WAL tier out).
- concepts/compute-storage-separation — the architectural principle this realises for Postgres.
- concepts/envelope-encryption — the encryption model covering both components.