
SYSTEM Cited by 1 source

Lakebase

Lakebase is Databricks' serverless, Postgres-compatible OLTP database offering. It descends architecturally from Neon, the separated-storage-compute Postgres that Databricks acquired in 2025, and is positioned as the transactional companion to the analytical side of Databricks' Data Intelligence Platform.

Minimum viable framing for this wiki: Lakebase is a managed Postgres whose persistent state (pages + WAL) lives in systems/pageserver-safekeeper, backed by object storage and local caches. The Postgres compute VMs on top are ephemeral: they scale up, down, or to zero based on demand (concepts/stateless-compute). Each ingested source surfaces a different slice of the design; extend this page as sources land.

Architecture slice — storage/compute separation

From the CMK launch post (only Lakebase post ingested so far):

  • Persistence layer (storage). Long-lived state in object storage and local caches, served by the Pageserver and Safekeeper components. The Pageserver owns page-level durable state; the Safekeeper owns the durable WAL. Both are independent of compute and persist across compute instance lifetimes.
  • Compute layer. Independent Postgres VMs that can scale up, down, or to zero based on demand. These are ephemeral — they hold only scratch state (buffer pool, WAL artifacts in transit, temp files, performance caches).
  • Lakebase Manager. The control-plane component that starts, stops, and terminates compute instances (e.g. on CMK revocation).

This is the same conceptual split as Neon's public architecture, and it is what forces Lakebase's two-tier encryption design: the persistent layer and the ephemeral compute layer hold different kinds of state, so each is encrypted under its own key scheme. (Source: sources/2026-04-20-databricks-take-control-customer-managed-keys-for-lakebase-postgres)
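
To make the storage/compute split concrete, here is a minimal Python sketch of the three roles described above. It is a toy model and not Lakebase's or Neon's implementation: the class names, method names, and the in-memory dict/list standing in for the Pageserver and Safekeeper are all assumptions made for illustration. In particular, the real storage tier reconstructs pages from WAL, which this sketch does not attempt.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PersistentStorage:
    """Hypothetical stand-in for the storage tier; outlives every compute VM."""
    pages: dict = field(default_factory=dict)  # Pageserver role: page-level durable state
    wal: list = field(default_factory=list)    # Safekeeper role: durable WAL


@dataclass
class ComputeInstance:
    """Ephemeral Postgres VM in this toy model: holds only scratch state."""
    storage: PersistentStorage
    buffer_pool: dict = field(default_factory=dict)  # lost when the VM goes away

    def write(self, page_no: int, data: bytes) -> None:
        self.storage.wal.append(data)        # record lands in the durable WAL
        self.storage.pages[page_no] = data   # page state is owned by storage
        self.buffer_pool[page_no] = data     # local cache only

    def read(self, page_no: int) -> bytes:
        # A cold VM has an empty buffer pool and refills it from storage.
        if page_no not in self.buffer_pool:
            self.buffer_pool[page_no] = self.storage.pages[page_no]
        return self.buffer_pool[page_no]


class LakebaseManager:
    """Toy control plane: starts and terminates compute over shared storage."""

    def __init__(self, storage: PersistentStorage) -> None:
        self.storage = storage
        self.compute: Optional[ComputeInstance] = None

    def start(self) -> ComputeInstance:
        self.compute = ComputeInstance(self.storage)
        return self.compute

    def terminate(self) -> None:
        # Scale-to-zero and CMK revocation both end here in this model:
        # the VM and its scratch state are dropped, storage is untouched.
        self.compute = None

    def on_cmk_revoked(self) -> None:
        self.terminate()
```

Writing a page through one instance, terminating it, and starting a second instance shows the point of the split: the new instance begins with an empty buffer pool but can still read the page, because the durable copy never lived on the VM.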

Capabilities surfaced so far

Customer-Managed Keys (CMK) for Lakebase (2026-04-20)

  • Three-level concepts/envelope-encryption hierarchy: CMK in customer's cloud KMS → KEK in Databricks Key Manager Service → DEK per data segment.
  • Supported KMSes: systems/aws-kms, systems/azure-key-vault, systems/google-cloud-kms (identified by ARN, Key Vault URL, Key ID respectively).
  • Persistent layer encryption via the envelope hierarchy over Pageserver + Safekeeper data and WAL segments.
  • Ephemeral layer encryption via per-instance per-boot ephemeral keys for Postgres-VM-local state; CMK revocation triggers Lakebase Manager to terminate the compute instance.
  • concepts/cryptographic-shredding is the revocation semantics across both layers: data becomes unreadable because its keys become unusable, not because the data itself is erased.
  • Seamless key rotation (no bulk re-encryption) is a property of the envelope hierarchy.
  • Account↔Workspace delegation: Account Admin creates Key Configuration, binds to Workspace; new Lakebase projects in the workspace inherit the CMK.
  • Availability: Enterprise tier customers.
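
The envelope hierarchy, rotation, and revocation behaviour listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Databricks' implementation: `CustomerKMS`, `seal`, `unseal`, `encrypt_segment`, and `decrypt_segment` are hypothetical names, the cipher is a toy HMAC-SHA256 keystream chosen only to keep the example stdlib-only (it is unauthenticated and must not be used for real data), and rotation is modelled as re-wrapping the KEK under a fresh CMK, which is a simplification of how cloud KMSes version keys.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 over nonce || counter. NOT production crypto."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Toy encrypt: returns nonce || (plaintext XOR keystream)."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def unseal(key: bytes, blob: bytes) -> bytes:
    """Inverse of seal under the same key."""
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


class CustomerKMS:
    """Hypothetical stand-in for the customer's cloud KMS; the CMK never leaves it."""

    def __init__(self) -> None:
        self._cmk = secrets.token_bytes(32)
        self.revoked = False

    def _check(self) -> None:
        if self.revoked:
            raise PermissionError("CMK revoked")

    def wrap(self, kek: bytes) -> bytes:
        self._check()
        return seal(self._cmk, kek)

    def unwrap(self, wrapped_kek: bytes) -> bytes:
        self._check()
        return unseal(self._cmk, wrapped_kek)

    def rotate(self, wrapped_kek: bytes) -> bytes:
        """Re-wrap the KEK under a fresh CMK; DEKs and data are not touched."""
        kek = self.unwrap(wrapped_kek)
        self._cmk = secrets.token_bytes(32)
        return self.wrap(kek)


def encrypt_segment(kms: CustomerKMS, wrapped_kek: bytes, segment: bytes):
    """CMK unwraps the KEK, the KEK wraps a fresh per-segment DEK, the DEK encrypts data."""
    kek = kms.unwrap(wrapped_kek)
    dek = secrets.token_bytes(32)
    return seal(kek, dek), seal(dek, segment)


def decrypt_segment(kms: CustomerKMS, wrapped_kek: bytes,
                    wrapped_dek: bytes, ciphertext: bytes) -> bytes:
    kek = kms.unwrap(wrapped_kek)   # fails once the CMK is revoked
    dek = unseal(kek, wrapped_dek)
    return unseal(dek, ciphertext)
```

In this model, calling `rotate` replaces only the small wrapped KEK, so segments encrypted earlier still decrypt with their original wrapped DEKs and ciphertext, which is the "no bulk re-encryption" property. Setting `revoked = True` makes every `decrypt_segment` call raise `PermissionError`: the ciphertext is still present but no longer readable, which is the cryptographic-shredding behaviour.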

Relationship to other Databricks / wiki systems

  • systems/postgresql — Lakebase is Postgres-compatible. Its posture is similar to systems/aurora-dsql's "extend, don't fork" idiom, though the internal architecture differs: Aurora DSQL swaps concurrency/durability/storage via Postgres extensions, whereas Neon-lineage Lakebase splits page/WAL storage off as Pageserver + Safekeeper services.
  • systems/pageserver-safekeeper — the storage-tier components Lakebase inherits from Neon.
  • systems/unity-catalog — Databricks' governance substrate; the Account-Console key-configuration flow is the same Account ↔ Workspace shape UC is administered through.

Caveats

  • Every statement here is sourced from one launch post; Lakebase's own internals (replication, HA, scale-to-zero cold-start times, commit protocol, compute autoscaling policy) have not been documented in the ingested corpus yet. Future ingests should expand this page.
  • Neon-lineage is inferred from the Pageserver / Safekeeper terminology used in the CMK post; Databricks' Neon acquisition (2025) was previously logged as skipped in log.md (pure PR) and is not a formally ingested source.

Seen in

  • sources/2026-04-20-databricks-take-control-customer-managed-keys-for-lakebase-postgres

Last updated · 200 distilled / 1,178 read