title: "Take Control: Customer-Managed Keys for Lakebase Postgres"
type: source
created: 2026-04-21
updated: 2026-04-21
company: databricks
tier: 3
url: https://www.databricks.com/blog/take-control-customer-managed-keys-lakebase-postgres
published: 2026-04-20
tags: [security, encryption, kms, key-management, customer-managed-keys, lakebase, postgres, compute-storage-separation, compliance]
systems: [lakebase, pageserver-safekeeper, aws-kms, azure-key-vault, google-cloud-kms, postgresql, unity-catalog]
concepts: [envelope-encryption, cmk-customer-managed-keys, cryptographic-shredding, compute-storage-separation, stateless-compute, control-plane-data-plane-separation]
patterns: [per-boot-ephemeral-key]
summary: Databricks rolls Customer-Managed Keys (CMK) out to Lakebase, its serverless Neon-descended Postgres offering. The post is unusual for a Databricks security launch in that it leans on Lakebase's compute/storage-separated architecture — persistent Pageserver/Safekeeper layers plus ephemeral Postgres compute VMs — to motivate a three-level envelope-encryption hierarchy (CMK → KEK → DEK) where the CMK stays in the customer's cloud KMS (AWS / Azure / GCP) and revocation works as cryptographic shredding: key-unwrap fails → storage inaccessible, compute terminated, in-memory DEKs destroyed. Ephemeral compute adds a per-boot key that binds local disk + caches to instance lifetime.
Take Control: Customer-Managed Keys for Lakebase Postgres¶
Summary¶
Databricks launches Customer-Managed Keys (CMK) for systems/lakebase, its serverless managed-Postgres offering. The technical interest is less the CMK feature itself — which every enterprise-tier cloud DB ships — than the architectural fit between envelope encryption and a compute/storage-separated DB. Lakebase, descended from the Neon architecture it [acquired in 2025], splits durable storage (Pageserver + Safekeeper in object storage + local caches) from ephemeral Postgres compute that can scale to zero. Both layers, plus every cache in between, have to stay under customer key control. The post describes the three-level concepts/envelope-encryption hierarchy (CMK → KEK → DEK), how concepts/cryptographic-shredding works as the revocation model across both tiers, and how per-boot ephemeral keys handle compute-VM-local data. Integrates with AWS KMS, Azure Key Vault and GCP KMS behind one control surface; operations logged to each provider's audit trail (CloudTrail / Azure Monitor / Google Cloud Audit Logs).
Key takeaways¶
- Lakebase's compute/storage separation is the encryption-design forcing function. Lakebase inherits Neon's split: durable state (Pageserver for pages, Safekeeper for WAL segments) persists in object storage + local caches; Postgres compute VMs are ephemeral, scaling up/down/to-zero. A traditional managed DB can encrypt one storage layer and be done; Lakebase has two layers (persistent and ephemeral) plus all of their caches, and each must be under customer control. This directly motivates the hierarchical key model. (Source: sources/2026-04-20-databricks-take-control-customer-managed-keys-for-lakebase-postgres)
- Three-level envelope encryption: CMK → KEK → DEK. (1) The Customer-Managed Key (CMK) lives in the customer's cloud KMS (AWS KMS, Azure Key Vault, Google Cloud KMS) — Databricks never sees plaintext. (2) The Key Encryption Key (KEK) is a transient key used by Databricks' Key Manager Service to wrap data keys. (3) Data Encryption Keys (DEKs) are unique per data segment and stored alongside the data in wrapped form. The KMS is contacted only to unwrap keys, not to encrypt/decrypt every block — this is the performance-at-scale property of the envelope model. See concepts/envelope-encryption.
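The hierarchy described above can be sketched in a few lines. This is a toy model, not the Databricks implementation: the "cipher" is a SHA-256 keystream XOR standing in for a real AEAD like AES-GCM, and all key names are illustrative. What it does show faithfully is the structural point: one KMS unwrap at the top recovers everything below it, so bulk data operations never round-trip the KMS.

```python
import hashlib
import os

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream (SHA-256 in counter mode) standing in for AES-GCM.
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = os.urandom(16)
    return nonce + bytes(a ^ b for a, b in zip(plaintext, _keystream(key, nonce, len(plaintext))))

def decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, ct = blob[:16], blob[16:]
    return bytes(a ^ b for a, b in zip(ct, _keystream(key, nonce, len(ct))))

# --- Three-level hierarchy: CMK -> KEK -> DEK ---
cmk = os.urandom(32)   # lives in the customer's KMS; never leaves it in plaintext
kek = os.urandom(32)   # transient key held by the Key Manager Service
dek = os.urandom(32)   # unique per data segment

wrapped_kek = encrypt(cmk, kek)   # the only step that touches the KMS
wrapped_dek = encrypt(kek, dek)   # KEK wraps many DEKs locally
ciphertext = encrypt(dek, b"page 0 of a Postgres relation")

# Reads unwrap top-down: one KMS unwrap for the KEK, then everything
# below is local -- the performance property of the envelope model.
kek2 = decrypt(cmk, wrapped_kek)
dek2 = decrypt(kek2, wrapped_dek)
assert decrypt(dek2, ciphertext) == b"page 0 of a Postgres relation"
```

Note that the wrapped DEK travels with the data it protects; only the CMK stays put.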
- Revocation as concepts/cryptographic-shredding. Revoking the CMK in the customer's KMS makes subsequent unwraps fail; wrapped DEKs sitting next to the data become useless without the CMK; the data is cryptographically inaccessible without ever being physically deleted. On the compute side the Lakebase Manager additionally terminates the ephemeral Postgres instance, destroying in-memory keys and rendering local-disk data inaccessible. This is the structural reason envelope encryption is an enabler, not merely a performance optimisation.
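The revocation mechanics reduce to one observable behaviour: the KMS starts denying unwrap calls, and every wrapped DEK on disk becomes inert ciphertext. A minimal sketch with a stubbed KMS (the XOR "wrap" and the `KmsStub` class are invented for illustration; a real KMS would return an access-denied error from its API):

```python
import os

class KmsStub:
    """Toy stand-in for a cloud KMS holding the customer's CMK."""
    def __init__(self):
        self._cmk = os.urandom(32)
        self.revoked = False

    def wrap(self, key: bytes) -> bytes:
        # XOR stands in for a real key-wrap algorithm.
        return bytes(a ^ b for a, b in zip(key, self._cmk))

    def unwrap(self, wrapped: bytes) -> bytes:
        if self.revoked:
            raise PermissionError("AccessDenied: CMK grant revoked")
        return bytes(a ^ b for a, b in zip(wrapped, self._cmk))

kms = KmsStub()
dek = kms.unwrap(kms.wrap(os.urandom(32)))  # normal path: unwrap succeeds
wrapped_dek = kms.wrap(dek)                 # wrapped DEK is stored next to the data
del dek                                     # plaintext DEK exists only transiently

kms.revoked = True                          # customer revokes access in their KMS
shredded = False
try:
    kms.unwrap(wrapped_dek)                 # every subsequent unwrap now fails...
except PermissionError:
    shredded = True                         # ...so the data is cryptographically gone
assert shredded                             # nothing was physically deleted
```

The bytes are still on disk; what was destroyed is the only path to reading them.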
- Per-boot ephemeral keys bind compute-VM-local state to instance lifetime. Every Postgres compute VM generates a unique ephemeral key at boot, used to protect OS/Postgres scratch state (performance caches, WAL artifacts, temp files). On CMK revocation, the Lakebase Manager kills the instance; the per-boot key dies with memory; local disk is unrecoverable. See patterns/per-boot-ephemeral-key — a natural fit for concepts/stateless-compute / scale-to-zero tiers where the compute layer holds state that must not outlive the instance.
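The pattern is simple enough to state as code: the key exists only in instance memory, local writes are encrypted under it, and termination destroys the key, so local state cannot outlive the VM. A hedged sketch (the `ComputeVM` class and the XOR stream cipher are illustrative stand-ins, not Lakebase internals):

```python
import os

def xor(key: bytes, data: bytes) -> bytes:
    # Toy stream cipher standing in for local-disk encryption (e.g. AES-XTS).
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

class ComputeVM:
    """Hypothetical sketch of a scale-to-zero Postgres compute instance."""
    def __init__(self):
        self._boot_key = os.urandom(32)  # generated at boot, memory-only
        self.local_disk = {}             # caches, temp files, WAL scratch

    def write_scratch(self, name: str, data: bytes) -> None:
        self.local_disk[name] = xor(self._boot_key, data)

    def read_scratch(self, name: str) -> bytes:
        return xor(self._boot_key, self.local_disk[name])

    def terminate(self) -> None:
        self._boot_key = None            # the key dies with the instance

vm = ComputeVM()
vm.write_scratch("temp_sort.0", b"intermediate query state")
assert vm.read_scratch("temp_sort.0") == b"intermediate query state"

vm.terminate()  # e.g. Lakebase Manager acting on CMK revocation
# Ciphertext may linger on the local disk, but without the boot key it is
# unrecoverable -- statelessness made cryptographic rather than operational.
assert vm._boot_key is None
```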
- Seamless key rotation is a property of the envelope hierarchy, not of the storage layer. Rotating the CMK at the KMS does not require re-encrypting data or regenerating DEKs — you only need to re-wrap the KEKs/DEKs against the new CMK version, which is cheap. Zero downtime, zero bulk re-encryption. The hierarchy separates key lifetime from data lifetime.
- One Databricks workflow, three cloud KMSes. Account Admin creates a Key Configuration (key identifier — ARN for AWS KMS, Key Vault URL for Azure, Key ID for Google Cloud KMS — plus the IAM role / service principal Lakebase assumes for Wrap/Unwrap). Configuration is bound to a Workspace; new Lakebase projects in the workspace inherit it. Different workspaces can use different CMKs for multi-tenant / multi-departmental isolation. The control-plane/data-plane split (concepts/control-plane-data-plane-separation) is visible here: Account Console owns key configuration, each Workspace's data plane consumes it.
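The per-cloud differences are confined to the key-identifier format and the identity Lakebase assumes. The sketch below is a hypothetical shape for the three configurations, not the actual Databricks Account API schema; all field names, IDs, and ARNs are invented, but the identifier formats themselves (KMS key ARN, Key Vault key URL, Cloud KMS resource path) are the real per-provider conventions.

```python
# Hypothetical key-configuration payloads; field names are illustrative,
# identifier formats follow each provider's real convention.
key_configs = {
    "aws": {
        "key_identifier": "arn:aws:kms:us-east-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab",
        "access": {"iam_role_arn": "arn:aws:iam::111122223333:role/lakebase-cmk"},
        "grants": ["Encrypt", "Decrypt"],       # wrap/unwrap only, no key export
    },
    "azure": {
        "key_identifier": "https://example-vault.vault.azure.net/keys/lakebase-cmk",
        "access": {"service_principal": "00000000-0000-0000-0000-000000000000"},
        "grants": ["wrapKey", "unwrapKey"],
    },
    "gcp": {
        "key_identifier": "projects/p/locations/us/keyRings/r/cryptoKeys/lakebase-cmk",
        "access": {"service_account": "lakebase@p.iam.gserviceaccount.com"},
        "grants": ["Encrypt", "Decrypt"],
    },
}

# Binding is per-workspace: new Lakebase projects inherit the workspace's
# configuration, and different workspaces can carry different CMKs.
workspace_bindings = {"workspace-analytics": "aws", "workspace-ml": "gcp"}
```

The narrow grant list is the separation-of-duties hook: Lakebase can wrap and unwrap, but never export or administer the key.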
- Auditability lives in the customer's cloud. Every wrap/unwrap hits the customer's KMS, so the audit record is in the customer's tenancy (CloudTrail / Azure Monitor / Google Cloud Audit Logs), not Databricks'. This is the security-compliance payoff of keeping the CMK in the customer's KMS — attestation doesn't have to cross the vendor trust boundary.
</gr_15>
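On AWS, for example, each unwrap lands in the customer's CloudTrail as a KMS `Decrypt` event. The sketch below is illustrative: the top-level field names follow CloudTrail's general event schema, but every value (account IDs, role names, the encryption-context key) is invented.

```python
# Hypothetical CloudTrail record for a KMS unwrap issued on Lakebase's behalf.
# Field names follow CloudTrail's event schema; all values are illustrative.
cloudtrail_event = {
    "eventSource": "kms.amazonaws.com",
    "eventName": "Decrypt",
    "eventTime": "2026-04-20T12:00:00Z",
    "awsRegion": "us-east-1",
    "userIdentity": {
        "type": "AssumedRole",
        "arn": "arn:aws:sts::111122223333:assumed-role/lakebase-cmk/session",
    },
    "requestParameters": {
        # Hypothetical context; real deployments scope unwraps via encryption context.
        "encryptionContext": {"service": "lakebase"},
    },
}

# A revoked grant would instead surface as a failed call with an errorCode,
# giving the customer a tenancy-local record of both use and denial.
assert cloudtrail_event["eventName"] == "Decrypt"
```

Because the record is written by the customer's own KMS, compliance attestation never depends on logs held on the vendor's side of the trust boundary.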
- Account↔Workspace delegation encodes separation of duties. Security admins manage keys without needing data access; data users operate on data without needing key-management privilege. Workspace binding is the enforcement point. (Noted but not expanded in a dedicated wiki page — the delegation shape is a standard Databricks Account↔Workspace idiom.)
Extracted — systems¶
- systems/lakebase (new) — Databricks' serverless Postgres service; Neon-descended compute/storage-separated architecture.
- systems/pageserver-safekeeper (new) — the Neon-lineage storage layer Lakebase inherits: Pageserver (page-level durable state in object storage + local caches) + Safekeeper (durable WAL-segment storage).
- systems/aws-kms (new stub) — AWS Key Management Service; one of three cloud KMSes Lakebase CMK supports.
- systems/azure-key-vault (new stub) — Azure's managed KMS.
- systems/google-cloud-kms (new stub) — GCP's managed KMS.
- systems/postgresql — Lakebase's upstream engine (same Postgres-extension posture Databricks shares with systems/aurora-dsql).
Extracted — concepts¶
- concepts/envelope-encryption (new) — multi-level key hierarchy where data keys are themselves encrypted by higher-level keys; KMS contacted only for unwraps; the general distributed-systems pattern for performant, scalable data-at-rest encryption under customer key control.
- concepts/cmk-customer-managed-keys (new) — keys owned by the customer in their cloud KMS tenancy; vendor never sees plaintext; revocation is the contract that makes "data sovereignty" concrete.
- concepts/cryptographic-shredding (new) — making data inaccessible by destroying (or denying access to) the keys that decrypt it, rather than physically deleting the bytes; the revocation primitive under envelope encryption.
- concepts/compute-storage-separation — the Lakebase architecture that forces two-tier encryption and motivates the per-boot ephemeral key pattern.
- concepts/stateless-compute — ephemeral Postgres compute VMs are the stateless-compute tier; per-boot keys make statelessness cryptographic.
- concepts/control-plane-data-plane-separation — Account-Console key configuration (control) vs. Workspace Lakebase projects (data).
Extracted — patterns¶
- patterns/per-boot-ephemeral-key (new) — per-instance key generated at boot, bound to VM lifetime, destroyed on termination; the crypto hook for making scale-to-zero compute tiers key-revocable.
Operational numbers / facts¶
- Supported cloud KMSes: AWS KMS, Azure Key Vault, Google Cloud KMS.
- Key-identifier formats per KMS: ARN (AWS), Key Vault URL (Azure), Key ID (GCP).
- Availability: Enterprise tier customers only.
- No latency / throughput numbers given; no quantitative "CMK overhead" disclosed. The post is a capability narrative, not a perf-characterisation.
Caveats¶
- Vendor-blog-on-vendor-product shape. This is a launch post; all claims are self-reported. Calling out particularly:
    - "Zero downtime or manual re-encryption required" for CMK rotation is a property of the envelope model, so structurally plausible, but no rotation drill outcomes or rotation-at-rate numbers are given.
    - "Automatic Shredding" of ephemeral data on revocation relies on the Lakebase Manager terminating all compute instances in scope, including any with in-flight connections; behaviour under network partitions where the Manager cannot reach instances is not described.
- Post does not disclose the KEK cache / rotation cadence. Standard envelope-encryption designs cache KEKs for short windows to avoid round-tripping the KMS on every DEK unwrap; the Databricks Key Manager Service implementation details are not given.
- No mention of tenant-level key isolation inside a single workspace. The workspace is the unit of key binding; per-project or per-table CMK is not described in this post.
- Lakebase architecture details compressed. The post doesn't cite the Neon origin by name; "Pageserver / Safekeeper" terminology is used because the architecture inherits directly from Neon (acquired 2025). This inference is drawn from the Neon-acquisition PR previously logged as skipped in log.md on 2026-04-21 10:38 — worth noting if a future ingest surfaces Lakebase's internals independently.
Raw source¶
- File: raw/databricks/2026-04-20-take-control-customer-managed-keys-for-lakebase-postgres-208cf8fc.md
- URL: https://www.databricks.com/blog/take-control-customer-managed-keys-lakebase-postgres
- Fetched: 2026-04-21