PATTERN Cited by 1 source

Pre-allocated bare-metal pool with vertical-autoscaling virtualization

The pattern

Pre-provision a pool of large (often bare-metal) cloud instances with provisioning headroom, then run a purpose-built vertical-autoscaling virtualisation layer that schedules the service's per-tenant compute (e.g. one Postgres database compute per tenant) onto the pool. Customer-tenant compute is started by the in-house virtualisation layer, bypassing the cloud-provider VM control plane on the per-tenant hot path.

The pattern is a concrete realisation of critical-path dependency minimisation applied to the start verb under agentic / scale-to-zero workloads — see concepts/control-plane-as-the-new-data-plane for the workload-shape forcing function.

The three components

Bare-metal pool with provisioning buffer. Allocate fewer, larger instances rather than many small ones. Carry capacity buffer beyond steady-state demand to absorb cloud-provider provisioning outages — "We carry buffers to sustain cloud provider provisioning outages." (Source: sources/2026-05-27-databricks-how-the-lakebase-architecture-stays-resilient-to-cloud-failures)
Vertical-autoscaling virtualisation layer. "Schedules multiple Postgres instances onto those cloud instances"; autoscales a tenant's allocation up and down based on observed load. Vertical (vs horizontal) because each tenant gets a variable slice of one host's resources: densification rather than spread.
Own zone-resilient storage substrate. "We don't rely on cloud block store devices, but instead store data in our own zone-resilient storage that is ultimately backed in object stores." See concepts/zone-redundant-storage + systems/pageserver-safekeeper.

The three components together replace five cloud-provider control-plane dependencies (compute / VM-capacity-policy / block / network / Kubernetes-system-services) with a single in-house data path.
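The placement and vertical-resize behaviour of the second component can be sketched as follows. This is a minimal illustration, not the Lakebase / Neon implementation: the `Host` and `Pool` classes, the abstract "compute units", and the worst-fit placement rule are all assumptions made for the example. The point it shows is that starting and resizing a tenant consult only local pool state.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """One large pre-allocated instance; capacity is in abstract compute units (assumed)."""
    name: str
    capacity: float
    allocations: dict[str, float] = field(default_factory=dict)

    @property
    def free(self) -> float:
        return self.capacity - sum(self.allocations.values())


class Pool:
    """Hypothetical in-house scheduler over a fixed, already-provisioned set of hosts."""

    def __init__(self, hosts: list[Host]) -> None:
        self.hosts = hosts

    def start(self, tenant: str, units: float) -> Host:
        # Placement reads only local pool state; no cloud-provider API is called.
        candidates = [h for h in self.hosts if h.free >= units]
        if not candidates:
            raise RuntimeError("pool exhausted: no host has enough free capacity")
        # Worst-fit (assumed policy): pick the emptiest host to leave room to grow.
        host = max(candidates, key=lambda h: h.free)
        host.allocations[tenant] = units
        return host

    def resize(self, tenant: str, units: float) -> bool:
        # Vertical autoscaling: change the tenant's slice on the host it already runs on.
        for host in self.hosts:
            if tenant in host.allocations:
                delta = units - host.allocations[tenant]
                if delta > host.free:
                    return False  # would require migration, which this sketch omits
                host.allocations[tenant] = units
                return True
        raise KeyError(tenant)

    def stop(self, tenant: str) -> None:
        # Scale to zero: release the tenant's slice back to the pool.
        for host in self.hosts:
            host.allocations.pop(tenant, None)
```

A real layer would also need isolation, load measurement, and migration when a host cannot satisfy a resize; the sketch leaves those out on purpose.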

Lakebase / Neon canonical instance

Verbatim (Source: sources/2026-05-27-databricks-how-the-lakebase-architecture-stays-resilient-to-cloud-failures):

"In Lakebase, we take a different approach that drastically reduces the amount of control plane machinery involved in critical database flows:

We allocate a pool of big (often bare metal) instances from the cloud provider. We carry buffers to sustain cloud provider provisioning outages.

We built our own vertically autoscaling virtualization layer that schedules multiple Postgres instances onto those cloud instances.

We don't rely on cloud block store devices, but instead store data in our own zone-resilient storage that is ultimately backed in object stores like S3 or Azure Blob storage."

Why this pattern shape vs alternatives

Alternative	Trade-off
Cloud-provider per-tenant VMs	Simpler, but every tenant start chains through cloud-provider compute / block / network control planes
Kubernetes-managed pods on shared nodes	Adds Kubernetes system-services dependency on the start path
Single-tenant Postgres on dedicated VM (always-on)	Forfeits scale-to-zero economics and density
Pre-allocated bare-metal pool + in-house virtualisation	Highest reliability + highest density + biggest engineering investment

The pattern is the right shape when:

Tenant count >> instance count — densification is needed economically.
Cold-start frequency is high — start path is on the request path of every connection arrival under scale-to-zero.
Reliability target is 99.99%+: a five-link cloud-provider control-plane chain on the start path can consume the whole error budget by itself, because the availabilities of serial dependencies multiply.
Engineering capacity exists to build + operate the virtualisation layer — not free; this is the cost of the pattern.
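The reliability criterion above can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that each of the five control-plane links is independently 99.99% available; that figure is not from the source. Under that assumption the chain as a whole lands at roughly 99.95%, about five times the downtime a 99.99% target allows.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60


def serial_availability(links: list[float]) -> float:
    """Availability of a chain in which every link must be up (independence assumed)."""
    return math.prod(links)


def downtime_minutes_per_year(availability: float) -> float:
    return (1.0 - availability) * MINUTES_PER_YEAR


# Illustrative assumption: five links, each 99.99% available.
chain = serial_availability([0.9999] * 5)
target = 0.9999

print(f"chain availability: {chain:.6f}")
print(f"chain downtime:  {downtime_minutes_per_year(chain):.1f} min/year")
print(f"target downtime: {downtime_minutes_per_year(target):.1f} min/year")
```

Removing links from the start path raises the achievable ceiling regardless of how reliable each individual link is.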

Composability with cell-based architecture

The pattern composes naturally with cell-based architecture: each cell carries its own bare-metal pool with its own buffer; cell-level capacity exhaustion is contained to that cell. Cross-cell overflow is a separate design choice (typically not done — cells are independent on purpose).

Static-stability framing

The pattern is an instantiation of static stability: the pool with its buffer is the "absorb failure without fetching new resources" primitive. When the cloud provider's compute control plane has an outage, the pool already holds the instances, and the virtualisation layer can keep starting tenant compute from the buffer without a hot-path call to the cloud-provider control plane.
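The separation between the start path and the replenishment path can be sketched as below. Everything here is hypothetical (`FakeCloud`, `BufferedPool`, the slot counts); it only illustrates that tenant starts draw on local state while the cloud provider is contacted solely by a background replenishment step, which is allowed to fail.

```python
class CloudOutage(Exception):
    """Raised by the fake control plane while provisioning is down."""


class FakeCloud:
    """Stand-in for a cloud-provider compute control plane (hypothetical)."""

    def __init__(self) -> None:
        self.available = True

    def provision_host(self) -> None:
        if not self.available:
            raise CloudOutage("provisioning unavailable")


class BufferedPool:
    def __init__(self, cloud: FakeCloud, free_slots: int,
                 target_free_slots: int, slots_per_host: int) -> None:
        self.cloud = cloud
        self.free_slots = free_slots
        self.target_free_slots = target_free_slots
        self.slots_per_host = slots_per_host

    def start_tenant(self) -> bool:
        # Hot path: touches only local state, never the cloud control plane.
        if self.free_slots == 0:
            return False  # buffer depleted
        self.free_slots -= 1
        return True

    def replenish(self) -> bool:
        # Background path: the only place the cloud control plane is called.
        while self.free_slots < self.target_free_slots:
            try:
                self.cloud.provision_host()
            except CloudOutage:
                return False  # try again later; starts keep using the buffer
            self.free_slots += self.slots_per_host
        return True


cloud = FakeCloud()
pool = BufferedPool(cloud, free_slots=3, target_free_slots=3, slots_per_host=4)
cloud.available = False                           # provider outage begins
print([pool.start_tenant() for _ in range(4)])    # [True, True, True, False]
```

The final `False` is the buffer running dry: static stability holds only for as long as the buffer lasts.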

Caveats

Build cost. Owning a vertical-autoscaling virtualisation layer is non-trivial engineering — kernel-level isolation, fair scheduling across noisy neighbours, density-vs-isolation trade-offs.
Buffer-sizing is a live calibration problem. Too small bleeds cloud-provider outages through to customers; too large is wasted capex. Sizing depends on assumed outage-duration distribution.
Failure-mode novelty. A bug in the in-house virtualisation layer is a new failure surface that doesn't exist with cloud-provider VMs. Quality bar must be high.
Cross-tenant noisy-neighbour. Multiple Postgres on one bare-metal instance must isolate IO, CPU, and memory. The isolation primitives are load-bearing.
Bare-metal instance churn. When a bare-metal instance fails, N tenants are affected at once vs 1 with per-tenant VMs. The cell-level orchestration must handle this rebalancing efficiently.
Cloud-provider primitives still on the replenishment path. The pool is replenished via the cloud-provider compute control plane — sustained cloud-provider outages will eventually deplete the buffer.
Specific virtualisation primitives not detailed in source. The Lakebase post links separately to the Neon autoscaling-architecture docs but does not detail the kernel-level isolation mechanisms, scheduler design, or noisy-neighbour mitigations.
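The buffer-sizing caveat can be turned into rough arithmetic. The sketch below is one possible back-of-the-envelope approach, not a method described in the source: pick a percentile of an assumed outage-duration distribution, multiply by the net rate at which tenant starts consume slots, and convert to hosts. All numbers, including the outage samples and slots per host, are made up for illustration.

```python
import math


def percentile_nearest_rank(samples: list[float], q: float) -> float:
    """Nearest-rank percentile, with q in (0, 1]."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def buffer_hosts(net_starts_per_minute: float, outage_minutes: float,
                 slots_per_host: int) -> int:
    """Hosts of headroom needed to keep starting tenants through one outage."""
    slots_needed = net_starts_per_minute * outage_minutes
    return math.ceil(slots_needed / slots_per_host)


# Hypothetical provisioning-outage durations, in minutes.
outages = [10, 20, 30, 45, 60, 90, 120, 240]
design_outage = percentile_nearest_rank(outages, 0.90)

# Hypothetical load: 2 net new tenant slots per minute, 50 slots per host.
print(design_outage)                        # 240
print(buffer_hosts(2.0, design_outage, 50)) # 10
```

Choosing a lower percentile shrinks the buffer but lets longer outages reach customers, which is the calibration trade-off the caveat describes.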

Seen in

sources/2026-05-27-databricks-how-the-lakebase-architecture-stays-resilient-to-cloud-failures — canonical Lakebase / Neon framing. The five-link cloud-provider control-plane chain is enumerated verbatim; the three-component architectural reply is enumerated verbatim. The Neon autoscaling-architecture page is linked as the architecture-detail reference.

Related

concepts/critical-path-dependency-minimization — parent concept; the discipline this pattern operationalises
concepts/control-plane-as-the-new-data-plane — the workload-shape forcing function
concepts/static-stability — the buffer-of-pool primitive is a static-stability instantiation
concepts/availability-multiplication-of-dependencies — the mathematical framing of why dependency-chain length matters
concepts/database-startup-time-sli — the SLI the pattern optimises for
systems/lakebase / systems/neon — canonical instances
systems/aws-ec2 — the cloud-provider compute primitive being buffered against
patterns/separate-data-plane-controller-for-hot-path — companion pattern; the hot-path controller drives the virtualisation layer