CONCEPT

Progressive delivery per database¶

Definition¶

Progressive delivery per database is the fleet-rollout discipline for a database-as-a-service vendor in which configuration and binary changes ship one customer database at a time, gated by feature flags, with dev-branch environments receiving each change ahead of production-branch environments by a minimum soak period (at PlanetScale, one week).

Max Englander's canonical framing:

"Data plane changes are shipped gradually to progressively critical environments. Database cluster config and binary changes are shipped database by database using feature flags. Release channels allow us to ship changes to dev branches first, and to wait a week or more before shipping those same changes to production branches. Minimizes the impact of our own mistakes on our customers."

Distinguishing property: per-tenant-cell granularity¶

Generic progressive-delivery primitives — canary deployment, blue-green, percentage-rollout, cohort-split — operate at the traffic altitude: a percentage of requests land on the new version. For a multi-tenant database-as-a-service vendor, the useful granularity is different:

Traffic-percentage doesn't map: each customer's traffic goes to their cluster, not a pooled fleet. A 1% cohort-rollout means "1% of customers, all of their traffic" — which is a much scarier blast radius than "1% of one customer's traffic".
Per-tenant-cell granularity ("one customer database at a time") is what bounds blast radius at the natural isolation boundary of the product. Englander's consequence: "A bug in Vitess or the PlanetScale Kubernetes operator rarely impacts more than 1-2 customers."

The per-tenant-cell granularity only works because the product is physically partitioned per tenant — each customer has their own cluster, their own VMs, their own storage. On a multi-tenant shared-cluster product (e.g. a Kubernetes control plane, a shared SaaS DB), the progressive-delivery shape is different — there's no per-tenant-cell to roll.

Three orthogonal rollout dimensions¶

Englander's framing names three compositional rollout dimensions:

1. Per-environment criticality ladder¶

"shipped gradually to progressively critical environments". Canonical ladder on a DBaaS: internal dogfooding → development environments → staging → free-tier production → paid-tier production → enterprise-tier production. Each step widens the blast radius of a bad change; advancing requires a healthy signal from the previous step.

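The ladder can be sketched as an ordered list with a promotion check. This is a minimal illustration, not PlanetScale's implementation: the stage identifiers and the `next_environment` helper are hypothetical, and the single boolean stands in for whatever health signal a real pipeline would collect.

```python
from typing import Optional

# Ladder order follows the text above; the identifiers are hypothetical.
LADDER = [
    "internal-dogfooding",
    "development",
    "staging",
    "free-tier-production",
    "paid-tier-production",
    "enterprise-tier-production",
]


def next_environment(current: str, current_step_healthy: bool) -> Optional[str]:
    """Return the environment a change may advance to, or None if it must wait."""
    if not current_step_healthy:
        return None  # no healthy signal yet: do not widen the blast radius
    idx = LADDER.index(current)
    if idx + 1 == len(LADDER):
        return None  # already at the most critical environment
    return LADDER[idx + 1]
```

A change in staging with a healthy signal would advance to free-tier production; without that signal it stays where it is.
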
2. Per-database feature-flag gate¶

"Database cluster config and binary changes are shipped database by database using feature flags". Each change is gated by a feature flag; enabling a flag on one database is the atomic unit of rollout. Rollback = flip the flag off. Enables:

Surgical disable when telemetry surfaces a problem affecting a specific database
Bug-repro on demand by enabling a flag on a consenting customer
Gradual rollout by flipping the flag on more databases over time
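
A per-database flag can be pictured as a set of database identifiers per flag name. The sketch below is an assumption-laden illustration: `PerDatabaseFlag`, its methods, and the example identifiers are hypothetical and do not describe PlanetScale's actual flag system.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class PerDatabaseFlag:
    """A feature flag whose unit of enablement is a single database (hypothetical)."""

    name: str
    enabled_for: Set[str] = field(default_factory=set)

    def enable(self, database_id: str) -> None:
        # Atomic unit of rollout: one database sees the change.
        self.enabled_for.add(database_id)

    def disable(self, database_id: str) -> None:
        # Surgical rollback: flip the flag off for one database only.
        self.enabled_for.discard(database_id)

    def is_enabled(self, database_id: str) -> bool:
        # The evaluation context is the database identity.
        return database_id in self.enabled_for


flag = PerDatabaseFlag("new-binary")
flag.enable("db-a")
flag.enable("db-b")
flag.disable("db-a")  # db-b keeps the change, db-a is back on the old path
```

Gradual rollout is then repeated calls to `enable` over time, and bug reproduction on a consenting customer is a single `enable` for that customer's database.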

3. Release channels with soak periods¶

"Release channels allow us to ship changes to dev branches first, and to wait a week or more before shipping those same changes to production branches". Dev-branch is the soak environment: changes arrive there first, bake for a week or more, and propagate to production branches only if no regressions surface in that window. The soak is a calendar-time lower bound, independent of the per-database rollout pace.

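The soak rule reduces to a calendar-time check plus a regression check. The following is a rough sketch under stated assumptions: `may_promote_to_production` and `MIN_SOAK` are hypothetical names, and the seven-day value mirrors the "a week or more" figure quoted above.

```python
from datetime import datetime, timedelta

# Lower bound from the text: a week or more on dev branches.
MIN_SOAK = timedelta(days=7)


def may_promote_to_production(
    shipped_to_dev_at: datetime,
    now: datetime,
    regressions_seen: bool,
    min_soak: timedelta = MIN_SOAK,
) -> bool:
    """True only if the change has soaked long enough and stayed clean."""
    soaked_long_enough = (now - shipped_to_dev_at) >= min_soak
    return soaked_long_enough and not regressions_seen


shipped = datetime(2024, 1, 1)
print(may_promote_to_production(shipped, datetime(2024, 1, 5), False))  # too early
print(may_promote_to_production(shipped, datetime(2024, 1, 9), False))  # soaked and clean
```

Because the check depends only on elapsed time and observed regressions, it holds regardless of how quickly the per-database rollout itself proceeds.
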
Blast-radius calculus¶

The composition of the three dimensions gives:

worst-case blast radius =
    (change shipped to dev branches only) : 0 production impact
    (change shipped to production, one DB at a time) : 1 customer at a time until rollback
    (after a full week of soak) : whatever regressions exist are well-mapped

Englander's load-bearing claim: "rarely impacts more than 1-2 customers". The 1-2 customers are the ones whose flag flipped just before the telemetry surfaced the regression; after that, the flag is rolled back and no further customers see the change.
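
The bound on exposed customers follows from the loop structure: each iteration flips one database and checks health before continuing. A minimal simulation, assuming a hypothetical `looks_healthy` telemetry callback and made-up database names:

```python
from typing import Callable, Iterable, List, Tuple


def roll_out_one_at_a_time(
    databases: Iterable[str],
    looks_healthy: Callable[[str], bool],
) -> Tuple[List[str], bool]:
    """Enable a change on one database at a time, halting on the first regression.

    Returns (databases exposed to the change, whether a rollback was triggered).
    """
    exposed: List[str] = []
    for db in databases:
        exposed.append(db)         # flip the flag on for exactly one more database
        if not looks_healthy(db):  # per-database telemetry check before continuing
            return exposed, True   # roll back; remaining databases never see the change
    return exposed, False


fleet = [f"db-{i}" for i in range(1, 6)]
exposed, rolled_back = roll_out_one_at_a_time(fleet, lambda db: db != "db-2")
print(exposed, rolled_back)
```

In this toy run the regression surfaces on the second database, so only the two databases flipped so far were exposed and the other three never received the change. Real detection has lag, which is why the observed figure is "1-2 customers" rather than exactly one.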

Distinguished from adjacent patterns¶

vs patterns/cohort-percentage-rollout. Cohort percentage rolls to a random percentage of traffic; per-database progressive delivery rolls to one tenant cell at a time. Complementary at different altitudes — PlanetScale could combine the two (roll to 1% of databases selected deterministically, roll those to 100% of traffic on those databases).
vs patterns/staged-rollout. Staged rollout is the generalisation; per-database is the DBaaS-specific specialisation — the stages are databases rather than regions or datacenters or percentages.
vs patterns/gradual-rollout-layered-by-stack-depth. That pattern rolls by depth in the software stack (kernel → userspace → framework → application); per-database rolls by tenant breadth (DB 1 → DB 2 → ... → all DBs). Orthogonal dimensions.
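
The combination mentioned under cohort-percentage rollout (a deterministic 1% of databases, each receiving 100% of its own traffic) could be sketched with a stable hash of the database identifier. This is an illustrative assumption, not a described PlanetScale mechanism; `in_cohort` and the salt value are hypothetical.

```python
import hashlib


def in_cohort(database_id: str, fraction: float, salt: str = "example-rollout") -> bool:
    """Deterministically place a database in or out of a rollout cohort.

    The same (salt, database_id) pair always maps to the same point in [0, 1),
    so raising `fraction` only ever adds databases to the cohort.
    """
    digest = hashlib.sha256(f"{salt}:{database_id}".encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big") / 2**64  # uniform-ish value in [0, 1)
    return point < fraction


fleet = [f"db-{i}" for i in range(10_000)]
cohort = [db for db in fleet if in_cohort(db, 0.01)]
print(len(cohort))  # roughly 1% of the fleet; the exact count depends on the hash
```

Selection happens at the database level, so every selected database runs the new version for all of its traffic while the rest of the fleet is untouched.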

Substrate requirements¶

Per-database progressive delivery requires:

Per-tenant physical isolation. Without it, there is no "per-database" unit to roll.
Feature-flag system with per-tenant granularity. The flag evaluation context must include the database identity; the flag store must scale to (flags × DBs).
Telemetry disaggregated to per-database granularity. Canary metrics must attribute regressions to the set of databases currently flipped; fleet-aggregated metrics hide a 1-customer regression in the noise.
Automated rollback. Manual rollback doesn't scale to fleet sizes; flags must flip back automatically on regression detection.
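
The telemetry and rollback requirements can be illustrated together. The sketch below assumes a hypothetical per-database error-rate map and threshold; it uses an unweighted fleet average purely to show how aggregation can hide a single-customer regression.

```python
from typing import Dict, Set


def databases_to_roll_back(
    error_rates: Dict[str, float],
    flipped: Set[str],
    threshold: float,
) -> Set[str]:
    """Flipped databases whose own error rate exceeds the threshold."""
    return {db for db in flipped if error_rates.get(db, 0.0) > threshold}


def fleet_error_rate(error_rates: Dict[str, float]) -> float:
    """Unweighted fleet-wide average (a simplification for illustration)."""
    return sum(error_rates.values()) / len(error_rates)


rates = {f"db-{i}": 0.001 for i in range(100)}
rates["db-7"] = 0.2           # one flipped database is regressing badly
flipped = {"db-7", "db-8"}    # databases currently running the change

print(fleet_error_rate(rates) > 0.01)                # False: the aggregate looks fine
print(databases_to_roll_back(rates, flipped, 0.01))  # {'db-7'}
```

The fleet average stays under the 1% threshold even though one customer sees a 20% error rate, while the per-database view singles out the regressing database so its flag can be flipped off automatically.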

Seen in¶

**** — canonical verbatim framing as the third of three reliability processes (alongside concepts/always-be-failing-over and synchronous replication). 1-2-customers blast-radius datum sourced from this post.

concepts/feature-flag — the substrate primitive
concepts/blast-radius — what per-database granularity caps
concepts/isolation-as-fault-tolerance-principle — the principle per-database progressive delivery rides on
concepts/static-stability — the reliability principle per-database progressive delivery supports
patterns/always-be-failing-over-drill — the sibling process that uses the same flag-gated per-database rollout as its vehicle
patterns/staged-rollout — the generalisation
patterns/cohort-percentage-rollout — traffic-percentage sibling at a different altitude
patterns/gradual-rollout-layered-by-stack-depth — stack-depth sibling at a different altitude