CONCEPT Cited by 3 sources
Evolutionary database design¶
Definition¶
Evolutionary database design is the discipline of treating database schemas as first-class artefacts that evolve incrementally alongside application code, with the same engineering rigour as source-code refactoring. In practice that means small named transformations applied deliberately, version-controlled migration scripts that travel with application changes, application and database evolving in lockstep through CI/CD, and "everybody gets their own database instance" during development so that experimentation is cheap and isolated. The methodology was articulated by Martin Fowler and Pramod Sadalage in the 2003 essay Evolutionary Database Design and operationalised by Scott Ambler and Pramod Sadalage in the 2006 book Refactoring Databases: Evolutionary Database Design, which catalogs 70+ named database refactorings (Split Column, Move Column, Add Lookup Table, Encapsulate Table With View, Introduce Surrogate Key, etc.) along with the transition mechanics to apply each safely against live data.
The methodology rests on seven practices (Fowler 2003, restated in Sadalage 2006):
- DBAs collaborate closely with developers. Database changes are not gated through a separate review queue; the DBA is on the same team as the developers and reviews changes as part of the feature work.
- Everybody gets their own database instance. Each developer has an isolated database environment to experiment in — see concepts/practice-4-everybody-gets-their-own-database-instance for the canonical wiki page on this practice specifically.
- Developers frequently integrate to a shared master. Schema changes flow into a shared mainline through CI integration, not through long-lived development branches.
- A database consists of schema and test data. The "database" under version control isn't just the DDL — it's the schema plus the reference / test data needed to make the application work.
- All changes are database refactorings. Each schema change is a named, deliberate, small transformation drawn from a catalog — not an ad-hoc ALTER TABLE.
- Automate the refactorings. Each refactoring has a script that applies it; manual schema editing is a code smell.
- Version-control everything, including the schema. Migration scripts live in the same repo as application code and are reviewed alongside it.
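The "automate the refactorings" and "version-control everything" practices can be illustrated with a minimal sketch of a migration runner. This is not the tooling described in any of the sources: the `MIGRATIONS` list, the `schema_version` table, and the `apply_migrations` function are hypothetical names, and an in-memory SQLite database stands in for a real engine only so the example is self-contained.

```python
import sqlite3

# Hypothetical migrations; in a real repository these would typically be
# numbered .sql files stored next to the application code.
MIGRATIONS = [
    (1, "CREATE TABLE item (id INTEGER PRIMARY KEY, inventory_code TEXT NOT NULL)"),
    (2, "ALTER TABLE item ADD COLUMN location_code TEXT"),
]


def apply_migrations(conn, migrations):
    """Apply pending migrations in version order; return the versions applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_version")}
    newly_applied = []
    for version, statement in sorted(migrations):
        if version in applied:
            continue  # already recorded, so re-running the runner is a no-op
        conn.execute(statement)
        conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
        conn.commit()
        newly_applied.append(version)
    return newly_applied


conn = sqlite3.connect(":memory:")
first_run = apply_migrations(conn, MIGRATIONS)
second_run = apply_migrations(conn, MIGRATIONS)
print(first_run, second_run)
```

Because each applied version is recorded in the database itself, the same runner can be invoked by a developer, by CI, and by the deployment pipeline without applying a change twice.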
The twenty-year gap between methodology and substrate¶
Fowler 2003 + Sadalage 2006 articulated the methodology completely; the Continuous Delivery book (Humble & Farley, 2010, Chapter 12 "Managing Data") brought migration scripts into the deployment pipeline — making database-changes-as-code part of the broader CI/CD movement. What CD did not solve was per-pipeline isolation: pipelines could run migrations, but they still needed a target database, and that target was almost always shared. Practice #4 ("everybody gets their own database instance") stayed aspirational on most teams because true per-developer production-shaped databases cost time, money, and DBA cycles.
The post canonicalising this gap (Databricks 2026-05-29):
"The methodology described in Evolutionary Database Design and operationalized in Refactoring Databases: Evolutionary Database Design has been clear for twenty years. The seven practices, the catalog of 70+ named refactorings, the transition mechanics – all of it documented, peer-reviewed, taught."
"That methodology reached CI/CD in 2010 with Continuous Delivery (Chapter 12: Managing Data). Migrations became first-class artifacts in the deployment pipeline. The discipline of database-changes-as-code reached the broader CI/CD movement. What CD didn't solve was per-pipeline isolation: pipelines could run migrations, but they still needed a target database, and that target was shared."
The compensating layer¶
Because Practice #4 stayed aspirational, teams built a compensating layer to work around it:
- Mock objects — the database interface is faked in unit tests; query-planner, constraint-enforcement, and transaction semantics are absent.
- In-memory database substitutes — H2 or SQLite stand in for the production database; SQL dialect drift between substrate and production produces "works on my machine, fails in staging" bugs.
- Shared staging environments — one database serves the whole team; concurrent feature work collides over schema and data; the dev DB becomes a scheduling problem.
- DBA ticket queues — schema changes that require production-shaped test data go through the DBA, who serialises requests through their calendar.
The compensating layer became foundational methodology by default, not by design. Whole bookshelves of advice on testing patterns, mock hierarchies, dialect-translation libraries, and staging-database hygiene exist because Practice #4 was unaffordable, not because the methodology required them.
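One concrete form of the dialect drift mentioned in the list above can be sketched as follows. The `item` table and its `quantity` column are hypothetical. The sketch relies on SQLite's type affinity, under which a non-STRICT table accepts text in an INTEGER column; PostgreSQL rejects the equivalent insert with an invalid-input error, so a test that passes against the substitute can fail against the production engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, quantity INTEGER)")

# SQLite's type affinity stores text that cannot be converted to an integer
# as TEXT instead of rejecting it (unless the table is declared STRICT).
conn.execute("INSERT INTO item (quantity) VALUES (?)", ("twelve",))

stored_value, stored_type = conn.execute(
    "SELECT quantity, typeof(quantity) FROM item"
).fetchone()
print(stored_value, stored_type)
```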
Database refactoring catalog¶
Sadalage's Refactoring Databases defines a catalog of 70+ named database refactorings, each with a problem statement, mechanics for applying the transformation safely against live data, transition-period strategies (e.g. maintaining old + new schemas during migration), and rollback considerations. Examples:
| Refactoring | What it does |
|---|---|
| Split Column | One column with composite content → multiple typed columns. (Fowler 2003 worked example: Jen splits inventory_code into location_code, batch_number, serial_number.) |
| Move Column | Column moves between tables to follow normalisation or access patterns. |
| Encapsulate Table With View | Table accessed via view, allowing future restructuring without breaking consumers. |
| Introduce Surrogate Key | Replace natural key with auto-generated surrogate. |
| Add Lookup Table | Inline values get factored into a separate reference table. |
| Replace LOB With Table | LOB column → child table for relational access. |
| Merge Columns | Inverse of Split Column — combine related columns into one. |
| Migrate Method From Database | Move stored-procedure logic into application code. |
| Insert Trigger | Add a trigger to maintain invariant during transition. |
| Drop Column | Remove column after consumers have migrated off. |
The full catalog is hosted at databaserefactoring.com, maintained by Sadalage as a living reference.
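As an illustration of the Split Column entry, the sketch below adds the three new columns next to `inventory_code`, backfills them, and leaves the old column in place for the transition period. It is a simplified sketch rather than the mechanics from the book: the hyphen-delimited code format (`LOCATION-BATCH-SERIAL`) and the sample rows are assumptions made for the example, and SQLite is used only to keep it runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item (id INTEGER PRIMARY KEY, inventory_code TEXT NOT NULL)"
)
# Sample data in an assumed LOCATION-BATCH-SERIAL format.
conn.executemany(
    "INSERT INTO item (inventory_code) VALUES (?)",
    [("NYC-0042-000917",), ("SFO-0007-000003",)],
)

# Step 1: add the new columns alongside the old one.
for column in ("location_code", "batch_number", "serial_number"):
    conn.execute(f"ALTER TABLE item ADD COLUMN {column} TEXT")

# Step 2: backfill the new columns from the composite value.
rows = conn.execute("SELECT id, inventory_code FROM item").fetchall()
for item_id, code in rows:
    location, batch, serial = code.split("-")
    conn.execute(
        "UPDATE item SET location_code = ?, batch_number = ?, serial_number = ? "
        "WHERE id = ?",
        (location, batch, serial, item_id),
    )
conn.commit()

# Step 3 (not shown): once every consumer reads the new columns, a later
# migration removes inventory_code (the Drop Column refactoring).
split_rows = conn.execute(
    "SELECT location_code, batch_number, serial_number FROM item ORDER BY id"
).fetchall()
print(split_rows)
```

Keeping the old column until consumers have migrated is what lets the application and schema change in separate, individually reversible steps.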
Why the substrate matters¶
The methodology is substrate-independent — the seven practices and the refactoring catalog work on Postgres, MySQL, Oracle, SQL Server, or any other relational engine. What changes with the substrate is how affordable Practice #4 is.
| Substrate | Practice #4 cost | Compensating layer needed? |
|---|---|---|
| 2003 mainstream (commercial RDBMS, pre-cloud) | Per-developer DB = full provisioning ticket, weeks of DBA work | Yes — heavy mocks + shared staging |
| 2010 cloud-VM era | Per-developer DB = pg_dump + EC2 → still hours, still stale | Yes — moderate mocks + staging |
| 2020 container era | Per-developer DB = docker run postgres + seed → minutes, but no production-shaped data | Yes — H2 / SQLite + sometimes staging |
| 2026 copy-on-write era (systems/lakebase / Neon, PlanetScale) | Per-developer DB = sub-second branch from production storage; production-shaped, isolated, governance-propagated | No — compensating layer becomes obsolete |
The 2026 substrate change is what makes Practice #4 the operational default rather than an aspiration. The Databricks 2026-05-29 post argues that the methodology hasn't changed; the capability under it has.
What changes when the constraint lifts¶
Once Practice #4 is affordable:
- The compensating layer comes out. Mock objects, H2/SQLite fakes, shared staging databases, DBA ticket queues — all become optional. See concepts/database-development-compensating-layer for the layer-by-layer disposition.
- The DBA's role shifts from gatekeeper to design collaborator. The DBA pairs with developers on data-integrity, indexing strategy, future extensibility — not on protective gatekeeping. See concepts/dba-as-design-collaborator.
- Schema migration travels in the same PR as the application code that depends on it. See patterns/migration-script-travels-with-application-code.
- CI gets a per-PR ephemeral database branch with full schema + production-shaped data. See patterns/ci-ephemeral-database-branch-with-schema-diff-comment.
- The refactoring catalog becomes routinely applicable — refactorings that were avoided because of migration cost (column splits, surrogate-key introduction, LOB-to-table extraction) move into the "normal feature work" category.
Seen in¶
- sources/2026-05-29-databricks-enabling-evolutionary-database-development-database-branching-with-lakebase — First wiki canonicalisation of evolutionary-database-design as a discipline. Databricks 2026-05-29 Tier-3 post (Part 1 of a three-part series). Frames Lakebase's copy-on-write database branching as the substrate that finally makes Practice #4 operationally default. Re-uses Fowler's 2003 protagonist Jen and the Split Column refactoring as the worked example: same Jen, same refactoring, what changed is the capability. Names the compensating layer (mock objects, in-memory DBs, shared staging, DBA tickets) as "foundational methodology by default, not by design". Forward-references Part 2 (architecture deep-dive) and Part 3 (50-developer governance + agent-in-the-loop).
- sources/2026-06-05-databricks-enabling-evolutionary-database-development-database-branchin — Part 2: The new playbook. Expands the original 7 practices to 11, adding: Practice #8 (destructive testing as default), Practice #9 (A/B variant prototyping at database level), Practice #10 (governance-inherited-by-branches, deferred to Part 3), Practice #11 (agent-as-practitioner, deferred to Part 3). Introduces idempotent migration as the hard authorship rule for Practice #3, and names expand-and-contract as the canonical schema migration strategy. Provides full CI/CD workflow mechanics via GitHub Actions templates.
- sources/2026-06-12-databricks-enabling-evolutionary-database-development-database-branchin-part3 — Part 3: Team scale + agents. Delivers on Practice #10 (governance designed once, inherited per branch) and Practice #11 (agent-as-practitioner inside an SCM state machine with blocking gates). Introduces the tier topology (environments as branches), DBA-to-platform-engineer role evolution, the artifact-as-API model for multi-agent coordination, and an opt-in TDD layer with six per-role agents. Ships as the Lakebase App Dev Kit.
Related¶
- concepts/practice-4-everybody-gets-their-own-database-instance — the practice that the 2026 substrate makes affordable.
- concepts/database-development-compensating-layer — the bookshelf of mocks / H2 / staging / DBA-tickets that becomes obsolete.
- concepts/dba-as-design-collaborator — the role evolution that follows.
- concepts/database-branching — the substrate primitive that enables Practice #4.
- concepts/copy-on-write-storage-fork — the storage mechanism underneath cheap branching.
- concepts/versioned-schema-migration · concepts/schema-evolution · concepts/up-down-migration-pair — the migration-tooling axis the methodology requires.
- concepts/integration-tests-against-real-database — the testing discipline that becomes affordable.
- patterns/branch-based-schema-change-workflow — PlanetScale's earlier instance of an evolutionary-database-design product surface.
- patterns/per-developer-database-branch-paired-with-code-branch · patterns/migration-script-travels-with-application-code · patterns/ci-ephemeral-database-branch-with-schema-diff-comment — the three patterns that operationalise Practice #4 on a copy-on-write substrate.
- patterns/sequential-numbered-migration-files — the migration-tool format the methodology uses.
- systems/lakebase — the canonical 2026 substrate.