CONCEPT
Storage-engine maturity as data risk¶
Definition¶
Storage-engine maturity as data risk is the framing that the years of production exposure of a database's storage engine are the substrate durability metric for that platform, independent of its advertised feature set, benchmark performance, or architectural novelty. The argument is that storage engines take a long time to get right, and the only honest evidence that a given storage engine is right is how many years of production write paths have exercised every branch of its code without data-corruption bugs escaping detection.
The framing is orthogonal to ACID compliance: a database can claim ACID compliance and still have storage-engine bugs that corrupt committed data under specific concurrency / recovery / partial-failure conditions. Only years of exposure exercise those conditions.
Canonical PlanetScale framing¶
Sam Lambert (PlanetScale CEO, 2023-06-28):
"MySQL has been serving mission-critical applications at web scale for 28 years. Layering on Vitess, which has served some of the largest sites on the planet for over a decade, you know that every code path has been battle hardened. Database storage engines take a long time to get right. If you are trusting a storage engine that has been around for less than a decade, you are taking extreme risk with your most important asset: your data." (Source: sources/2026-04-21-planetscale-how-planetscale-keeps-your-data-safe)
The "less than a decade" threshold¶
Lambert's cutoff — "less than a decade" — is an empirical claim rather than a principled one. The implicit argument:
- Years 0–5: The storage engine is still absorbing core correctness fixes. Data-loss bugs are routine.
- Years 5–10: Most common bugs are fixed, but rare concurrency / recovery / partial-failure conditions still uncover bugs at scale. Data loss is still possible under edge conditions.
- Years 10+: The storage engine has been exercised by enough workloads across enough years that most rare bugs have been hit and fixed. Data loss is rare.
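The implicit three-tier argument can be written down as a tiny lookup. The tier boundaries are Lambert's implicit ones; the function itself is purely illustrative:

```python
def maturity_tier(engine_age_years: float) -> str:
    """Map storage-engine age to the implicit risk tiers above."""
    if engine_age_years < 5:
        # Still absorbing core correctness fixes.
        return "data-loss bugs routine"
    if engine_age_years < 10:
        # Common bugs fixed; rare concurrency/recovery paths untested.
        return "edge-condition data loss still possible"
    # Enough workloads over enough years to have hit most rare bugs.
    return "battle-hardened; data loss rare"
```
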
This framing assumes:
- The storage engine has been broadly deployed during those years — not just academically published. 10 years of a research-project storage engine with 3 production users is less battle-hardened than 3 years of a storage engine with thousands of production users.
- Bug reports flow back to the engine maintainers. Open-source MySQL / Postgres / SQLite have this property. Proprietary engines shipped by a single vendor may not — if the vendor is the only one seeing bugs, the cross-customer learning loop is missing.
- Core engine behaviour is stable. A storage engine that rewrites its core persistence path every 3 years resets its battle-hardening clock.
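The three assumptions above can be sketched as a toy "battle-hardening clock". The field names, discount factors, and thresholds are invented for illustration and are not from the source:

```python
from dataclasses import dataclass

@dataclass
class EngineHistory:
    years_in_production: float
    production_users: int             # breadth of deployment
    bugs_flow_upstream: bool          # cross-customer learning loop exists
    years_since_core_rewrite: float   # core rewrites reset the clock

def effective_maturity_years(h: EngineHistory) -> float:
    """Illustrative battle-hardening clock per the assumptions above.

    - Only years since the last core persistence-path rewrite count.
    - Narrow deployment or a missing bug-report loop discounts them.
    """
    years = min(h.years_in_production, h.years_since_core_rewrite)
    if h.production_users < 100:      # arbitrary breadth threshold
        years *= 0.2
    if not h.bugs_flow_upstream:
        years *= 0.5
    return years
```

Under this sketch, a 10-year-old research-project engine with 3 users scores lower than a 3-year-old engine with thousands of users, matching the deployment-breadth assumption above.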
The implicit positioning shot¶
Lambert's post canonicalises this as a positioning argument against NewSQL and novel storage engines. By 2023-06-28, most NewSQL peers were under or near the decade mark:
- Google Spanner — public 2017 (GA); internally 2012+.
- CockroachDB — v1.0 2017.
- TiDB — v1.0 2017.
- YugabyteDB — v1.0 2018.
- Aurora MySQL — GA 2015.
- Aurora Postgres — GA 2017.
- Amazon DSQL — preview 2024.
All of these were inside or just past the decade threshold in 2023 (Amazon DSQL had not yet shipped). Lambert's argument frames MySQL + Vitess (28 + 10+ = 38 combined years) as the battle-hardened baseline, with anything newer carrying "extreme risk".
Composition with other durability primitives¶
The framing is orthogonal to all the other durability primitives:
- Storage replication lowers the per-write data-loss probability from P to P^N. But P is the storage engine's per-write bug probability, and P^N only bounds loss if bugs are uncorrelated, which isn't true for correlated bug classes (a bug in the engine's recovery code affects all replicas equally).
- Shard-failure isolation bounds the blast radius of a data-corrupting bug to one shard. But it doesn't prevent the bug from hitting every shard eventually.
- Automated backup validation catches backup-process bugs. But it doesn't catch in-memory storage-engine bugs that have already corrupted the primary before the backup ran.
Storage-engine maturity is the base-rate durability property: the probability that a data-corrupting bug exists in the first place. Every other durability primitive operates on top of this base rate.
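The P versus P^N composition can be made concrete with a toy model. The numbers are hypothetical; the point is that replication multiplies probabilities only when bugs are uncorrelated:

```python
def loss_probability(p: float, n: int, correlated: bool) -> float:
    """Per-write data-loss probability across n replicas.

    Independent replica failures drive the probability down to p**n.
    A correlated storage-engine bug (e.g. in shared recovery code)
    hits every replica at once, so replication gives no benefit.
    """
    return p if correlated else p ** n

P = 1e-6   # hypothetical per-write storage-engine bug probability
N = 3      # three-way replication

uncorrelated_loss = loss_probability(P, N, correlated=False)  # ~1e-18
correlated_loss = loss_probability(P, N, correlated=True)     # still 1e-6
```

Lowering the base rate P (i.e. raising storage-engine maturity) is the only lever that helps in the correlated case, which is why the framing treats maturity as the layer underneath replication.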
Caveats¶
- Age is not the only maturity metric. Code-path coverage, test-suite quality, formal verification, and number of production users are all orthogonal metrics. A 30-year-old codebase with weak tests and small user-base is not automatically safer than a 5-year-old codebase with strong tests and a large user-base.
- Features added late reset the local clock. MySQL's InnoDB has had 28 years of general maturity but only a few years for any given specific feature (e.g., functional indexes shipped in MySQL 8.0). Features have their own local maturity clocks.
- Forks and major rewrites partially reset the clock. Amazon Aurora MySQL is built on a heavily-rewritten fork of MySQL's storage path — its maturity is a mix of MySQL's inherited code and Aurora-specific additions, not purely MySQL's clock.
- Newer is not always riskier. Some engines use techniques (formal verification, extensive fuzzing, deterministic testing — FoundationDB's famous simulation harness, for example) that compensate for lower year count. A purely age-based argument doesn't account for these.
Seen in¶
- sources/2026-04-21-planetscale-how-planetscale-keeps-your-data-safe — Canonical wiki introduction of the framing. Sam Lambert (PlanetScale CEO, 2023-06-28) closes the data-safety-envelope post with this argument as the meta-layer underneath the seven mechanical layers (Vitess / MySQL / semi-sync / block storage / safe migrations / validated backups / security). The argument works as a positioning claim against 2023-era NewSQL and novel-storage-engine peers, and as a defensive framing for PlanetScale's continued bet on MySQL rather than switching substrates. Lambert's 2023 framing later pairs with the 2025-era Englander extreme-fault-tolerance post, which names storage-engine maturity as one input into PlanetScale's [[concepts/always-be-failing-over|weekly failover drill]] discipline — the argument being that even a battle-hardened engine should be exercised in its failure modes weekly rather than trusted to work on the day you need it.
Related¶
- concepts/acid-properties — the correctness claim for which storage-engine maturity supplies the empirical basis for trust.
- concepts/storage-replication-for-durability — the replication primitive that composes with (but doesn't replace) storage-engine-maturity.
- concepts/blast-radius — what storage-engine bugs affect when they do occur.
- concepts/sharded-failure-domain-isolation — the mechanism that bounds blast radius; complements maturity as a separate durability pillar.
- systems/mysql — the 28-years-of-production reference for this framing.
- systems/innodb — the storage engine underneath MySQL whose maturity is what Lambert is actually citing.
- systems/vitess — the 10+-years-of-production layer on top of MySQL.
- systems/planetscale — the platform canonicalising this framing.