CONCEPT Cited by 2 sources

Scaling ladder

Scaling ladder is the canonical progression of database-scaling levers, ordered from cheapest and lowest-impact to most expensive and highest-impact. Each rung preserves more of the original application surface than the next; you climb the ladder one rung at a time, only as far as a specific trigger signal justifies.

The rungs

  1. Vertical scaling — bigger single instance. Zero application change. Ceiling: economic cost, memory contention, IOPS limits, working set overflow.
  2. Read-replica scale-out — route reads to replicas, writes to primary. Requires app-level routing logic and tolerating replication lag. Scales reads, not writes.
  3. Vertical sharding / vertical partitioning — move table groups onto separate clusters. Each table still whole on one instance; relational semantics preserved per cluster; cross-cluster JOINs become structurally difficult without a framework like Vitess.
  4. Horizontal sharding — split a single hot table's rows across many shards. Unbounded scale; maximal complexity (shard-key selection, proxy tier, cross-shard queries, per-shard replica sets).
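Rung 2's "app-level routing logic" can be made concrete with a minimal sketch. The DSNs, the round-robin policy, and the SELECT-prefix heuristic are all illustrative assumptions, not details from the source; a production router would also pin read-after-write sessions to the primary to hide replication lag.

```python
# Rung-2 read/write splitting: writes go to the primary, plain reads
# round-robin across replicas. DSNs below are hypothetical.
import itertools

PRIMARY_DSN = "postgres://primary.internal:5432/app"    # hypothetical
REPLICA_DSNS = [
    "postgres://replica-1.internal:5432/app",           # hypothetical
    "postgres://replica-2.internal:5432/app",
]

_replica_cycle = itertools.cycle(REPLICA_DSNS)

def route(statement: str) -> str:
    """Return the DSN a statement should run against.

    Anything that is not clearly a read is sent to the primary;
    plain SELECTs rotate across the replicas.
    """
    if statement.lstrip().upper().startswith("SELECT"):
        return next(_replica_cycle)
    return PRIMARY_DSN
```

The point of the sketch is the structural cost the ladder attributes to rung 2: the application now carries multiple connection strings and a read/write classification rule that every query path must flow through.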

Why the ordering matters

Each climb to the next rung adds one well-defined piece of application complexity:

  • Rung 1 → Rung 2: app adds connection-string routing (primary vs replica).
  • Rung 2 → Rung 3: app adds per-cluster routing for whole table groups.
  • Rung 3 → Rung 4: the hot table gets a shard key; every query on it must include or tolerate routing by that key.
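The rung 3 → rung 4 delta, "every query must include or tolerate routing by the shard key," can be sketched as a stable hash from shard key to shard index. This is an illustrative assumption, not Vitess's actual routing algorithm (Vitess uses configurable vindexes); the shard count is arbitrary.

```python
# Rung-4 routing sketch: a query on the hot table must supply its shard
# key (e.g. a user_id) so the proxy tier can map it to exactly one shard.
import hashlib

NUM_SHARDS = 8  # illustrative

def shard_for(shard_key: str) -> int:
    """Map a shard key to a shard index via a stable hash, so the same
    key always lands on the same shard across processes and restarts."""
    digest = hashlib.sha256(shard_key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS
```

A query that cannot supply the shard key has no single shard to route to, which is exactly the "cross-shard queries" cost the rung lists: the proxy must scatter it to all shards and gather the results.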

The non-uniform growth assumption, "only a small subset of these tables grow very large" (Dicken), is the structural reason to climb one rung at a time. Rung 3 (vertical sharding) isolates the hot table on its own cluster without sharding the cold tables; rung 4 (horizontal sharding) shards only the hot table once its single-cluster ceiling is reached.

Trigger signals for each rung

Berquist's framing in Guide to scaling your database canonicalises the three sharding-trigger signals:

  • Data size — working-set no longer fits in RAM; backups, restores, replica provisioning, and schema changes all slow. Vitess's 250 GB-per-shard guideline is the manageability threshold.
  • Write throughput — primary hits IOPS ceiling; replication lag is the leading symptom, IOPS saturation the lagging one.
  • Read throughput — read-replica scale-out works but accumulates structural cost (app-level read-write-split logic, multiple connection strings, lag-visible staleness); Berquist argues this makes reads a sharding trigger "earlier than we often think about sharding."
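The data-size trigger is arithmetic: once a table outgrows the 250 GB-per-shard manageability guideline, the shard count follows by division. A back-of-envelope sketch, where the 2× growth headroom is my assumption, not a figure from the source:

```python
# Estimate a shard count from the 250 GB-per-shard guideline cited in
# the text. The headroom multiplier is an illustrative assumption.
import math

SHARD_SIZE_GUIDELINE_GB = 250  # Vitess manageability guideline

def estimated_shards(table_size_gb: float, headroom: float = 2.0) -> int:
    """Shard count needed to keep each shard under the guideline,
    with room for the table to grow by `headroom` before resharding."""
    return max(1, math.ceil(table_size_gb * headroom / SHARD_SIZE_GUIDELINE_GB))
```

For example, a 1 TB hot table with 2× headroom suggests eight shards; the same table on direct-attached NVMe might not need sharding at all, which is the substrate dependence discussed next.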

Substrate dependence

The ladder's rungs are not absolute — the transition points depend on the substrate. Direct-attached NVMe (PlanetScale Metal) extends rung 1 "into the several TB range"; network-attached storage (RDS, CloudSQL) hits the vertical-scaling ceiling earlier (Source: sources/2026-04-21-planetscale-guide-to-scaling-your-database-when-to-shard-mysql-and-postgres).

The commoditisation argument

Historically, rung 4 was a last resort because webscalers like Facebook, Twitter, and YouTube had to build their own sharding substrate (TAO, Gizzard, Vitess). Once Vitess became widely adopted open source (2011 onwards), "sharding is no longer a last resort, and in fact, if adopted earlier, can help you avoid other larger application changes." The ladder's shape hasn't changed, but rung 4's cost has collapsed — which argues for climbing to it sooner than the historical advice suggested (Source: sources/2026-04-21-planetscale-guide-to-scaling-your-database-when-to-shard-mysql-and-postgres).
