PATTERN Cited by 2 sources

Exhaust simpler scaling first

Pattern

When a database starts hitting capacity limits, climb the scaling ladder one rung at a time — vertical scaling → read-replicas → vertical sharding → horizontal sharding — rather than jumping directly to the most complex rung. Each rung preserves more of the existing application surface than the next; each adds real operational complexity. Don't pay the complexity of rung N+1 until a specific trigger signal makes rung N insufficient.

"Before you begin to think about sharding, let's make sure you've first exhausted some of the other options." ()

"Typically, you scale your entire database vertically, then shard vertically, then use horizontal sharding for ultimate scalability of large workloads." (Dicken)

When to apply

Every capacity-constrained database workload faces this decision. The pattern applies unless a specific reason rules it out:

  • Default: climb rungs sequentially, stopping at whichever rung clears capacity.
  • Exception (known future triggers): if you can already see the data-size, write-throughput, and read-throughput triggers all arriving within 12–18 months, skipping rungs (e.g. going directly to horizontal sharding) may be cheaper than iterating through each rung.
  • Exception (exhausted runway): if the lower rungs simply won't scale far enough (e.g. the hot table is already at Vitess's 250 GB per-shard guideline on the single primary), skip to horizontal sharding.
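
The default and its two exceptions can be sketched as a small decision function. This is an illustrative sketch only: the `Forecast` type, its field names, and the 18-month default horizon (the upper end of the 12–18 month window above) are assumptions introduced here, not part of the cited sources.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Forecast:
    """Hypothetical capacity forecast; None means the trigger is not in sight."""
    data_size_months: Optional[float]
    write_throughput_months: Optional[float]
    read_throughput_months: Optional[float]
    # True when no lower rung can clear the projected load
    # (e.g. the hot table already sits at the per-shard size guideline).
    lower_rungs_exhausted: bool = False


def climbing_strategy(f: Forecast, horizon_months: float = 18) -> str:
    """Return which branch of the pattern applies to this forecast."""
    if f.lower_rungs_exhausted:
        return "skip to horizontal sharding (exhausted runway)"
    triggers = [
        f.data_size_months,
        f.write_throughput_months,
        f.read_throughput_months,
    ]
    # The skip-rungs exception needs all three triggers inside the horizon.
    if all(t is not None and t <= horizon_months for t in triggers):
        return "consider skipping rungs (known future triggers)"
    return "climb sequentially"
```

A forecast with only one trigger in view, such as `Forecast(None, 6, None)`, stays on the default path; the skip-rungs branch only fires when all three triggers land inside the horizon.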

Mechanics

  1. Name the trigger signal. What's failing? Working set too big for RAM? Replication lag growing? IOPS throttle hit? Read-replica pool saturated? Each signal maps to a specific rung.
  2. Quantify the headroom. How far can the current rung take you? A 2× larger instance buys roughly 2× RAM; each new read replica adds one replica's worth of read capacity and nothing for writes. Vertical sharding moves one hot table off the primary; horizontal sharding multiplies write capacity by roughly N for N shards, provided the shard key spreads load evenly.
  3. Estimate the lead time for the next rung. If you're 3 months from the ceiling and the next rung takes 6 months to implement, start now. If you're 12 months out, take the rung that buys the most time per unit engineering effort.
  4. Climb incrementally. Each rung should leave the application in a state where the next rung is still feasible — avoid decisions that preclude future rungs (e.g. a schema choice that makes horizontal sharding impossible later).
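
Steps 2 and 3 reduce to simple arithmetic once a growth rate is assumed. The sketch below assumes compound monthly growth; the function names, the `RungOption` type, and the effort figures in the usage note are hypothetical and only illustrate the comparison.

```python
import math
from dataclasses import dataclass
from typing import List


def months_of_runway(current: float, ceiling: float, monthly_growth: float) -> float:
    """Months until `current` load reaches `ceiling` at compound growth (0.05 = 5%/month)."""
    if current >= ceiling:
        return 0.0
    if monthly_growth <= 0:
        return math.inf
    return math.log(ceiling / current) / math.log(1 + monthly_growth)


def must_start_now(runway_months: float, lead_time_months: float) -> bool:
    """Step 3: start when the next rung takes at least as long as the runway left."""
    return runway_months <= lead_time_months


@dataclass
class RungOption:
    name: str
    capacity_multiplier: float  # how far this rung raises the ceiling
    effort_months: float        # estimated engineering effort


def best_time_per_effort(options: List[RungOption], monthly_growth: float) -> RungOption:
    """Pick the rung that buys the most extra months per month of effort.

    Assumes monthly_growth > 0; raising the ceiling by a factor m buys
    log(m) / log(1 + g) additional months under compound growth.
    """
    def score(option: RungOption) -> float:
        months_bought = math.log(option.capacity_multiplier) / math.log(1 + monthly_growth)
        return months_bought / option.effort_months

    return max(options, key=score)
```

With 5% monthly growth, a hypothetical 2× instance upgrade costing half a month of effort scores far better than an 8× sharding project costing six months, which is the "most time per unit engineering effort" comparison from step 3.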

Why it works

  • Application-surface preservation. Lower rungs change less of the app. Vertical scaling changes nothing at the SQL layer; read-write splitting adds one connection-string decision; vertical sharding adds per-cluster routing; horizontal sharding adds shard-key semantics to every query on the sharded table. Climb rungs only when forced.
  • Reversibility. Vertical scaling is trivially reversible (downsize); read-replicas are reversible (remove). Horizontal sharding is structurally irreversible: changing the shard key is a full resharding operation, which is "costly and disruptive".
  • Operational-complexity amortisation. A horizontally sharded cluster requires Vitess-level operational expertise; a vertically-scaled primary with read replicas does not. Paying for that expertise only when forced is sound capital allocation.
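
The application-surface point can be made concrete by looking at what a query router has to know at each rung. This is a minimal sketch with hypothetical cluster names and a plain hash-modulo shard choice; it is not how Vitess routes queries, only an illustration of how much information each rung demands from the application.

```python
import hashlib
from typing import Optional

# Hypothetical table-to-cluster map used once tables are split vertically.
TABLE_TO_CLUSTER = {"events": "cluster-b"}


def route_vertical_scaling(is_write: bool, table: str, shard_key: Optional[str] = None) -> str:
    # Rung 1: nothing changes; every query goes to the one primary.
    return "primary"


def route_read_write_split(is_write: bool, table: str, shard_key: Optional[str] = None) -> str:
    # Rung 2: one decision, based only on whether the query writes.
    return "primary" if is_write else "replica"


def route_vertical_sharding(is_write: bool, table: str, shard_key: Optional[str] = None) -> str:
    # Rung 3: the table name now selects a cluster as well.
    cluster = TABLE_TO_CLUSTER.get(table, "cluster-a")
    role = "primary" if is_write else "replica"
    return f"{cluster}/{role}"


def route_horizontal_sharding(
    is_write: bool, table: str, shard_key: Optional[str], num_shards: int = 4
) -> str:
    # Rung 4: every query on the sharded table must carry a shard key.
    if shard_key is None:
        raise ValueError("queries on a sharded table need a shard key")
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:8], "big") % num_shards
    role = "primary" if is_write else "replica"
    return f"shard-{shard}/{role}"
```

Only the last function can fail for lack of information, which is the sense in which horizontal sharding adds shard-key semantics to every query on the sharded table.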

Exceptions / refinements

  • The commoditisation argument (Berquist): "Sharding is no longer a last resort, and in fact, if adopted earlier, can help you avoid other larger application changes." Once the cost of the top rung has collapsed (thanks to mature Vitess / managed offerings), "exhaust everything first" is tempered by the observation that you'll pay routing-logic cost anyway. Paying it once via horizontal sharding may be cheaper than paying it twice (read-write split today + shard-key routing later).
  • Substrate choice changes rung heights. PlanetScale Metal's direct-attached NVMe extends rung 1 "into the several TB range" vs RDS/CloudSQL. Sometimes the right move is not "climb to the next rung" but "stay on rung 1 with a better substrate."

Seen in

  • sources/2026-04-21-planetscale-increase-iops-and-throughput-with-sharding: Ben Dicken (2024-08-19) canonicalises the IOPS-cost cliff as a specific trigger that makes simpler rungs of the scaling ladder untenable. Paying for provisioned IOPS (gp3 extra IOPS, or upgrading to io1 / io2) preserves the single-primary regime at a super-linear cost: 8× workload → 11-13× monthly bill on RDS with io1. Sharding becomes the linearising move once the cost cliff is in view: "Sharding is an excellent technique to run huge databases efficiently, without needing to pay an EBS premium." Each shard's demand stays under the cheap-tier threshold, giving a linear N × shard_cost instead of the regime-shifted premium-tier bill. Companion to Berquist's commoditisation-of-sharding argument; both point in the same direction (don't over-delay horizontal sharding) via different triggers (Berquist = long-term runway cost of replatforming; Dicken = near-term IOPS-tier pricing cliff).

  • Berquist's canonical decision framework: the pattern's underlying three-trigger framing (data size / write throughput / read throughput) plus the commoditisation-argument refinement.

  • sources/2026-04-21-planetscale-dealing-with-large-tables: Dicken's ladder with concrete Vitess mechanics for each rung (MoveTables for vertical, Reshard for horizontal).
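
The shape of the IOPS-cost cliff in the first entry can be illustrated with a toy pricing model. Every number below is hypothetical: the tier threshold and prices are not real RDS or EBS pricing, and were picked only so the single-primary curve comes out super-linear while the sharded curve stays linear.

```python
import math

# Hypothetical pricing, in arbitrary currency units per month.
CHEAP_TIER_IOPS = 3000        # IOPS covered by the cheap tier
BASE_COST = 100               # cost of one instance on the cheap tier
PREMIUM_PER_1000_IOPS = 50    # surcharge once the premium tier is required


def single_primary_cost(iops: int) -> float:
    """One primary: crossing the cheap-tier limit prices all IOPS at the premium rate."""
    if iops <= CHEAP_TIER_IOPS:
        return float(BASE_COST)
    return BASE_COST + iops * PREMIUM_PER_1000_IOPS / 1000


def sharded_cost(iops: int) -> float:
    """N shards, each sized to stay inside the cheap tier: N x shard cost."""
    shards = max(1, math.ceil(iops / CHEAP_TIER_IOPS))
    return float(shards * BASE_COST)


baseline = 3000
scaled = 8 * baseline
single_ratio = single_primary_cost(scaled) / single_primary_cost(baseline)
sharded_ratio = sharded_cost(scaled) / sharded_cost(baseline)
print(single_ratio, sharded_ratio)  # prints "13.0 8.0"
```

Under these made-up prices, an 8× workload costs 13× on the single primary but 8× when sharded, which is the linear versus super-linear contrast the entry describes.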