PATTERN Cited by 2 sources

Isolate fastest-growing table to own keyspace

Pattern

When one table in a monolithic database grows much faster than the others, move it to its own dedicated cluster (keyspace) and downsize the cluster holding the remaining tables. This is the canonical vertical sharding shape — "moving large tables onto separate servers" (Dicken) — applied to a single dominant hot table rather than to logical table groups.

When to apply

  • Non-uniform growth: one table's row count / size / write rate is an order of magnitude higher than the others. "Only a small subset of these tables grow very large, into the hundreds of gigabytes or terabytes for a single table" (Dicken).
  • Vertical scaling has become expensive: the monolithic cluster is over-provisioned to handle the hot table, but the cold tables don't need that capacity.
  • Not yet ready for horizontal sharding: the hot table fits on one large instance, just not co-hosted with the cold tables.
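The first bullet is mechanically checkable. A minimal sketch, with illustrative names and a threshold of one order of magnitude: given per-table sizes, flag the pattern only when one table dwarfs every other.

```python
def dominant_table(table_sizes_gb: dict, ratio: float = 10.0):
    """Return the name of a table at least `ratio` times larger than
    every other table, or None if growth is roughly uniform."""
    if len(table_sizes_gb) < 2:
        return None
    ranked = sorted(table_sizes_gb.items(), key=lambda kv: kv[1], reverse=True)
    (top_name, top_size), (_, runner_up_size) = ranked[0], ranked[1]
    return top_name if top_size >= ratio * runner_up_size else None

# MuscleMaker-style shape: one multi-TB table, small siblings.
hot = dominant_table({"user": 20, "exercise": 5, "exercise_log": 3500})
# Uniform growth: no candidate, so single-table isolation doesn't apply.
none = dominant_table({"orders": 100, "invoices": 90})
```

If this returns None, skip to the table-group form of vertical sharding described under "When not to apply."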

Mechanics (Vitess)

Canonical worked example from sources/2026-04-21-planetscale-dealing-with-large-tables — the MuscleMaker workout-tracking schema:

  • Before: one musclemaker keyspace with user, exercise, exercise_log (the hot table) at 32 vCPU / 64 GB RAM / 4 TB.
  • After:
      • musclemaker_log keyspace with only exercise_log at 32 vCPU / 64 GB RAM / 4 TB — the hot table gets the original sizing.
      • musclemaker_main keyspace with user + exercise, downsized to 16 vCPU / 32 GB RAM — the cold tables don't need the big box.
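Since neither keyspace is sharded after the split, each one's VSchema stays in Vitess's simple unsharded form (just a table list, no vindexes). A sketch of the end state, shown keyed by keyspace for compactness:

```json
{
  "musclemaker_main": {
    "sharded": false,
    "tables": { "user": {}, "exercise": {} }
  },
  "musclemaker_log": {
    "sharded": false,
    "tables": { "exercise_log": {} }
  }
}
```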

Commands:

vtctldclient MoveTables --workflow musclemakerflow create \
  --target-keyspace musclemaker_log \
  --source-keyspace musclemaker \
  --tables "exercise_log"

# ...hours later, after copy completes...

vtctldclient MoveTables --workflow musclemakerflow \
  --target-keyspace musclemaker_log switchtraffic

The copy phase takes hours for multi-TB tables; production traffic routes to the source keyspace throughout. Cutover via switchtraffic is fast and atomic.
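The two-phase flow above can be scripted. A minimal sketch that only assembles the argv lists for the two commands shown (workflow, keyspace, and table names come from the example; the wrapper functions themselves are illustrative):

```python
def movetables_create(workflow, source_ks, target_ks, tables):
    """argv for the copy phase, which runs for hours on multi-TB tables
    while production traffic still routes to the source keyspace."""
    return ["vtctldclient", "MoveTables",
            "--workflow", workflow,
            "--target-keyspace", target_ks,
            "create",
            "--source-keyspace", source_ks,
            "--tables", ",".join(tables)]

def movetables_switchtraffic(workflow, target_ks):
    """argv for the fast, atomic cutover once the copy has caught up."""
    return ["vtctldclient", "MoveTables",
            "--workflow", workflow,
            "--target-keyspace", target_ks,
            "switchtraffic"]

# e.g. subprocess.run(movetables_create("musclemakerflow", "musclemaker",
#                                       "musclemaker_log", ["exercise_log"]),
#                     check=True)
```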

Why it works

  • Downsizes the cold cluster. Most of the cost savings come from right-sizing the non-hot cluster, not from upsizing the hot one.
  • Preserves relational semantics inside each keyspace. user/exercise JOINs still work in musclemaker_main; exercise_log is now isolated but doesn't usually JOIN to exercise at query time.
  • Defers horizontal sharding. The hot table gets its own vertical-scaling ceiling to climb before horizontal sharding is needed.
  • Reversible-ish. Unlike horizontal sharding, vertical sharding can be undone by MoveTables in the reverse direction — though in practice it's rare to want to.

The Vitess JOIN-across-keyspaces trade-off

"Without a framework like Vitess, you would be unable to perform JOINs between tables that now live on different servers." (Berquist). Vitess's planner executes cross-keyspace JOINs at the VTGate proxy tier, making the split invisible to the application. Plain MySQL with app-level sharding would force those JOINs to be rewritten as application-side compositions.
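To make that trade-off concrete, here is what an application-side compose of a cross-server JOIN looks like without Vitess: two round trips and a hand-written merge. Data, row shapes, and the fetch callback are illustrative, not from the sources:

```python
def join_logs_with_exercises(log_rows, fetch_exercises_by_id):
    """Emulate `exercise_log JOIN exercise` when the two tables live on
    different servers: collect keys from side A, fetch side B, merge in code."""
    exercise_ids = {row["exercise_id"] for row in log_rows}
    exercises = fetch_exercises_by_id(exercise_ids)  # second round trip
    return [{**row, "exercise_name": exercises[row["exercise_id"]]["name"]}
            for row in log_rows]

# Stand-ins for the two servers' data:
logs = [{"log_id": 1, "exercise_id": 10}, {"log_id": 2, "exercise_id": 11}]
exercise_table = {10: {"name": "squat"}, 11: {"name": "bench"}}

joined = join_logs_with_exercises(
    logs, lambda ids: {i: exercise_table[i] for i in ids})
```

With Vitess in place, the application keeps issuing the original SQL JOIN and none of this glue code exists.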

When not to apply

  • Uniform growth — no one table dominates. Then vertical sharding at the table-group level (users domain vs notifications domain) is the right shape, not single-table isolation.
  • Hot table is already at single-instance ceiling — no point moving it to its own cluster if that cluster is immediately going to need horizontal sharding. Skip to horizontal.
  • JOINs to the hot table dominate — moving the hot table breaks hot-path JOINs unless Vitess (or similar) is already in place to reconstitute them.

Seen in
