PLANETSCALE 2024-08-19

PlanetScale — Increase IOPS and throughput with sharding

Summary

Ben Dicken (PlanetScale, originally 2024-08-19, re-fetched 2026-04-21) publishes a pricing-pedagogical post canonicalising IOPS and throughput as first-class database-sizing parameters alongside vCPU / RAM / storage, and positioning horizontal sharding as the architectural lever that makes I/O cost scale linearly rather than super-linearly as a database grows. A small-database configuration ($1,649-$3,370/mo across RDS / Aurora / PlanetScale) serves as the baseline; at 8× scale the non-sharded configurations jump by 11-13× (to $20k-$24k/mo on RDS with io1 provisioned-IOPS volumes), whereas a sharded PlanetScale deployment (8 PS-400 shards + 1 unsharded reference shard) comes in at $13,992/mo — a linear 8× scale-out on the unsharded baseline rather than a super-linear jump. The load-bearing claim: "Sharding is an excellent technique to run huge databases efficiently, without needing to pay an EBS premium. In a sharded database, we spread out our data across many primaries. This also means that our IO and throughput requirements are distributed across these instances, allowing each to use a more affordable gp2 or gp3 EBS volume."

The post opens with a Note that, since the August 2024 publication, PlanetScale has released Metal (March 2025): "Metal databases give you unlimited IOPS and ultra low latency reads and writes." The IOPS cost cliff that the post positions sharding as the mitigation for is, under the Metal architecture, structurally mitigated by a local-NVMe substrate — sharding remains useful for horizontal scale but is no longer the only lever against IOPS cost.

Key takeaways

  1. IOPS is storage-system-specific, not a universal unit. On AWS gp3 EBS volumes, "a single operation is measured as a one 64 KiB disk read or write." Other EBS classes (gp2, io1, io2, st1, sc1) have their own per-IOP semantics. The naive formula throughput = IOPS × 64 KiB holds only for purely sequential workloads — for random reads "each read counts as a full IOP, even if it is less than 64k", so a 4k random read on EBS still burns a full 64 KiB IOP. "Some think that once you move to SSDs, your sequential vs random read patterns no longer matter. However, this applies to workloads on both HDDs and SSDs on EBS." Canonical new sequential vs random I/O concept. (Source: this article, explicit quote.)

  2. Throughput is capped independently of IOPS. Each gp3 volume has a default throughput limit of 125 MiB/s on top of the 3,000 IOPS default. Even at perfect sequential efficiency (3,000 IOPS × 64 KiB = 192 MiB/s), "due to the default throughput limit of 125 MiB/s, we cannot actually reach this unless we purchase more throughput." Canonical new throughput vs IOPS concept: IOPS is operations-count-per-second, throughput is bytes-moved-per-second; they are related but must be provisioned separately.

  3. EBS "burst bucket" lets you bank unused IOPS. "EBS volumes also allow you to bank unused IOPS, up to a fixed limit. These stored IOPS can be redeemed in the future, allowing the volume to burst up beyond the set IOPS limit for stretches of time. Once the bank is depleted, it will not be able to burst until more IOPS are accumulated." Canonical new EBS IOPS burst bucket concept: a volume with a sustained load below its cap accrues credit, which spends down during bursts; once exhausted, the volume falls back to the baseline rate. First-order consequence: steady-state databases cannot rely on burst. Complements the already-canonical IOPS throttle on network-attached storage.

  4. gp3 default: 3,000 IOPS / 125 MiB/s, max 16,000 IOPS / 1,000 MiB/s. io2 max: 256,000 IOPS / 4,000 MiB/s. Quoted verbatim: "Whereas gp3 has a max of 16000 IOPS and 1000 MiB/s, io2 has a max of 256,000 IOPS and 4000 MiB/s." The architectural choice for high-IOPS unsharded databases is "Upgrade to AWS's provisioned IOPS SSD volume types such as io1 or io2. These volumes are significantly more expensive but also have higher maximum IOPS and throughput." These numbers are the canonical wiki reference datum for EBS volume-type ceilings on this page.

  5. Small-database comparison (1 primary + 2 replicas, 8 vCPU / 32 GB / 500 GB, us-east-1, August 2024): PlanetScale PS-400 $1,749 / RDS db.m6id.2xlarge (d-class with attached NVMe) $2,136.20 / RDS db.m6i.2xlarge (non-d-class, theoretical — RDS doesn't support 3-node multi-AZ non-d) $1,649.94 / Aurora I/O Optimized (half vCPU / same RAM) $1,741.14 / Aurora I/O Optimized (same vCPU / double RAM) $3,369.78. Aurora is the outlier; the rest cluster. "These prices are assuming that the default IOPS values will be sufficient to run our database."

  6. 8× scale comparison (64 vCPU / 256 GB / 4 TB / 24,000 IOPS / 1,000 MiB/s peak throughput per primary): RDS db.m6id.16xlarge with io1 provisioned IOPS = $24,196.56/mo / RDS db.m6i.16xlarge (theoretical non-d-class) with io1 = $20,519.52/mo / Aurora I/O Optimized = (high base) / PlanetScale sharded (8 × PS-400 + 1 PS-400 unsharded reference shard) = $13,992/mo. "Notice that for the RDS instances, the pricing did not grow linearly. The cost jumped by 11-13x." Canonical new linear vs super-linear cost scaling concept — the pricing jump is non-linear because the architectural regime shifted (need to move from gp3 into io1/io2 provisioned-IOPS tiers), not because the 8× workload is 11-13× harder.

  7. Four architectural options to meet 8× I/O demand: (a) gp3 + pay for additional IOPS + pay for additional throughput (linear $-scaling, ceiling at 16k IOPS / 1 GiB/s per volume); (b) upgrade to io1 / io2 provisioned-IOPS SSDs (significantly more expensive but higher ceilings — 256k IOPS / 4 GiB/s on io2); (c) 4-volume striping of gp3 (RAID-0 across four gp3 volumes = 4× per-volume caps); (d) horizontal sharding — each shard has 1/N the IOPS/throughput demand, stays on affordable gp2 / gp3. Canonical new sharding-as-IOPS-scaling pattern. "Provisioned IOPS volumes may be your only option if you are running a massive database with a single primary server."

  8. Sharding's linearity property on I/O cost: "In this situation, we do not need to pay extra for additional IOPS or dedicated io1 infrastructure. The IOPS and throughput demand is spread evenly across the 8 shards, allowing us to stick with a more affordable class of EBS volumes … With sharding, we are afforded a linear growth rate and acceptable costs. This is a much better option for long term scalability." Structural reason: each shard sees 1/N the data → 1/N the working-set → 1/N the IOPS demand → stays below the per-volume default cap that triggers the premium-tier price cliff.
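Takeaway 3's burst-bucket mechanism reduces to a token bucket. A minimal sketch in Python, with illustrative baseline and capacity numbers (AWS's actual credit accounting differs by volume class, as the gp2-vs-gp3 caveat below notes):

```python
class BurstBucket:
    """Toy model of the EBS IOPS burst bank described in takeaway 3."""

    def __init__(self, baseline_iops: int, capacity: int):
        self.baseline = baseline_iops
        self.capacity = capacity      # fixed limit on banked IOPS
        self.credits = capacity       # assume a full bank to start

    def tick(self, demand: int) -> int:
        """Advance one second of load; return the IOPS actually served."""
        if demand <= self.baseline:
            # Sustained load below the cap accrues credit, up to the limit.
            self.credits = min(self.capacity, self.credits + self.baseline - demand)
            return demand
        # Bursting spends banked credits; once the bank is depleted,
        # the volume falls back to the baseline rate.
        burst = min(demand - self.baseline, self.credits)
        self.credits -= burst
        return self.baseline + burst

vol = BurstBucket(baseline_iops=3_000, capacity=5_000)
print([vol.tick(8_000) for _ in range(3)])  # [8000, 3000, 3000]
```

The first-order consequence in takeaway 3 falls out directly: a steady-state database whose demand sits at or above baseline never re-accrues credit, so it cannot rely on burst.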

Operational numbers

Parameter            gp3 default   gp3 max   io2 max
IOPS                 3,000         16,000    256,000
Throughput (MiB/s)   125           1,000     4,000
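The ceilings in the table combine with the per-IOP semantics from takeaways 1-2. A minimal model in Python; the sequential-coalescing rule is an assumption extrapolated from the article's 64 KiB-per-IOP description, not AWS's documented accounting:

```python
import math

IOP_KIB = 64  # gp3 measures one IOP as a 64 KiB read or write

def iops_consumed(n_requests: int, request_kib: float, sequential: bool) -> int:
    """IOPs a workload burns under the article's accounting model."""
    if sequential:
        # Assumed: adjacent sequential requests coalesce, so only bytes matter.
        return math.ceil(n_requests * request_kib / IOP_KIB)
    # Random: each request burns a full IOP, even a 4 KiB read.
    return n_requests * math.ceil(request_kib / IOP_KIB)

def effective_mib_per_s(iops_limit: int, throughput_cap_mib: float) -> float:
    """Best-case sequential throughput: IOPS-implied rate, clipped by the cap."""
    return min(iops_limit * IOP_KIB / 1024, throughput_cap_mib)

# 1,000 random 4 KiB reads burn 1,000 IOPs; the same 4,000 KiB read
# sequentially burns only 63.
# At gp3 defaults, 3,000 IOPS implies ~187.5 MiB/s in binary units (the
# post rounds to 192), but the 125 MiB/s throughput cap binds first.
```

At gp3's maximums the two caps meet exactly: 16,000 IOPS × 64 KiB is 1,000 MiB/s.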
  • gp3 IOP size: one 64 KiB disk read or write (sequential regime). Random reads count as a full IOP regardless of actual bytes requested (a 4 KiB random read = one full 64 KiB IOP).
  • Theoretical max sequential throughput at 3,000 IOPS: 3,000 × 64 KiB = 192 MiB/s — but capped at 125 MiB/s default throughput limit.
  • Small-database baseline (1P + 2R, 8 vCPU / 32 GB / 500 GB, us-east-1, August 2024): PlanetScale PS-400 = $1,749/mo; RDS db.m6id.2xlarge = $2,136.20/mo; Aurora I/O Optimized (half vCPU) = $1,741.14/mo; Aurora I/O Optimized (same vCPU, double RAM) = $3,369.78/mo.
  • 8× scale (64 vCPU / 256 GB / 4 TB / 24,000 IOPS / 1,000 MiB/s throughput per primary): RDS db.m6id.16xlarge + io1 = $24,196.56/mo; RDS db.m6i.16xlarge + io1 = $20,519.52/mo; PlanetScale 8-shard = $13,992/mo.
  • RDS super-linear multiplier: 8× workload → 11-13× cost on unsharded RDS configurations.
  • PlanetScale linear multiplier: 8× workload → 8× cost on sharded PS-400 (each shard is identical in price to the baseline).
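The multipliers above follow from the quoted prices; a quick cross-check in Python (figures copied from the post, rounding is mine):

```python
# $/mo from the post's small baseline and 8x-scale comparisons
small = {"rds_m6id": 2_136.20, "rds_m6i": 1_649.94, "planetscale": 1_749.00}
scaled = {"rds_m6id": 24_196.56, "rds_m6i": 20_519.52, "planetscale": 13_992.00}

multipliers = {k: round(scaled[k] / small[k], 1) for k in small}
print(multipliers)  # {'rds_m6id': 11.3, 'rds_m6i': 12.4, 'planetscale': 8.0}

# Per-shard IOPS demand at 8x scale, assuming an even shard-key distribution:
per_shard_iops = 24_000 / 8  # 3,000 -- exactly the gp3 per-volume default
```

The 11-13× RDS jump versus the exact 8.0× PlanetScale multiplier is the post's linear-vs-super-linear claim in one line each.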

Caveats

  • Pedagogy voice + vendor-marketing angle: Dicken's numbers have PlanetScale winning at both the small and the large scale. They are reproducible via AWS's public pricing page, but the cost ratio is sensitive to workload shape (cached read-heavy workloads get less value from extra IOPS than I/O-bound OLTP does).
  • No measured per-shard IOPS/throughput distribution: the 8-shard linear-cost claim assumes an even shard-key distribution. Real production workloads have hot shards (common in time-series or user-ID-sharded data), so some shards need more provisioned IOPS even when aggregate demand stays below 8× the baseline. The post doesn't quantify this.
  • No benchmark data: "Performance benchmarks are not included here. These four configurations would likely all have unique performance characteristics in production workloads." Explicit caveat in the body.
  • Sharding overhead elided: VTGate proxy latency tax, cross-shard query cost, scatter-gather fan-out, shard-key migration cost — all named in other wiki sources (sources/2026-04-21-planetscale-database-sharding, sources/2026-04-21-planetscale-guide-to-scaling-your-database-when-to-shard-mysql-and-postgres) but absent here. This post frames sharding purely as a storage-cost lever; in practice sharding has compute-side costs too.
  • August 2024 pricing — snapshot, not tracking. AWS pricing revisions since then (both on EBS and on EC2 instance types) may shift the exact multipliers.
  • gp2 / gp3 / io1 / io2 naming is AWS-specific: the architectural pattern (network-attached storage with per-volume IOPS caps + premium provisioned-IOPS tier + tiered throughput limits) applies across clouds but the exact names and ceilings differ on GCP Persistent Disk, Azure Managed Disk, etc.
  • gp2 vs gp3 bursting framing: the post describes the burst-bucket mechanism generically but gp2's burst model (IOPS tied to volume size + credit-bucket for baselines below the min) differs from gp3's provisioned model. The post lumps them.
  • The Metal note is retrofitted: the article was originally published August 2024 before Metal launched (March 2025). The "check out Metal" note is a post-publication insert, not the original argument.
