CONCEPT
Linear vs super-linear cost scaling¶
Definition¶
Linear cost scaling: when workload demand grows by factor N, cloud cost also grows by factor N.
Super-linear cost scaling: when workload demand grows by factor N, cost grows by more than N — commonly because crossing a capacity threshold forces a regime-shift to a more expensive product tier.
Cloud-provider pricing pages are full of super-linear cliffs: you can buy 3,000 IOPS cheaply on gp3, but breaking 16,000 IOPS forces you onto io2, where every IOPS suddenly costs more — not just the IOPS above the gp3 ceiling. The cost curve isn't a gradient; it's a staircase with steep risers.
Dicken's canonical framing¶
Ben Dicken (sources/2026-04-21-planetscale-increase-iops-and-throughput-with-sharding) uses a worked comparison to canonicalise super-linearity:
- Small database (1 primary + 2 replicas, 8 vCPU, 32 GB RAM, 500 GB): $1,649 / $1,749 / $2,136 / $1,741 / $3,369 per month across RDS/PlanetScale/Aurora variants — all within ~2× of each other.
- 8× scaled database (64 vCPU, 256 GB RAM, 4 TB, 24,000 IOPS, 1,000 MiB/s peak throughput per primary): $20,520 / $24,197 per month on RDS configurations. An 11-13× cost multiplier for an 8× workload.
Dicken verbatim:
"Notice that for the RDS instances, the pricing did not grow linearly. The cost jumped by 11-13x."
Why the jump? Crossing the gp3 IOPS ceiling (16,000) forces the architect onto `io1`/`io2` provisioned-IOPS volumes. The 24,000 IOPS requirement is barely beyond gp3's ceiling, but the regime-shift roughly doubles the cost per IOPS and adds a per-provisioned-IOPS-month premium on top. It's not that 24k IOPS takes 11× more engineering than 3k IOPS; it's that 24k IOPS sits on a different tier of product.
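The cliff can be made concrete with a small cost model. A hedged sketch in Python, where the dollar rates are illustrative placeholders in the spirit of EBS pricing rather than current AWS quotes; only the 16,000-IOPS gp3 ceiling comes from the text above:

```python
# Illustrative sketch of the gp3 -> io2 pricing cliff. All dollar rates are
# placeholder numbers in the spirit of EBS pricing, not current AWS quotes;
# only the 16,000 IOPS gp3 ceiling comes from the text.
GP3_MAX_IOPS = 16_000    # gp3 ceiling
GP3_GB_MONTH = 0.08      # $/GB-month (illustrative)
GP3_FREE_IOPS = 3_000    # baseline IOPS bundled with gp3 (illustrative)
GP3_EXTRA_IOPS = 0.005   # $/provisioned IOPS-month above the baseline
IO2_GB_MONTH = 0.125     # $/GB-month (illustrative)
IO2_IOPS_MONTH = 0.065   # $/provisioned IOPS-month, billed on EVERY IOPS

def monthly_volume_cost(size_gb: int, iops: int) -> float:
    """Cost of one volume, forced onto io2 once demand exceeds the gp3 ceiling."""
    if iops <= GP3_MAX_IOPS:
        extra = max(0, iops - GP3_FREE_IOPS)
        return size_gb * GP3_GB_MONTH + extra * GP3_EXTRA_IOPS
    # Past the ceiling, every IOPS is billed at the premium rate, not just
    # the IOPS above 16k: this discontinuity is the staircase riser.
    return size_gb * IO2_GB_MONTH + iops * IO2_IOPS_MONTH

print(monthly_volume_cost(500, 3_000))     # small database, cheap tier
print(monthly_volume_cost(4_000, 24_000))  # 8x workload, premium tier
```

Below the ceiling the curve is a shallow gradient; one IOPS past it, the whole bill reprices onto the premium tier, which is exactly the riser in the staircase.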
The linearity-by-sharding alternative¶
Horizontal sharding preserves linear cost scaling because each shard's per-volume demand stays below the regime-shift threshold. Dicken's 8-shard PlanetScale variant:
- 8 × PS-400 shards + 1 unsharded reference shard = $13,992/mo
- Each shard handles 1/8 the data and 1/8 the IOPS demand.
- Each shard fits comfortably on a `gp3` volume at default IOPS, with no premium-IOPS tier needed.
- Cost = 8 × (single-shard cost) ≈ 8 × $1,749, a linear multiplier.
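The list above reduces to one line of arithmetic. A minimal sketch, where $1,749/mo is the single-shard PlanetScale figure quoted earlier and holding per-shard cost constant is the idealising assumption:

```python
# Linearity via sharding: N shards at a constant per-shard price give an
# N-fold bill. $1,749/month is the single-shard PlanetScale figure quoted
# above; constant per-shard cost is the idealising assumption.
SINGLE_SHARD_COST = 1_749  # $/month

def sharded_cost(n_shards: int) -> int:
    # Each shard serves 1/N of the data and IOPS demand, so it stays below
    # the gp3 ceiling and never triggers the premium-tier repricing.
    return n_shards * SINGLE_SHARD_COST

print(sharded_cost(8))  # lines up with the $13,992/mo quoted above
```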
Dicken's structural claim:
"Notice that for the RDS instances, the pricing did not grow linearly. The cost jumped by 11-13x. For Aurora, the cost grew linearly, but the base costs were high to begin with. With sharding, we are afforded a linear growth rate and acceptable costs. This is a much better option for long term scalability."
The two ways to achieve linearity:
- Pay for linearity up front — Aurora I/O-Optimized's pricing is linear but has a high floor (IOPS bundled into the instance price, no gp3/io2 tier transition).
- Engineer linearity via sharding — N shards × per-shard cost, where each shard stays in the cheap product tier.
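Both strategies produce the same straight-line shape and differ only in slope. A minimal sketch under hypothetical per-unit rates (the `high_floor` and `cheap_shard` numbers below are stand-ins, not quoted prices):

```python
# Two ways to get a linear cost curve. Both rates below are hypothetical
# stand-ins chosen only to show the slope difference.
def pay_up_front(n: int, high_floor: float = 3_000.0) -> float:
    # Aurora-style: linear from day one, but every unit carries the
    # premium bundled-IOPS price, so the floor is high.
    return n * high_floor

def engineer_via_sharding(n: int, cheap_shard: float = 1_750.0) -> float:
    # Sharding: also linear, because each of the n shards stays on the
    # cheap tier and no unit ever crosses the cliff.
    return n * cheap_shard

# Both curves scale by exactly N when the workload scales by N.
assert pay_up_front(8) == 8 * pay_up_front(1)
assert engineer_via_sharding(8) == 8 * engineer_via_sharding(1)
```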
The three canonical pricing-cliff drivers (cloud databases)¶
- IOPS cliff: cross the `gp3` ceiling (16,000 IOPS) → `io1`/`io2` premium tier.
- Throughput cliff: cross the `gp3` 1,000 MiB/s ceiling → requires `io2`, which itself saturates at 4,000 MiB/s.
- Instance-type cliff: certain EC2 instance classes (d-class with attached NVMe) are the only classes that support 3-node multi-AZ RDS; choosing the wrong instance type locks you into an artificially narrow pricing tier.
All three are regime-shifts, not gradients — which is why cloud-database cost curves are staircases.
Why super-linear pricing exists¶
Several reasons, roughly in order of structural force:
- Underlying hardware economics: premium-IOPS volumes use different SSD technology, tighter fleet isolation, and more network bandwidth per volume. The unit cost is genuinely higher.
- Fleet-sharing economics: cheap volumes tolerate noisy-neighbour behaviour; expensive volumes guarantee it won't happen. The isolation costs capacity.
- Willingness-to-pay segmentation: a customer who needs 24k IOPS for a single-primary database has fewer alternatives than a customer at 3k IOPS. Pricing per-IOPS at a premium captures the customer who truly needs it.
- Capital-recovery on specialty hardware: cloud providers amortise dedicated SSD fleets over a smaller customer base; per-unit cost reflects that.
Architectural implications¶
- Design the workload to stay on the cheap tier. Caching, indexes, query tuning, and working-set management are all cost-engineering tools, not just performance-engineering tools.
- Shard before the ceiling, not after. The 11-13× jump is a cliff; sharding pre-cliff is cheaper than sharding post-cliff and paying the premium-tier bill during the migration. See patterns/exhaust-simpler-scaling-first for the decision framework.
- Direct-attached NVMe dissolves the IOPS cliff. Substrate-level shift — see patterns/direct-attached-nvme-with-replication + Metal. The IOPS cliff exists because EBS imposes administrative caps; local NVMe has no such caps, so the super-linear regime shift never happens.
- Aurora's linearity is a pricing choice, not a structural property. Aurora bundles IOPS into the instance price, so there's no IOPS-tier transition — but the instance price itself is higher, so you pay for the linearity up front.
Seen in¶
- sources/2026-04-21-planetscale-increase-iops-and-throughput-with-sharding — Ben Dicken canonicalises the super-linear multiplier (8× workload → 11-13× cost on RDS with unsharded `io1`) and the linearity-via-sharding alternative (8× workload → 8× cost on 8-shard PlanetScale). The underlying driver is the `gp3` → `io1`/`io2` regime shift, which sharding avoids by keeping each shard below the 16k IOPS `gp3` ceiling.
Related¶
- concepts/horizontal-sharding
- concepts/iops-throttle-network-storage
- concepts/write-throughput-ceiling
- concepts/scaling-ladder
- concepts/price-performance-ratio
- patterns/sharding-as-iops-scaling
- patterns/exhaust-simpler-scaling-first
- patterns/direct-attached-nvme-with-replication
- systems/aws-ebs
- systems/aws-rds
- systems/planetscale-metal