

Linear vs super-linear cost scaling

Definition

Linear cost scaling: when workload demand grows by factor N, cloud cost also grows by factor N.

Super-linear cost scaling: when workload demand grows by factor N, cost grows by more than N — commonly because crossing a capacity threshold forces a regime-shift to a more expensive product tier.

Cloud-provider pricing pages are full of super-linear cliffs: you can buy 3,000 IOPS cheaply on gp3, but breaking 16,000 IOPS forces you onto io2, where every IOPS suddenly costs more — not just the IOPS above the gp3 ceiling. The cost curve isn't a gradient; it's a staircase with steep risers.
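The staircase can be sketched as a step function. A minimal model, assuming a two-tier world where exceeding the gp3 ceiling rebills every IOPS at the premium rate; the dollar figures are illustrative placeholders, not current AWS list prices:

```python
# Sketch of the gp3 → io2 pricing staircase (illustrative rates, not AWS list prices).
GP3_BASE_IOPS = 3_000   # IOPS included free with every gp3 volume
GP3_MAX_IOPS = 16_000   # gp3 ceiling: beyond this you must change tiers
GP3_PER_IOPS = 0.005    # assumed $/IOPS-month above the free 3,000
IO2_PER_IOPS = 0.065    # assumed $/IOPS-month, applied to ALL provisioned IOPS

def monthly_iops_cost(iops: int) -> float:
    """IOPS component of the monthly bill under a two-tier model."""
    if iops <= GP3_MAX_IOPS:
        return max(0, iops - GP3_BASE_IOPS) * GP3_PER_IOPS
    # Regime shift: every IOPS is now billed at the premium rate,
    # not just the IOPS above the gp3 ceiling.
    return iops * IO2_PER_IOPS

print(monthly_iops_cost(16_000))  # 65.0  (13,000 billable IOPS at the cheap rate)
print(monthly_iops_cost(17_000))  # 1105.0 (all 17,000 IOPS at the premium rate)
```

One extra thousand IOPS multiplies the IOPS bill by ~17× in this toy model: the riser of the staircase, not the slope.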

Dicken's canonical framing

Ben Dicken (sources/2026-04-21-planetscale-increase-iops-and-throughput-with-sharding) uses a worked comparison to canonicalise super-linearity:

  • Small database (1 primary + 2 replicas, 8 vCPU, 32 GB RAM, 500 GB): $1,649 / $1,749 / $2,136 / $1,741 / $3,369 per month across RDS/PlanetScale/Aurora variants — all within ~2× of each other.
  • 8× scaled database (64 vCPU, 256 GB RAM, 4 TB, 24,000 IOPS, 1,000 MiB/s peak throughput per primary): $20,520 / $24,197 per month on RDS configurations. An 11-13× cost multiplier for an 8× workload.

Dicken verbatim:

"Notice that for the RDS instances, the pricing did not grow linearly. The cost jumped by 11-13x."

Why the jump? Crossing the gp3 IOPS ceiling (16,000) forces the architect onto io1 / io2 provisioned-IOPS volumes. The 24,000 IOPS requirement is barely beyond gp3's ceiling, but the regime-shift raises the per-IOPS-month price and applies it to every provisioned IOPS, not just those above the gp3 ceiling. It's not that 24k IOPS is 11× more engineering than 3k IOPS; it's that 24k IOPS is on a different tier of product.
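One way to quantify "super-linear" is a scaling exponent: fit cost = k · workload^α, so α = 1 is linear and α > 1 is super-linear. A sketch using Dicken's figures; the pairing of small vs. scaled configurations is my assumption for illustration:

```python
import math

def scaling_exponent(cost_small: float, cost_big: float,
                     workload_factor: float = 8) -> float:
    """Solve cost_big / cost_small = workload_factor ** alpha for alpha.
    alpha == 1 means linear scaling; alpha > 1 means super-linear."""
    return math.log(cost_big / cost_small) / math.log(workload_factor)

# RDS: $1,649/mo small config → $20,520/mo at 8× workload (pairing assumed)
print(round(scaling_exponent(1_649, 20_520), 2))  # 1.21 → super-linear

# Sharded PlanetScale: $1,749/mo → $13,992/mo (8 × PS-400)
print(round(scaling_exponent(1_749, 13_992), 2))  # 1.0 → exactly linear
```

The exponent makes the structural difference visible even when absolute prices differ: the sharded curve sits exactly on α = 1.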

The linearity-by-sharding alternative

Horizontal sharding preserves linear cost scaling because each shard's per-volume demand stays below the regime-shift threshold. Dicken's 8-shard PlanetScale variant:

  • 8 × PS-400 shards + 1 unsharded reference shard = $13,992/mo
  • Each shard handles 1/8 the data and 1/8 the IOPS demand.
  • Each shard fits comfortably on a gp3 volume at default IOPS — no premium-IOPS tier needed.
  • Cost = 8 × (single-shard cost) = 8 × $1,749 = $13,992/mo: a linear 8× multiplier.
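A minimal sizing sketch for "how many shards keep each volume below the cliff?". The headroom factor and the decision to size purely by IOPS are my assumptions; Dicken's 8-way split is also driven by data volume and compute, not IOPS alone:

```python
import math

GP3_MAX_IOPS = 16_000  # gp3 ceiling per volume

def shards_to_stay_on_gp3(total_iops: int, headroom: float = 0.75) -> int:
    """Minimum shard count so each shard's IOPS demand stays comfortably
    below the gp3 ceiling (headroom leaves room for hot shards)."""
    per_shard_budget = GP3_MAX_IOPS * headroom  # 12,000 at default headroom
    return math.ceil(total_iops / per_shard_budget)

# Dicken's 24,000 IOPS workload clears the cliff with even modest sharding:
print(shards_to_stay_on_gp3(24_000))  # 2
```

The point is how close to the ceiling the worked example sits: 24k IOPS is only 1.5× the gp3 limit, so a small horizontal split dissolves the entire premium-tier bill.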

Dicken's structural claim:

"Notice that for the RDS instances, the pricing did not grow linearly. The cost jumped by 11-13x. For Aurora, the cost grew linearly, but the base costs were high to begin with. With sharding, we are afforded a linear growth rate and acceptable costs. This is a much better option for long term scalability."

The two ways to achieve linearity:

  1. Pay for linearity up front — Aurora I/O-Optimized's pricing is linear but has a high floor (IOPS bundled into the instance price, no gp3/io2 tier transition).
  2. Engineer linearity via sharding — N shards × per-shard cost, where each shard stays in the cheap product tier.
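A toy model of the resulting cost curves, with all constants illustrative rather than vendor prices: the monolith is cheapest until the cliff, Aurora-style pricing is linear with a high floor, and sharding is linear from the cheap base:

```python
CLIFF = 4           # workload factor at which the monolith crosses the tier cliff
BASE = 1_750        # assumed $/month for the 1× workload on the cheap tier
PREMIUM = 2.5       # assumed per-unit multiplier once on the premium tier
AURORA_FLOOR = 2.0  # assumed floor multiplier for bundled-IOPS linear pricing

def monolith(n: float) -> float:
    """Super-linear: regime shift past the cliff reprices the whole workload."""
    return BASE * n if n < CLIFF else BASE * n * PREMIUM

def aurora_style(n: float) -> float:
    """Linear, but you pay for the linearity up front via a higher base."""
    return BASE * AURORA_FLOOR * n

def sharded(n: float) -> float:
    """Linear: n shards, each staying on the cheap tier."""
    return BASE * n

for n in (1, 2, 8):
    print(n, monolith(n), aurora_style(n), sharded(n))
# At 8×: monolith 35000.0, aurora-style 28000.0, sharded 14000
```

Below the cliff the monolith and the sharded fleet cost the same; past it, the two linear strategies diverge only in their floor, while the monolith's curve jumps.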

The three canonical pricing-cliff drivers (cloud databases)

  1. IOPS cliff: cross the gp3 ceiling → io1/io2 premium tier.
  2. Throughput cliff: cross the gp3 1,000 MiB/s ceiling → requires io2 and still saturates at 4,000 MiB/s.
  3. Instance-type cliff: certain EC2 instance classes (d-class with attached NVMe) are the only classes that support 3-node multi-AZ RDS — choosing the wrong instance type locks you into an artificially narrow pricing tier.

All three are regime-shifts, not gradients — which is why cloud-database cost curves are staircases.

Why super-linear pricing exists

Several reasons, roughly in order of structural force:

  • Underlying hardware economics: premium-IOPS volumes use different SSD technology, tighter fleet isolation, and more network bandwidth per volume. The unit cost is genuinely higher.
  • Fleet-sharing economics: cheap volumes tolerate noisy-neighbour behaviour; expensive volumes guarantee it won't happen. The isolation costs capacity.
  • Willingness-to-pay segmentation: a customer who needs 24k IOPS for a single-primary database has fewer alternatives than a customer at 3k IOPS. Pricing per-IOPS at a premium captures the customer who truly needs it.
  • Capital-recovery on specialty hardware: cloud providers amortise dedicated SSD fleets over a smaller customer base; per-unit cost reflects that.

Architectural implications

  • Design the workload to stay on the cheap tier. Caching, indexes, query tuning, and working-set management are all cost-engineering tools, not just performance-engineering tools.
  • Shard before the ceiling, not after. The 11-13× jump is a cliff; sharding pre-cliff is cheaper than sharding post-cliff and paying the premium-tier bill during the migration. See patterns/exhaust-simpler-scaling-first for the decision framework.
  • Direct-attached NVMe dissolves the IOPS cliff. Substrate-level shift — see patterns/direct-attached-nvme-with-replication + Metal. The IOPS cliff exists because EBS imposes administrative caps; local NVMe has no such caps, so the super-linear regime shift never happens.
  • Aurora's linearity is a pricing choice, not a structural property. Aurora bundles IOPS into the instance price, so there's no IOPS-tier transition — but the instance price itself is higher, so you pay for the linearity up front.

Seen in

  • sources/2026-04-21-planetscale-increase-iops-and-throughput-with-sharding — Ben Dicken canonicalises the super-linear multiplier (8× workload → 11-13× cost on RDS with unsharded io1) and the linearity-via-sharding alternative (8× workload → 8× cost on 8-shard PlanetScale). The underlying driver is the gp3 → io1/io2 regime shift, which sharding avoids by keeping each shard below the 16k IOPS gp3 ceiling.