
EBS IOPS burst bucket

Definition

An EBS IOPS burst bucket is a credit-bucket rate limiter on Amazon EBS volumes: while the volume runs below its configured IOPS cap, unused IOPS accrue as credit (up to a fixed maximum); when the workload bursts above the cap, the volume spends credit to temporarily exceed it; when the bucket empties, the volume falls back to the baseline rate until credit re-accrues.

This is the same token-bucket rate-limiter pattern used in many cloud rate-limiters, instantiated at the I/O-operations layer.

Dicken's framing

"EBS volumes also allow you to bank unused IOPS, up to a fixed limit. These stored IOPS can be redeemed in the future, allowing the volume to burst up beyond the set IOPS limit for stretches of time. Once the bank is depleted, it will not be able to burst until more IOPS are accumulated."

(Source: sources/2026-04-21-planetscale-increase-iops-and-throughput-with-sharding)

Who has a burst bucket on EBS

  • gp2 has the classic burst bucket: baseline is 3 IOPS per GiB (minimum 100), volumes with baselines below 3,000 IOPS can burst to 3,000 IOPS for stretches of time, and credit accrues whenever usage runs below the baseline — not only when the volume is fully idle.
  • gp3 is provisioned-IOPS rather than bursting — the provisioned IOPS rate is delivered continuously, without a credit-bucket mechanic.
  • io1 / io2 are strictly provisioned with no burst mechanic.

The burst-bucket model is most visible on gp2 volumes, especially small ones (where the 3 IOPS-per-GiB baseline sits below typical workload demand).
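The gp2 figures above can be sketched as a small calculator. The constants (3 IOPS per GiB, the 100-IOPS floor, the 16,000-IOPS cap, the 3,000-IOPS burst ceiling) come from AWS's public EBS documentation and are subject to change; treat this as illustrative, not authoritative.

```python
# Illustrative sketch of gp2's published performance formula.

def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline is 3 IOPS per GiB, floored at 100 and capped at 16,000."""
    return max(100, min(16_000, 3 * size_gib))

def gp2_can_burst(size_gib: int) -> bool:
    """Only volumes whose baseline is below 3,000 IOPS use the burst bucket."""
    return gp2_baseline_iops(size_gib) < 3_000

# A 100 GiB volume: baseline 300 IOPS, bursts to 3,000.
# A 2,000 GiB volume: baseline 6,000 IOPS, never needs the bucket.
```

Note the crossover: once a volume is large enough that its baseline exceeds 3,000 IOPS, the burst mechanic is moot — baseline alone beats the burst ceiling.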

How the bucket fills and drains

Simplified mechanics (exact constants are AWS-specific and documented in the EBS manual):

credit_gained_per_second_below_baseline = baseline_IOPS - current_IOPS
credit_spent_per_second_during_burst    = current_IOPS - baseline_IOPS
max_credit                              = fixed bucket size (cannot accrue indefinitely)

When the volume's current load is below the baseline, unused IOPS fill the bucket at a rate of baseline − current. When the load exceeds the baseline, credit drains at current − baseline. The bucket has a fixed maximum, so sustained idle time doesn't produce unlimited burst headroom.
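The fill/drain mechanics above can be written as a per-second simulation. The constants passed in below (baseline, burst ceiling, bucket size) are illustrative placeholders, not AWS's exact numbers.

```python
def simulate(baseline, burst_cap, bucket_max, demand_per_sec):
    """Per-second token-bucket simulation of a burst bucket.

    Returns the IOPS actually delivered each second: demand is served
    up to the burst ceiling while credit remains, otherwise capped at
    the baseline rate.
    """
    credit = bucket_max          # assume the bucket starts full
    delivered = []
    for demand in demand_per_sec:
        if demand <= baseline:
            # Below baseline: serve everything, bank the unused IOPS.
            credit = min(bucket_max, credit + (baseline - demand))
            delivered.append(demand)
        else:
            # Above baseline: spend credit to burst, up to the ceiling.
            burst = min(demand, burst_cap, baseline + credit)
            credit -= burst - baseline
            delivered.append(burst)
    return delivered

# 300-IOPS baseline, 3,000-IOPS ceiling, a deliberately tiny bucket:
# sustained 3,000-IOPS demand drains it, then delivery falls to 300.
out = simulate(300, 3_000, bucket_max=5_400, demand_per_sec=[3_000] * 4)
# → [3000, 3000, 300, 300]
```

The two branches are exactly the two equations above: the first banks baseline − current, the second spends current − baseline.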

Why it exists

Two structural reasons:

  1. Customer economics: many workloads are bursty (batch jobs, web traffic spikes, morning-login surges). Paying for provisioned IOPS at peak load 24/7 is wasteful; the burst bucket lets a cheaper volume tier serve those workloads as long as the peaks are infrequent.
  2. Fleet economics: the backing storage fleet has aggregate headroom even when individual volumes burst — statistical multiplexing means the probability of all volumes bursting simultaneously is low. The burst bucket is the accounting mechanism that lets AWS oversubscribe the fleet without violating per-volume SLAs.

Production gotcha: steady-state workloads can't rely on burst

If your workload runs sustained at the burst rate rather than the baseline, the bucket drains and the volume falls back to baseline — and stays there, because demand above baseline never lets credit re-accrue. The diagnostic symptom is: "the database was fast for the first hour, then slow." Fast hour = bucket-full burst; slow hours = bucket-empty baseline.

This is the canonical failure mode of under-provisioning EBS on gp2: the workload's sustained demand is above baseline, you only see the burst rate during bucket-fill periods, and the database runs at the lower baseline the rest of the time.
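A back-of-envelope estimate of how long "fast" lasts: divide the bucket size by the net drain rate. The 5.4-million-credit bucket size is gp2's published figure; treat it and the example workload numbers as illustrative.

```python
# How long a full burst bucket lasts under sustained over-baseline demand.

def seconds_until_baseline(bucket_credits, sustained_iops, baseline_iops):
    """Seconds of burst before a full bucket drains to empty."""
    drain_rate = sustained_iops - baseline_iops   # net credits spent per second
    if drain_rate <= 0:
        return float("inf")                       # at/below baseline: never drains
    return bucket_credits / drain_rate

# 100 GiB gp2 volume (300 IOPS baseline) under a sustained 3,000-IOPS load:
secs = seconds_until_baseline(5_400_000, 3_000, 300)
# → 2000.0 seconds, i.e. roughly 33 minutes of burst, then baseline.
```

This is why the failure mode shows up on a timescale of minutes to hours rather than seconds: the bucket is large relative to per-second demand, so the cliff arrives well after the workload starts.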

Why it's a database problem

Databases are steady-state workloads. They don't benefit from bursty IOPS the way a daily batch job does — OLTP traffic is continuous; the working set must be served on every page fault; writes fsync at every commit. A burst bucket helps with traffic spikes (morning user login storm) but does not help with the sustained background load.

Dicken's implicit framing: if you're running a production OLTP database on a bursting gp2 volume and your IOPS demand is above the baseline, you should either:

  • Upgrade to provisioned-IOPS gp3/io1/io2 (steady-state guarantee), or
  • Shard horizontally so each shard's steady-state demand fits under the baseline.
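The sharding arithmetic implied by the second option is a one-liner. This is a hedged sketch — the function name and the headroom factor are illustrative choices, not anything from the source.

```python
import math

def shards_needed(sustained_iops, per_volume_baseline, headroom=0.8):
    """Shard until per-shard demand sits below baseline * headroom.

    headroom < 1 leaves slack so ordinary spikes are absorbed by the
    burst bucket instead of immediately draining it.
    """
    usable = per_volume_baseline * headroom
    return math.ceil(sustained_iops / usable)

# 10,000 sustained IOPS spread across 300-IOPS-baseline gp2 volumes:
n = shards_needed(10_000, 300)
# → 42 shards — a count that high is usually a signal to switch to
# provisioned-IOPS volumes instead of sharding further.
```

The headroom factor matters: sizing each shard exactly at baseline leaves zero margin, so any spike starts draining the bucket.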

Relationship to other rate-limit patterns

The EBS burst bucket is a specific instance of the token-bucket rate-limit pattern:

  • Token-bucket in HTTP APIs (e.g. AWS API Gateway, rate-limited REST endpoints) — same mechanic at the request layer.
  • Leaky-bucket in network QoS — different dynamics (constant drain) but similar burst-over-baseline accounting.
  • CPU credit bursting on t2/t3 EC2 instances — same bucket mechanism applied to CPU time rather than IOPS.
