SYSTEM Cited by 1 source

PlanetScale Traffic Control

What it is

Traffic Control is a feature of the PlanetScale Insights extension that enforces per-workload-class resource budgets on PlanetScale Postgres clusters. It is the canonical wiki instance of the patterns/workload-class-resource-budget pattern.

Queries are classified by SQLCommenter metadata tags appended to the SQL (e.g. /* action=analytics */). A Resource Budget defines limits for that class; queries exceeding the budget are blocked and expected to be retried by the caller.
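A minimal sketch of client-side tagging, assuming the application builds its own SQL strings. The helper name `tag_query` is hypothetical. It follows the general SQLCommenter convention (URL-encoded `key='value'` pairs, sorted by key, comma-separated), which differs slightly from the unquoted form shown above; the exact form Traffic Control expects is not stated here.

```python
from urllib.parse import quote


def tag_query(sql: str, **tags: str) -> str:
    """Append a SQLCommenter-style trailing comment carrying key='value' tags.

    Hypothetical helper for illustration; not a PlanetScale API.
    """
    if not tags:
        return sql
    # URL-encode keys and values, sort by key, and join with commas.
    pairs = ",".join(
        f"{quote(key, safe='')}='{quote(value, safe='')}'"
        for key, value in sorted(tags.items())
    )
    # Drop a trailing semicolon so the comment stays attached to the statement.
    return f"{sql.rstrip().rstrip(';')} /*{pairs}*/"


tagged = tag_query("SELECT count(*) FROM events;", action="analytics")
print(tagged)
```

In practice, a SQLCommenter integration for the application's ORM or driver would usually add these tags automatically rather than a hand-written helper.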

Three budget dials

Dial | Controls
Server share + burst limit | Percentage of server resources, plus how quickly they can be consumed
Per-query limit | Seconds of full-server usage a single query may consume
Maximum concurrent workers | Percentage of max_worker_processes available to this class at any instant

The third dial is the load-bearing one for protecting the MVCC horizon: capping a low-priority class to 1 concurrent worker opens windows where autovacuum can actually run.

What problem it solves

Upstream Postgres offers statement_timeout (7.3+), idle_in_transaction_session_timeout (9.6+), and transaction_timeout (17.0+). Each bounds how long a single statement, idle transaction, or transaction may last; none can limit workload-class concurrency. Three continuously-overlapping 40-second analytics queries keep the MVCC horizon pinned without any individual query tripping a timeout; autovacuum sees a continuously-pinned horizon and can't reclaim dead tuples produced by other workloads (e.g. a queue table on the same cluster).
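The overlap argument can be checked with a small timeline calculation. This is a sketch under stated assumptions: the 40-second duration comes from the text, while the 14-second start spacing, the 60-second timeout, and the 10-minute window are illustrative values chosen so that at most three queries overlap.

```python
QUERY_SECONDS = 40       # duration of each analytics query (from the text)
STAGGER_SECONDS = 14     # assumed spacing between query starts
STATEMENT_TIMEOUT = 60   # assumed per-statement timeout, for illustration
WINDOW = 600             # assumed observation window, in seconds


def idle_seconds(intervals, window):
    """Seconds in [0, window) during which no query is active."""
    covered, cursor = 0, 0
    for start, end in sorted(intervals):
        # Clip each interval to the part not already counted and inside the window.
        start, end = max(start, cursor), min(end, window)
        if end > start:
            covered += end - start
            cursor = end
    return window - covered


queries = [(s, s + QUERY_SECONDS) for s in range(0, WINDOW, STAGGER_SECONDS)]

# No single query runs long enough to trip the timeout...
timed_out = [q for q in queries if q[1] - q[0] > STATEMENT_TIMEOUT]
# ...yet there is never an instant with zero analytics queries active.
idle = idle_seconds(queries, WINDOW)
peak = max(sum(1 for s, e in queries if s <= t < e) for t in range(WINDOW))

print(f"timed out: {len(timed_out)}, idle seconds: {idle}, peak concurrency: {peak}")
```

Under these assumptions every query finishes well inside the timeout, and the window still contains no second in which the analytics class is inactive. That is the gap a per-query timeout cannot create.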

Traffic Control supplies the class of mechanism upstream Postgres lacks: it limits how many queries of a class can be active at once, not how long any one query runs.

Measured effect

In PlanetScale's stress test (Source: sources/2026-04-11-planetscale-keeping-a-postgres-queue-healthy):

  • Workload: 800 jobs/sec producer + 3 concurrent action=analytics 120-second queries + 8 workers + 10 ms work time. 15-minute run on a PlanetScale cluster.
  • Traffic Control disabled: 155,000-job backlog, 300+ ms lock time, 383,000 dead tuples at end, VACUUM blocked. Death spiral.
  • Traffic Control enabled (analytics cap = 1 concurrent worker, 25% of max_worker_processes): 0 jobs backlog, 2 ms lock time, dead tuples cycling 0–23,000, VACUUM runs normally in the gaps, 15 analytics queries completed in 15 min. Completely stable.

The analytics reports still run to completion — just serialized instead of 3-way concurrent.

Caller requirements

Applications must implement retry logic for blocked queries. Traffic Control does not reduce the total work; it smooths the rate at which work is performed. Without caller retry, throttling converts "database dies" into "queries fail."
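A minimal sketch of caller-side retry with exponential backoff and jitter. The `QueryBlocked` exception is a hypothetical stand-in: the error a driver actually surfaces for a blocked query is not specified here, so the `except` clause would need to match whatever the application really sees. The delay values are likewise assumptions.

```python
import random
import time


class QueryBlocked(Exception):
    """Hypothetical stand-in for the error raised when a query is blocked."""


def run_with_retry(run_query, max_attempts=5, base_delay=0.1, max_delay=5.0,
                   sleep=time.sleep):
    """Call run_query(), retrying on QueryBlocked with jittered exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return run_query()
        except QueryBlocked:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # Full jitter: sleep a random amount up to the capped exponential delay.
            sleep(random.uniform(0, min(max_delay, base_delay * 2 ** attempt)))


# Demo with a fake query that is blocked twice before succeeding.
attempts = {"n": 0}


def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise QueryBlocked("budget exceeded")
    return "rows"


result = run_with_retry(flaky, sleep=lambda _seconds: None)
print(result, attempts["n"])
```

Jitter keeps several blocked callers from retrying in lockstep, which would otherwise recreate the burst the budget was meant to smooth.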

Availability

Exclusive to PlanetScale Postgres clusters; not available in upstream Postgres, managed AWS/GCP/Azure Postgres services, or self-hosted Postgres. Upstream-compatible approximations (PgBouncer pool-mode caps + application-side rate limiting) cover some of the ground but don't recognize SQLCommenter tags natively.
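As an illustration of the application-side approximation, the sketch below caps per-class concurrency with a semaphore. The names `CLASS_LIMITS` and `class_slot` are hypothetical. The cap holds only within a single process, so it is a much weaker guarantee than a server-side budget.

```python
import threading
from contextlib import contextmanager

# Hypothetical per-class caps; classes not listed here are left uncapped.
CLASS_LIMITS = {"analytics": 1}
_semaphores = {
    name: threading.BoundedSemaphore(limit) for name, limit in CLASS_LIMITS.items()
}


@contextmanager
def class_slot(workload_class, timeout=None):
    """Hold one concurrency slot for workload_class while the body runs."""
    semaphore = _semaphores.get(workload_class)
    if semaphore is None:
        yield  # uncapped class: run immediately
        return
    if not semaphore.acquire(timeout=timeout):
        raise TimeoutError(f"no free slot for class {workload_class!r}")
    try:
        yield
    finally:
        semaphore.release()


# Usage: wrap each analytics query so only one runs at a time in this process.
with class_slot("analytics"):
    pass  # the tagged analytics query would execute here
```

Other application instances, ad hoc sessions, and BI tools bypass a limiter like this entirely, which is one reason it only partially substitutes for enforcement inside the database.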

