Skip to content

PATTERN Cited by 1 source

Workload-class resource budget

Problem

A database cluster runs multiple workload classes concurrently: high-priority transactional traffic, medium-priority user-facing analytics, low-priority scheduled reports, job-queue workers. The classes compete for the same resources (CPU, I/O, backends, worker processes) but have different priorities and different blast radii if they fall behind.

Per-query timeouts (statement_timeout, transaction_timeout) can kill runaway individuals but cannot limit how many queries from a class can run concurrently. Continuously overlapping short queries from a low-priority class defeat all timeout-based controls — no individual query hits a duration limit, yet the class as a whole keeps the MVCC horizon pinned, starves autovacuum, or monopolizes I/O.

Autovacuum tuning can't help either — it has nothing it can legally reclaim while the horizon is pinned.

Solution

Tag queries with a workload class (typically via query-string metadata or a session variable) and apply a resource budget to each class with at least one of three dials:

  1. Maximum concurrent workers / backends — percentage of max_worker_processes that class can consume at any instant. Primary lever for protecting the MVCC horizon: cap action=analytics to e.g. 1 concurrent worker and the horizon opens between queries.
  2. Server share + burst limit — fraction of CPU/I/O the class may consume, with a burst ceiling. Smooths spikes from a class that occasionally runs heavy queries.
  3. Per-query limit — seconds of full-server resource use per individual query, independent of wall-clock time.

Queries that exceed a class's budget are blocked and the caller is expected to retry later, not killed outright. The total work happens, just at a rate the higher-priority class can coexist with.

Callers must implement retry logic — throttling without retry just moves the failure from "database dies" to "queries fail."

Tagging mechanism

Typical implementation: each class tags queries with SQLCommenter metadata appended to the SQL — e.g.

/* action=analytics,team=growth */
SELECT ... FROM ...

The resource-budget engine parses the comment, maps action=analytics to a budget definition, and enforces the dials. Non-tagged queries default to the "unclassified" bucket (typically unconstrained).

Canonical instance: PlanetScale Traffic Control

systems/planetscale-traffic-control is the canonical wiki instance — part of the Insights extension on PlanetScale Postgres. All three dials (server share + burst, per-query limit, max concurrent workers) are configurable per-budget. SQLCommenter tags select the budget.

Measured outcome in Griggs's 800-jobs/sec stress test with 3 concurrent overlapping 120-sec analytics queries (Source: ):

Metric Disabled Enabled (analytics cap = 1)
Queue backlog 155,000 jobs 0 jobs
Lock time 300+ ms 2 ms
Dead tuples at end 383,000 0–23,000 cycling
Analytics queries 3 concurrent, overlapping 1 at a time, 2 retrying
VACUUM effectiveness Blocked (horizon pinned) Normal (windows between queries)
Outcome Death spiral Completely stable
  • patterns/budget-enforced-quota-throttle (Pinterest Piqama) applies the same idea at Kubernetes-resource layer: quota-managed consumers must stay within an assigned budget or throttle.
  • AWS RDS Performance Insights + AWS Proactive Insights surface the telemetry but don't enforce class-level concurrency caps.
  • Postgres ALTER USER ... CONNECTION LIMIT is the coarsest version at the role granularity.
  • pgbouncer pool-mode pool_size per database/user can approximate concurrency caps at the proxy tier.

Each is a narrower version of the general shape: tag work by class, budget resources per class, retry-block excess.

When it doesn't apply

  • Single-workload clusters — no classes to distinguish.
  • Low-churn databases — MVCC horizon is a non-issue so the primary motivator is absent.
  • Workloads where every query is equal-priority (all-or-nothing throughput is the objective).

Seen in

  • Third canonical wiki instance, the Go-implementation companion. Josh Brown (2026-04-02) canonicalises five composable tagging axes that each instantiate the workload-class-resource-budget pattern from a different layer: (1) service (Postgres username / application_name), (2) route (patterns/route-tagged-query-isolation via HTTP middleware), (3) deployment / canary (startup env var DEPLOYMENT_TAG), (4) subscription tier (patterns/tier-tagged-query-isolation via auth middleware), (5) workload ([[patterns/dedicated-application-name-per- workload]] via dedicated connection pool). All five axes are AND-composed at enforcement time ([[concepts/composable-tag- axes]]). Canonical caller-side completeness: Enforce-mode rejection surfaces as [[concepts/sqlstate-53000-traffic- control-error|SQLSTATE 53000 with [PGINSIGHTS] Traffic Control: prefix]], with role-dependent response (503 for non-critical; exponential-backoff retry 100ms → 200ms → 400ms for critical paths). Warn-mode observability comes via pgx/v5 OnNotice hook. The three posts together canonicalise workload-class-resource-budget at three altitudes: Griggs 2026-04-11 (mixed-workload contention motivation + MVCC- horizon protection), Dicken 2026-03-31 (user-perceived- priority + spike-survival framing), Brown 2026-04-02 (Go implementation + five composable axes + caller-side Enforce / Warn handling).

  • sources/2026-04-21-planetscale-graceful-degradation-in-postgresSecond canonical wiki instance, framed on a different axis. Ben Dicken canonicalises the user-perceived priority variant: tag queries with critical / important / best-effort priority and shed the lowest tier under spike. Same Traffic Control mechanism, same three dials, but the classification axis is user-facing feature importance rather than workload type — and the load-bearing use case is surviving a 10× viral-event spike rather than preventing mixed-workload MVCC-horizon pinning. Canonical three-tier budget recipe: critical = no server-share cap + 2-sec per-query max; important = 25% server share + moderate concurrency; best-effort = 20% server share + low concurrency

  • live-disable-able under spike. Also canonicalises warn → enforce as the budget-tuning lifecycle (observe over-budget counts before committing to enforcement). Paired with this post, the workload-class-resource-budget pattern has both coexistence-on-a-cluster and survive-a-spike framings — same primitives, two orthogonal framings. See patterns/shed-low-priority-under-load for the spike- survival pattern that specialises this one.

  • — Canonical wiki instance. Griggs argues that autovacuum tuning and Postgres-upstream timeouts are structurally insufficient for mixed-workload clusters because they target duration, not concurrency. Traffic Control is the pattern-instantiating mechanism, measured to convert a death-spiral to complete stability at 800 jobs/sec + overlapping analytics.

Last updated · 542 distilled / 1,571 read