Postgres queue table

Definition

A Postgres queue table is a table used as a job queue: workers poll for pending rows, claim one, perform the associated work, and delete (or mark done) the row on commit. The canonical shape:

CREATE TABLE jobs (
  id BIGSERIAL PRIMARY KEY,
  run_at TIMESTAMPTZ DEFAULT now(),
  status TEXT DEFAULT 'pending',
  payload JSONB
);
CREATE INDEX idx_jobs_fetch ON jobs (run_at) WHERE status = 'pending';

Workers claim with SELECT ... FOR UPDATE SKIP LOCKED and either delete or update the row on success.
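A minimal sketch of one claim cycle against the table defined above (the `$1` parameter and the `run_at` check are illustrative, not a fixed API):

```sql
BEGIN;

-- Claim the oldest runnable job; SKIP LOCKED makes concurrent
-- workers pass over rows already claimed by another transaction.
SELECT id, payload
FROM jobs
WHERE status = 'pending' AND run_at <= now()
ORDER BY run_at
LIMIT 1
FOR UPDATE SKIP LOCKED;

-- ... perform the work in application code ...

-- On success, delete the row (or: UPDATE jobs SET status = 'done').
DELETE FROM jobs WHERE id = $1;

COMMIT;
```

Because the claim, the work's side effects, and the delete commit together, a crash mid-job simply rolls the row back to `pending`.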

Why teams do this ("just use Postgres")

Keeping the job queue in the same database as application data means the job row and the application mutation commit together. If the outer transaction rolls back, the job row rolls back with it — no distributed-transaction coordination needed. This is the property external queues (SQS, Kafka, RabbitMQ) cannot easily give you.

The unique workload shape

A queue table has a shape no other workload shares:

  • Most rows are transient — inserted, read once, deleted.
  • Table size stays roughly constant while cumulative throughput is enormous.
  • Every successful job produces one dead tuple.
  • The hot path is an index scan ordered by run_at, and that B-tree accumulates references to dead tuples.

This high-write-churn shape is what makes queue tables specifically sensitive to MVCC-horizon pinning.
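Horizon pinning is observable directly: any backend holding an old backend_xmin prevents vacuum from reclaiming dead tuples newer than that xid, no matter how aggressively autovacuum runs. A diagnostic sketch using standard pg_stat_activity columns:

```sql
-- Which sessions are pinning the MVCC horizon, oldest first.
SELECT pid, state, age(backend_xmin) AS xmin_age, query
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY age(backend_xmin) DESC;
```

A large xmin_age on a long-running analytics query or idle-in-transaction session is the usual culprit.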

The failure mode

If dead tuples accumulate faster than autovacuum reclaims them:

  • Index scans traverse more dead leaf entries per claim.
  • Lock times on the claim query rise from baseline-ms to tens or hundreds of ms.
  • Queue depth starts growing even though individual jobs still complete.
  • No individual query or transaction trips a timeout.
  • The dead-tuple count, not queue depth, is the leading indicator.

PlanetScale's Griggs measured claim-query lock time rising from a 2–3 ms baseline to 180 ms at 24k dead tuples on Postgres 18, and 300+ ms (155k-job backlog, 383k dead tuples) under stress with overlapping analytics (Source: sources/2026-04-11-planetscale-keeping-a-postgres-queue-healthy).
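Since dead-tuple count leads queue depth, it is the number to alert on. A sketch of a monitoring query against the standard statistics view (the table name jobs is taken from the definition above):

```sql
-- Dead-tuple count and vacuum recency for the queue table.
SELECT n_dead_tup, n_live_tup, last_autovacuum, last_vacuum
FROM pg_stat_user_tables
WHERE relname = 'jobs';
```

Alerting when n_dead_tup crosses a threshold well below the measured degradation point gives time to act before claim latency rises.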

What helps vs what doesn't

  • SELECT ... FOR UPDATE SKIP LOCKED: eliminates claim serialization, but doesn't fix dead-tuple scan cost.
  • Batch claims (10 jobs per transaction): amortizes scan cost per commit; dead-tuple production rate is unchanged.
  • Partial index WHERE status = 'pending': keeps the hot index small, but doesn't stop B-tree bloat from deletes.
  • Aggressive autovacuum on the queue table: helps when the horizon is free; useless when the horizon is pinned.
  • Short worker transactions: keep workers from pinning the horizon, but don't stop other workloads from pinning it.
  • statement_timeout / transaction_timeout: kills runaway queries; cannot limit overlapping-query concurrency.
  • patterns/workload-class-resource-budget: caps concurrent low-priority workers and lets vacuum run between them; the structural fix.
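Autovacuum can be tuned per table without touching the global configuration, using Postgres storage parameters. A sketch; the specific values are illustrative, not a recommendation:

```sql
-- Vacuum the queue table after a fixed row count rather than a
-- fraction of table size, and remove the cost-based throttle.
ALTER TABLE jobs SET (
  autovacuum_vacuum_scale_factor = 0,
  autovacuum_vacuum_threshold = 1000,
  autovacuum_vacuum_cost_delay = 0
);
```

Note this only makes vacuum run more often; it cannot reclaim tuples still visible to a pinned horizon.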

When to use a dedicated queue system instead

The "just use Postgres" thesis is defensible if the queue table lives in a single-workload Postgres cluster, or if mixed-workload concurrency is actively managed. If long-running analytics, read replicas with hot-standby feedback, or long-lived application transactions share the database, dead-tuple accumulation can become a normal operating condition, and a dedicated queue system (SQS/Kafka/RabbitMQ/Oban on isolated Postgres) becomes the simpler answer.

Seen in

  • sources/2026-04-11-planetscale-keeping-a-postgres-queue-healthy — Canonical wiki definition of the shape + failure mode. Griggs reproduces Brandur's 2015 que-degradation-test on Postgres 18 and shows degradation curves are "almost identical" despite a decade of Postgres improvements — the underlying mechanism is unchanged, the threshold is higher.