CONCEPT Cited by 4 sources

Connection pool exhaustion

Definition

A connection pool is a fixed-size set of pre-established database connections shared across application workers. An application worker takes a connection from the pool for the duration of a query (or transaction), then returns it.

Connection pool exhaustion is the state where every connection in the pool is in use; new requesters either:

  • Wait for a connection to be returned (bounded-queue behaviour).
  • Get rejected immediately (fail-fast behaviour).
  • Create a new connection outside the pool (uncommon; defeats the pool's purpose).
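The first two behaviours can be sketched with a toy pool. This is a minimal illustration, not any real driver's API: TinyPool, PoolExhausted, and the wait_seconds parameter are hypothetical names, and strings stand in for real database connections. The third behaviour (creating connections outside the pool) is deliberately left out.

```python
import queue


class PoolExhausted(Exception):
    """Raised when no connection is available under the chosen policy."""


class TinyPool:
    """Fixed-size pool of placeholder connection objects (illustrative only)."""

    def __init__(self, size):
        self._idle = queue.Queue(maxsize=size)
        for i in range(size):
            self._idle.put(f"conn-{i}")  # stand-in for a real DB connection

    def acquire(self, wait_seconds=None):
        """wait_seconds=None fails fast; a number waits up to that long."""
        try:
            if wait_seconds is None:
                return self._idle.get_nowait()           # fail-fast behaviour
            return self._idle.get(timeout=wait_seconds)  # bounded wait
        except queue.Empty:
            raise PoolExhausted("all connections are in use") from None

    def release(self, conn):
        self._idle.put_nowait(conn)


pool = TinyPool(size=2)
a = pool.acquire()
b = pool.acquire()

try:
    pool.acquire()  # pool is exhausted: rejected immediately
except PoolExhausted as exc:
    print("fail-fast:", exc)

pool.release(a)
c = pool.acquire(wait_seconds=0.1)  # succeeds because a slot was returned
```

A waiting requester that is never served within wait_seconds ends up in the same PoolExhausted path, so in this sketch the two policies differ only in how long they are willing to wait.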

Why it's a high-quality throttling signal

Among the metrics Shlomi Noach walks through in Part 1 of Anatomy of a Throttler (sources/2026-04-21-planetscale-anatomy-of-a-throttler-part-1), pool exhaustion is the one signal with a natural threshold:

"Who decides the size of the pool in the first place? If someone picked a number such as 50 or 100, isn't that number just artificial? It may well be, but pool size was likely chosen for some good reason(s). It is perhaps derived from some database configuration, which is itself derived from some hardware limitation. And while the choice of metric could possibly change arbitrarily, it is still sensible, as far as throttling goes, to push back when the pool is exhausted. The throttler thereby relies on the greater system configuration and does not introduce any new artificial thresholds."

In other words, the operator already chose a pool size based on:

  • max_connections in the database
  • Memory per connection (MySQL per-thread buffers / PostgreSQL backend process overhead)
  • Expected concurrency profile
  • Connection-establishment cost / TLS handshake cost

A throttler that pushes back when the pool is 100% used inherits the existing system configuration as its threshold. It does not need to ask the operator to pick a new number.
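A throttler check built on this idea could look like the sketch below. PoolStats and should_push_back are hypothetical names, not part of any throttler described in the source; the only threshold is the pool size the operator already configured.

```python
from dataclasses import dataclass


@dataclass
class PoolStats:
    size: int     # configured pool size, chosen by the operator
    in_use: int   # connections currently checked out


def should_push_back(stats: PoolStats) -> bool:
    """Binary signal: throttle only when every connection is in use."""
    return stats.in_use >= stats.size


print(should_push_back(PoolStats(size=50, in_use=40)))  # False
print(should_push_back(PoolStats(size=50, in_use=50)))  # True
```

Note that the function takes no tunable parameter of its own: changing the pool size changes the throttling point with it.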

Contrast with intermediate levels of utilisation

Pool exhaustion (100%) is a strong signal. Pool utilisation (60% vs 80%) is a much weaker one:

"An exhausted pool is a strong indication of excessive load, while the difference between a 60% and an 80% used pool is not as clear an indication."

This is why pool exhaustion is typically a binary throttling signal — "reject when full" — rather than a continuous one. Continuous levels conflate transient burstiness with sustained load in ways that are hard to separate without knowing workload shape.

Limits of the signal

  • Pool connections span transactions, not queries. A connection can be held across multiple queries in a transaction, across multiple transactions, and across application logic between them. Pool exhaustion therefore reports on held connections, not on actively executing ones — a subtle distinction from concepts/threads-running-mysql.
  • Per-pool, not per-database. Modern applications may have many pools (per-service, per-region, per-tenant); pool exhaustion in one does not necessarily indicate database-level distress.
  • Pooler-level pools (PgBouncer, ProxySQL, VTGate) introduce a second tier of exhaustion — the database can be unexhausted while the pooler is full, and vice versa.
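The first limit (held versus actively executing connections) can be made concrete with a small instrumented pool. This is a sketch under assumptions: InstrumentedPool and its held / executing counters are hypothetical, and no real queries are run.

```python
import threading
from contextlib import contextmanager


class InstrumentedPool:
    """Tracks held connections separately from actively executing ones."""

    def __init__(self, size):
        self.size = size
        self.held = 0        # checked out by a worker
        self.executing = 0   # checked out AND running a query right now
        self._lock = threading.Lock()

    @contextmanager
    def connection(self):
        with self._lock:
            if self.held >= self.size:
                raise RuntimeError("pool exhausted")
            self.held += 1
        try:
            yield self
        finally:
            with self._lock:
                self.held -= 1

    @contextmanager
    def query(self):
        with self._lock:
            self.executing += 1
        try:
            yield
        finally:
            with self._lock:
                self.executing -= 1


pool = InstrumentedPool(size=2)
with pool.connection(), pool.connection():
    # Both workers hold a connection but are busy with application logic,
    # so no query is executing even though the pool is exhausted.
    snapshot = (pool.held, pool.executing)
    with pool.query():
        during_query = (pool.held, pool.executing)

print(snapshot)      # (2, 0)
print(during_query)  # (2, 1)
```

In the first snapshot the pool reports exhaustion while the database itself would see no running queries from these workers, which is the gap between this signal and a threads-running style metric.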

Relationship to queueing theory

Pool exhaustion is the queue-full state of a bounded queue: the pool has capacity N, arrivals outpace the service rate until all N slots are occupied, and new arrivals are then rejected or stall. See concepts/queueing-theory and concepts/backpressure for the broader framing.

Seen in

  • Canonical benchmarked anchor for pool-ceiling scaling via a proxy-tier architecture. Liz van Dijk (PlanetScale, 2022-11-01) sustains 1,000,000 concurrent open connections against PlanetScale via a two-tier pool (VTTablet in-cluster + Global Routing Infrastructure at edge), which is 62.5× RDS MySQL's 16k exhaustion ceiling. This is an empirical demonstration that pool-ceiling exhaustion is a property of standalone-DB memory architecture (see concepts/max-connections-ceiling, concepts/memory-overcommit-risk), not an inherent limit on concurrent-client counts. The two-tier pool architecture sidesteps exhaustion by enforcing the memory budget at a proxy tier while accepting many more client connections upstream of it.

  • sources/2026-04-21-planetscale-anatomy-of-a-throttler-part-1: canonical framing of pool exhaustion as the one throttling-signal metric with a natural, system-configuration-derived threshold. Noach lands on it as the positive counter-example to threads_running and load average, both of which lack a stable threshold.

  • Canonical wiki instance of pool-ceiling-as-scale-out-wall. Jarod Reyes (PlanetScale, 2021-09-30): "While RDS limits connections to 16,000, PlanetScale has been designed to scale to nearly limitless database connections per database." The 16k ceiling on RDS MySQL is a canonical instance of the failure-when-full mode: once it is reached, customers either manually tune up max_connections (and pay the per-backend memory cost) or interpose a customer-side connection pool. PlanetScale / Vitess's structural answer is the combination of (a) VTGate as a connection-multiplexing proxy tier absorbing client-fleet churn and (b) Vitess query consolidation capping upstream pool pressure at O(unique queries in flight) rather than O(total callers). Canonical framing for why connection-pool exhaustion is both a throttling signal (Noach) and a scale-out ceiling (Reyes): two lenses on the same natural-threshold phenomenon.

  • Shared-nothing PHP as an aggravating factor on the client side. Matthieu Napoli (2023-05-03) canonicalises the failure mode: every AWS Lambda invocation of a shared-nothing PHP app spins up a fresh PHP process image, opens a fresh TLS connection to the database, authenticates, and tears it all down at the end of the request. Aggregate client connection count scales with invocation count, not with application-side pool size: a 1000-concurrent Lambda fan-out is 1000 fresh DB connections regardless of each worker's configured pool size. See concepts/shared-nothing-php-request-model for the architectural framing and patterns/persistent-process-for-serverless-php-db-connections for the client-side fix pattern (Laravel Octane with OCTANE_PERSIST_DATABASE_SESSIONS=1 within a Lambda execution context). Complementary to the server-side proxy-tier pool fix.
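The shared-nothing entry above is at heart an arithmetic claim, which a toy count can illustrate. The function name and the burst figures (50,000 invocations across 1,000 execution contexts) are assumptions chosen for illustration, not numbers from the cited source.

```python
def connections_opened(invocations, execution_contexts, persistent):
    """Count connection handshakes over a burst of invocations.

    Shared-nothing: every invocation opens (and closes) its own connection.
    Persistent: each execution context opens one connection and reuses it.
    """
    return execution_contexts if persistent else invocations


# Hypothetical burst: 1,000 concurrent execution contexts, 50 requests each.
burst = dict(invocations=50_000, execution_contexts=1_000)

print(connections_opened(**burst, persistent=False))  # 50000
print(connections_opened(**burst, persistent=True))   # 1000
```

In this toy model the number of simultaneously open connections is bounded by the number of execution contexts in both cases; what the persistent-process pattern changes is how many fresh TLS handshakes and authentications are paid.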

Cross-region latency as a trigger

When an API is deployed active-active across regions but its database primary exists in only one region, the remote API instance pays the cross-region round-trip time (RTT) on every query. Each request therefore holds its connection much longer (e.g., 3 seconds vs. 10 ms), and the client-side pool exhausts rapidly. Cloudflare observed exactly this: their Amsterdam API instance exhausted its pool because every query to Portland Postgres added ~50ms+ per hop, causing cascading timeouts and Kafka partition starvation (Source: sources/2026-06-12-cloudflare-scaling-security-insights).

Fix: switch to concepts/active-passive-failover — collocate the active API with the primary database.
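Why longer hold times exhaust a pool so quickly follows from Little's law (average connections in use = arrival rate × average hold time). The sketch below applies it to the 10 ms and 3 second hold times mentioned above; the request rate of 100 per second and the pool size of 50 are assumptions for illustration only.

```python
def connections_in_use(requests_per_second, hold_ms):
    """Little's law: average connections in use = arrival rate x hold time."""
    return requests_per_second * hold_ms / 1000


POOL_SIZE = 50  # assumed pool size
RATE = 100      # assumed requests per second

local = connections_in_use(RATE, hold_ms=10)      # primary in the same region
remote = connections_in_use(RATE, hold_ms=3000)   # primary across regions

print(local, local >= POOL_SIZE)    # 1.0 False
print(remote, remote >= POOL_SIZE)  # 300.0 True
```

Under these assumed numbers the same traffic that needs about one connection next to the primary would need about 300 from the remote region, six times the assumed pool, so the pool exhausts without any increase in request volume.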
