PATTERN Cited by 1 source

Consolidate identical in-flight queries

Intent

When many concurrent callers ask the same read-only question against a backing store within a short window, issue the question upstream exactly once, and fan out the single result to all waiting callers. Cap upstream pressure at O(unique queries in flight) rather than O(total callers).

Problem

A backing database sees sudden correlated load on a single hot read — a viral row, a cache-invalidated-but-not-yet-repopulated key, a row the application fetches on every request. Without consolidation:

  • Each caller occupies an upstream connection for the duration of its own execution.
  • If the query is slow (cold buffer, competing workload, large scan), caller queues pile up as fast as traffic arrives.
  • Upstream connection pool drains. New queries — including unrelated ones — get blocked on the now-exhausted pool.
  • What started as a single slow read becomes a full-database stall. This is the cascading-outage failure mode of a thundering herd at the database-proxy altitude.

The problem is structural: each caller's execution cost is duplicated N times for the same computed result. The proxy tier sees N identical queries in flight and fans N upstream connections, even though one upstream execution would be sufficient.

Solution

Interpose a proxy tier between callers and the backing database that:

  1. Parses each arriving query to derive a stable hash over the query text + bind parameters (normalized for whitespace, comments, and keyword case).
  2. Maintains an in-flight map keyed by that hash. Each entry points at a single upstream execution with a list of waiting callers.
  3. On arrival of a query with a hash already in the map, enqueue the new caller on the existing entry instead of issuing a new upstream query.
  4. When the upstream execution returns, fan the result out to every waiting caller.
  5. Remove the map entry once all callers have been served.
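
Step 1's normalization-and-hash can be sketched as follows. This is only an illustration: a real proxy normalizes the parsed statement rather than regex-scrubbing text, and `normalize`/`queryKey` are invented names, not Vitess API.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"regexp"
	"strings"
)

// normalize strips comments, collapses whitespace, and lowercases the
// query text so trivially different spellings hash identically.
func normalize(query string) string {
	noComments := regexp.MustCompile(`--[^\n]*|/\*.*?\*/`).ReplaceAllString(query, " ")
	collapsed := regexp.MustCompile(`\s+`).ReplaceAllString(noComments, " ")
	return strings.ToLower(strings.TrimSpace(collapsed))
}

// queryKey derives the in-flight map key from the normalized text plus
// bind parameters, separated so params can't collide with query text.
func queryKey(query string, params ...string) string {
	h := sha256.New()
	h.Write([]byte(normalize(query)))
	for _, p := range params {
		h.Write([]byte{0})
		h.Write([]byte(p))
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	a := queryKey("SELECT * FROM users WHERE id = ?", "42")
	b := queryKey("select  * from users\nwhere id = ? -- hot row", "42")
	fmt.Println(a == b) // true: same normalized text + params, same key
}
```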

Callers see the same result as if they'd each issued their own query; they just shared an upstream execution with siblings that happened to be in flight at the same time. This is the pattern. It is not caching — the shared result lives only for the duration of the in-flight window and is not reused for future arrivals. Caching and consolidation compose (see Composition below).
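The five steps above can be sketched as an in-memory single-flight map. This is a hedged illustration of the mechanism, not Vitess's actual implementation; `Consolidator`, `call`, and `Do` are invented names.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// call is one in-flight upstream execution; waiters block on done.
type call struct {
	done chan struct{} // closed when the upstream execution finishes
	val  string
	err  error
}

// Consolidator keys in-flight upstream executions by query hash.
type Consolidator struct {
	mu       sync.Mutex
	inflight map[string]*call
}

func NewConsolidator() *Consolidator {
	return &Consolidator{inflight: make(map[string]*call)}
}

// Do runs exec once among callers that arrive while an execution for
// key is in flight; later arrivals wait and share the single result.
func (c *Consolidator) Do(key string, exec func() (string, error)) (string, error) {
	c.mu.Lock()
	if cl, ok := c.inflight[key]; ok {
		c.mu.Unlock()
		<-cl.done // step 3: enqueue on the existing entry
		return cl.val, cl.err
	}
	cl := &call{done: make(chan struct{})}
	c.inflight[key] = cl // step 2: register the single execution
	c.mu.Unlock()

	cl.val, cl.err = exec() // the one upstream execution

	c.mu.Lock()
	delete(c.inflight, key) // step 5: entry lives only for the window
	c.mu.Unlock()
	close(cl.done) // step 4: fan the result out to every waiter
	return cl.val, cl.err
}

func main() {
	c := NewConsolidator()
	var executions int32
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Do("hot-row-hash", func() (string, error) {
				atomic.AddInt32(&executions, 1)
				time.Sleep(20 * time.Millisecond) // simulate a slow read
				return "row", nil
			})
		}()
	}
	wg.Wait()
	// 100 callers, but upstream ran only a handful of times (often once).
	fmt.Println(atomic.LoadInt32(&executions) < 100)
}
```

Note that sequential calls each execute upstream: nothing is retained past the in-flight window, which is exactly the "not caching" property described above.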

Shape

         caller 1 ──► proxy ────► in-flight map ──► upstream
         caller 2 ──► proxy ────► (existing entry) ───┘
         caller 3 ──► proxy ────► (existing entry) ───┘
                             all 3 share the single
                             upstream query's result

Scope / eligibility

Not every query should consolidate. Eligibility typically requires:

  • Read-only — no UPDATE / INSERT / DELETE. Writes have side-effects that must execute per-caller.
  • Side-effect free query body — no SELECT ... FOR UPDATE (acquires locks), no functions with observable side-effects (stored-procedure calls that mutate state).
  • Snapshot-independent — either outside a transaction (auto-commit), or inside a transaction whose snapshot matches the in-flight query's snapshot (tight coupling).
  • Deterministic — or tolerant of shared clock skew. Queries containing NOW(), UUID(), RAND() return different values per execution and are typically excluded from consolidation (or consolidated with explicit "shared clock ok" relaxation).
  • Hash-normalizable — the query's semantics must be recoverable from a stable hash. Queries with heavy literal variation that the proxy can't strip-and-parameterise aren't consolidatable.
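
The eligibility bullets can be approximated as a conservative allow-list. This is a sketch using substring checks for brevity; a real proxy decides from the parsed statement, and Vitess's actual scope rules are not disclosed by the source. `eligible` is an invented name.

```go
package main

import (
	"fmt"
	"strings"
)

// eligible returns true only for plain read-only SELECTs with no
// locking clause and no obviously nondeterministic functions.
func eligible(query string) bool {
	q := strings.ToUpper(strings.TrimSpace(query))
	if !strings.HasPrefix(q, "SELECT") {
		return false // writes and DDL have per-caller side-effects
	}
	if strings.Contains(q, "FOR UPDATE") || strings.Contains(q, "FOR SHARE") {
		return false // locking reads acquire locks per caller
	}
	for _, fn := range []string{"NOW(", "UUID(", "RAND("} {
		if strings.Contains(q, fn) {
			return false // returns different values per execution
		}
	}
	return true
}

func main() {
	fmt.Println(eligible("SELECT name FROM users WHERE id = 1")) // true
	fmt.Println(eligible("SELECT * FROM jobs FOR UPDATE"))       // false
	fmt.Println(eligible("UPDATE users SET hits = hits + 1"))    // false
	fmt.Println(eligible("SELECT NOW()"))                        // false
}
```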

Composition with caching

The pattern composes cleanly with a caching proxy tier (patterns/caching-proxy-tier):

  • Cache only — on identical arriving queries: cache hit → instant; cache miss → full stampede on invalidation. Trade-off: fast when warm; catastrophic cache-miss storms.
  • Consolidation only — on identical arriving queries: each first arrival runs upstream; concurrent siblings share. Trade-off: protects against concurrent-burst amplification but pays the upstream cost every time the in-flight window turns over.
  • Cache + consolidation — on identical arriving queries: cache hit → instant; cache miss → one upstream query, shared by all concurrently arriving callers. Trade-off: the canonical composition — cache absorbs repeat hits across time, consolidation absorbs concurrent hits during in-flight windows; cache-stampede prevention falls out as a design byproduct.
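
The cache + consolidation scheme reduces to "check cache, then single-flight the miss". A minimal sketch, assuming a string-valued store; `CachedStore`, `entry`, and `Get` are invented names:

```go
package main

import (
	"fmt"
	"sync"
)

// entry is one in-flight upstream fetch; waiters block on done.
type entry struct {
	done chan struct{}
	val  string
}

// CachedStore composes the two layers: the cache absorbs repeat hits
// across time, while the in-flight map makes concurrent callers share
// a single upstream fetch on a miss (no stampede on invalidation).
type CachedStore struct {
	mu       sync.Mutex
	cache    map[string]string
	inflight map[string]*entry
}

func NewCachedStore() *CachedStore {
	return &CachedStore{cache: map[string]string{}, inflight: map[string]*entry{}}
}

func (s *CachedStore) Get(key string, fetch func() string) string {
	s.mu.Lock()
	if v, ok := s.cache[key]; ok { // warm path: no upstream at all
		s.mu.Unlock()
		return v
	}
	if e, ok := s.inflight[key]; ok { // miss already being fetched
		s.mu.Unlock()
		<-e.done
		return e.val
	}
	e := &entry{done: make(chan struct{})}
	s.inflight[key] = e
	s.mu.Unlock()

	e.val = fetch() // the single upstream query for this miss

	s.mu.Lock()
	s.cache[key] = e.val // future arrivals hit the cache instead
	delete(s.inflight, key)
	s.mu.Unlock()
	close(e.done) // fan out to every concurrent waiter
	return e.val
}

func main() {
	s := NewCachedStore()
	fetches := 0
	fetch := func() string { fetches++; return "row" }
	s.Get("hot", fetch) // miss: one upstream fetch
	s.Get("hot", fetch) // hit: served from cache, no fetch
	fmt.Println(fetches) // 1
}
```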

The single-flight literature (e.g. golang.org/x/sync/singleflight) is the in-memory cousin of this pattern at the application altitude. Query consolidation is the SQL-wire-protocol instantiation of single-flight at the database-proxy altitude.

Canonical instance

Vitess's consolidator, as canonicalised on the wiki via sources/2026-04-21-planetscale-comparing-awss-rds-and-planetscale (Jarod Reyes, PlanetScale, 2021-09-30). Reyes's YouTube framing:

"if 3 million people go to your YouTube video at once, Vitess will notice that multiple clients are simultaneously (or nearly simultaneously) attempting the same query and serve them all from the same connection."

The wiki treatment of the underlying primitive is concepts/query-consolidation. The Reyes post names the primitive and its motivation but does not disclose window duration, hash semantics, scope boundary, or correctness invariant — those remain to be canonicalised by a subsequent Vitess-internals post.

Forces / trade-offs

  • Latency floor vs throughput. Very short consolidation windows miss opportunities; longer windows add latency to the first arrival (it waits for siblings). The common implementation choice — merge any arrivals during the already-running upstream query's in-flight window, don't speculatively delay the first arrival — avoids the latency cost but misses some consolidation potential.
  • Hot-row amplification risk. A correctness bug in the consolidator (e.g. serving cached snapshot data past a legitimate refresh) amplifies N× because one bad result hits all N waiting callers. Consolidator code is on the critical correctness path.
  • Unfair starvation on heterogeneous callers. If one caller's request timeout is shorter than the shared upstream query's duration, that caller times out on a query it shared rather than initiated — the timeout budget is set by the slowest upstream execution, not the individual call.
  • Pool pressure still scales with unique queries. Consolidation is not a substitute for throttling or pool sizing when load is from diverse expensive queries rather than repeated identical ones.
  • Correctness depends on read-only scope hygiene. Implementations that accidentally consolidate a query with side-effects (stored-procedure call, SELECT ... FOR UPDATE) silently corrupt semantics.

When not to use

  • Workload has high query diversity. Nothing consolidates; the pattern adds proxy-tier latency and complexity for no benefit.
  • Callers demand per-call execution semantics (audit logging per query, billing per query, strict per-call isolation). Consolidation violates the one-execution-per-call assumption.
  • Backing store is already fronted by a consolidating layer. Re-consolidating above an already-consolidating proxy (caching CDN → consolidator → another consolidator → MySQL) adds latency without reducing upstream load, and can introduce subtle bugs if the two consolidators disagree on scope.

Seen in

  • sources/2026-04-21-planetscale-comparing-awss-rds-and-planetscale — canonical first wiki instance. Vitess's consolidator as the database-proxy-tier implementation; motivated by the cascading-outage-from-hot-row failure mode observed in prior-generation RDS / NoSQL customers. YouTube viral-video framing ("3 million people go to your YouTube video at once") as the canonical illustration.