
CONCEPT Cited by 1 source

Query buffering cutover

Definition

Query buffering cutover is the technique of briefly holding ("buffering") incoming client queries at a proxy layer while the backend behind the proxy is swapped from one database to another. The client never sees an error or a dropped connection; once the swap is complete (typically sub-second), the buffered queries are released against the new backend and normal service resumes. It is the last-mile primitive that converts a carefully-prepared database migration into a literally zero-downtime one from the application's perspective.

Contrast with:

  • Connection-draining cutover: existing connections finish naturally and new connections go to the new system. This does not work for long-lived pooled connections, and it leaves in-flight queries visible to the migration.
  • Error-retry cutover: cut over abruptly and rely on client-side retry. This produces visible error spikes, and retry-sensitive workloads suffer.
  • Query buffering cutover: queries pause at the proxy, and the proxy dispatches them against the new backend once routing flips. The client sees a very brief latency spike instead of errors.

Architectural shape

The technique requires a proxy layer in front of the database that:

  1. Terminates the client connection protocol (MySQL wire, Postgres wire, HTTP, etc.).
  2. Can hold inbound queries in a queue per logical keyspace / table / route with bounded memory and a bounded time budget.
  3. Can be told atomically to re-route queries from route A to route B.
  4. Releases the buffered queries against the new route once the cutover completes.

The proxy usually has its own health-and-timeout guarantees: if the cutover takes too long, buffered queries are either released against the old backend (the cutover is aborted and service continues on the old route) or timed out cleanly with an error returned to the client (explicit abort).
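The four requirements and the timeout behaviour above can be sketched in a few lines of Python. This is a minimal, single-process illustration and not how VTGate or any real proxy is implemented: `BufferingRouter`, `begin_cutover`, `finish_cutover`, `max_buffered`, and `max_wait_s` are hypothetical names, and a backend is modelled as a plain callable that takes a query string.

```python
import threading
import time


class BufferFull(Exception):
    """The buffer already holds max_buffered waiting queries."""


class BufferTimeout(Exception):
    """A buffered query waited longer than max_wait_s."""


class BufferingRouter:
    """Hypothetical proxy core: routes queries, or parks them during a cutover."""

    def __init__(self, backend, max_buffered=100, max_wait_s=1.0):
        self._cond = threading.Condition()
        self._backend = backend            # callable(query) -> result
        self._buffering = False
        self._waiting = 0
        self._max_buffered = max_buffered  # bounded memory
        self._max_wait_s = max_wait_s      # bounded time budget

    def execute(self, query):
        with self._cond:
            if self._buffering:
                if self._waiting >= self._max_buffered:
                    raise BufferFull(query)
                self._waiting += 1
                try:
                    # Park until the cutover finishes or the budget runs out.
                    released = self._cond.wait_for(
                        lambda: not self._buffering, timeout=self._max_wait_s
                    )
                finally:
                    self._waiting -= 1
                if not released:
                    raise BufferTimeout(query)  # explicit abort for this query
            backend = self._backend  # read the route under the lock
        return backend(query)

    def begin_cutover(self):
        with self._cond:
            self._buffering = True

    def finish_cutover(self, new_backend=None):
        """Flip the route and release waiters in one locked step.

        Passing None aborts the cutover: waiters are released against
        the old backend.
        """
        with self._cond:
            if new_backend is not None:
                self._backend = new_backend
            self._buffering = False
            self._cond.notify_all()


def demo():
    router = BufferingRouter(lambda q: ("old", q), max_wait_s=2.0)
    results = []
    router.begin_cutover()
    worker = threading.Thread(
        target=lambda: results.append(router.execute("SELECT 1"))
    )
    worker.start()
    time.sleep(0.05)  # give the worker time to park in the buffer
    router.finish_cutover(lambda q: ("new", q))
    worker.join()
    return results
```

Calling `demo()` shows a query issued during the cutover completing against the new backend without an error. Because the route flip and the release happen under one lock, no waiter can observe a half-switched state; a real proxy would additionally keep one buffer per keyspace or route, as described in item 2.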

Canonical wiki instances

  • Vitess / VTGate: VTGate Buffering is the canonical implementation. It is used during the MoveTables SwitchTraffic cutover: writes on the source keyspace are stopped, queries buffer in VTGate, replication catches up fully, routing rules flip, and buffered queries execute against the new keyspace. "The query buffering done here is the last part that allows the entire migration to be done without any downtime." (Source: .) It is also used during planned primary failover inside a Vitess keyspace (a connection-bearing reparent). A typical cutover takes "less than 1 second."

  • PlanetScale for Postgres proxy layer — PlanetScale's proprietary Postgres proxy provides "automatic failovers, query buffering, and connection pooling via our proprietary proxy layer, which includes PgBouncer for connection pooling." (Source: .) Same shape as VTGate buffering, different engine — the proxy layer buffers queries during automatic failovers so clients don't see errors even when the primary is being swapped.
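The ordering of steps in the SwitchTraffic description above (stop source writes, buffer, wait for replication to catch up, flip routing, release) can be expressed as a small orchestration function. This is a sketch of the sequencing only, under stated assumptions: it is not Vitess code, every callback name (`stop_source_writes`, `start_buffering`, `replication_lag`, `flip_routing`, `stop_buffering`, `resume_source_writes`) is hypothetical, and `budget_s` stands in for whatever time budget the proxy enforces.

```python
import time


def switch_traffic(
    stop_source_writes,
    start_buffering,
    replication_lag,
    flip_routing,
    stop_buffering,
    resume_source_writes,
    budget_s=1.0,
    poll_s=0.01,
):
    """Run one cutover attempt. Returns True if routing flipped, False if aborted.

    All arguments except budget_s and poll_s are hypothetical callbacks
    supplied by the caller; replication_lag() returns 0 once the target
    has fully caught up.
    """
    stop_source_writes()
    start_buffering()
    deadline = time.monotonic() + budget_s
    try:
        while replication_lag() > 0:
            if time.monotonic() >= deadline:
                # Abort: reopen the source so released queries succeed there.
                resume_source_writes()
                return False
            time.sleep(poll_s)
        flip_routing()
        return True
    finally:
        # Buffered queries are always released, on success or abort.
        stop_buffering()


def trace(lags, budget_s=1.0):
    """Run switch_traffic with recording callbacks and return (ok, steps)."""
    steps = []
    lag_values = iter(lags)
    ok = switch_traffic(
        lambda: steps.append("stop_writes"),
        lambda: steps.append("buffer"),
        lambda: next(lag_values),
        lambda: steps.append("flip"),
        lambda: steps.append("release"),
        lambda: steps.append("resume_writes"),
        budget_s=budget_s,
    )
    return ok, steps
```

The `finally` clause is the design point: buffered queries are released whether or not routing flipped, so a cutover that exceeds its budget degrades into a short pause against the old backend rather than a pile of stuck clients.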

Seen in

  • Canonical framing of query buffering as static-stability-for-in-flight-operations. Max Englander (PlanetScale, 2025-07-03) names query buffering as the mechanism that makes weekly failover drills invisible to customers: "Query buffering minimizes or eliminates disruption during failovers." Canonical wiki framing: query buffering is the static-stability principle applied to an in-progress operation. The client's "last known good state" is "my query was accepted and will complete", and the proxy preserves that invariant across the topology change rather than surfacing a connection error. Without query buffering, the weekly failover drill would leak errors to every client mid-request during every ship cycle; with it, failovers are architecturally invisible.
  • sources/2026-04-21-planetscale-consensus-algorithms-at-scale-part-4-establishment-and-revocation — canonical wiki framing of VTGate query buffering as the application-transparency primitive under graceful leader demotion (see patterns/graceful-leader-demotion). Sougoumarane pairs VTGate buffering with VTTablet lameduck mode in a two-tier composition: "If a PRS is issued, the low level vttablet component of vitess goes into a lameduck mode where it allows in-flight transactions to complete, but rejects any new ones. At the same time, the front-end proxies (vtgate) begin to buffer such new transactions. Once PRS completes, all buffered transactions are sent to the new primary, and the system resumes without serving any errors to the application." Canonicalises the pattern beyond migrations into the broader leader-revocation family — query buffering is the proxy-tier primitive that makes any reparent path (planned or on-degradation) application-transparent.
  • — canonical wiki description of VTGate's role as the query-buffering proxy during MoveTables SwitchTraffic cutover. Explicitly identified as "the last part that allows the entire migration to be done without any downtime." Buffered queries are executed against the new keyspace immediately after the routing-rule swap completes.
  • — companion use case on the Postgres side: PlanetScale's proprietary proxy buffers queries during automatic failovers so clients don't see the primary swap as an error. Same pattern, different engine substrate.