CONCEPT
Proxy-transparent failover¶
Definition¶
Proxy-transparent failover is the user-facing property that when a database node is replaced (due to crash, promotion, upgrade, or scale event), clients do not need to change connection strings, reconfigure DNS, or restart applications. A query-proxy tier in front of the database cluster owns the stable endpoint and reroutes new connections to the current live node on each failover; in-flight query state may be lost, but the endpoint does not.
Brian Morrison II's canonical framing across both Aurora and PlanetScale (2024-01-24):
"In the same light, PlanetScale and Aurora have dedicated query proxy services that automatically reroute traffic trying to access a node. This minimizes any downtime clients may experience based on the failure, making them more transparent." (Source: sources/2026-04-21-planetscale-planetscale-vs-amazon-aurora-replication)
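The core mechanic can be sketched in a few lines. This is a minimal illustration, not VTGate's or RDS Proxy's actual implementation; all class and node names (`TopologyStore`, `QueryProxy`, `db-node-1`) are hypothetical:

```python
class TopologyStore:
    """Shared view of which backend node is currently the live primary."""
    def __init__(self, primary: str):
        self._primary = primary

    def primary(self) -> str:
        return self._primary

    def promote(self, new_primary: str) -> None:
        # Called by the failover orchestrator; clients never observe this.
        self._primary = new_primary


class QueryProxy:
    """Clients connect to this proxy's stable endpoint, never to a node."""
    def __init__(self, endpoint: str, topo: TopologyStore):
        self.endpoint = endpoint  # the one URL clients keep forever
        self.topo = topo

    def route_new_connection(self) -> str:
        # Every new client connection is routed to the current primary.
        return self.topo.primary()


topo = TopologyStore(primary="db-node-1")
proxy = QueryProxy(endpoint="mysql.cluster.example", topo=topo)

before = proxy.route_new_connection()  # routed to db-node-1
topo.promote("db-node-2")              # node replaced behind the proxy
after = proxy.route_new_connection()   # routed to db-node-2
```

The point of the sketch: `proxy.endpoint` never changes across the failover; only the proxy's internal routing target does.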
Two canonical implementations¶
- PlanetScale: Vitess's VTGate. A proxy tier that owns MySQL-wire-protocol connections from clients, maintains its own MySQL connections to the tablets (primary + replicas), and updates its routing on topology change. VTGate's topology awareness is the proxy ingredient; see concepts/global-routing-for-databases and systems/planetscale-global-network for the edge-facing surface.
- Aurora: RDS Proxy (and Aurora's own cluster endpoints). Aurora's cluster endpoint always points at the current writer; RDS Proxy adds connection pooling + failover resilience in front of it. The architectural role is the same: client keeps one URL; the proxy knows which node is live.
Both are instances of the wider patterns/cdn-like-database-connectivity-layer — a CDN-shaped connectivity layer that sits between application and database, with its own topology state + rerouting logic.
Why it matters¶
Without a proxy tier, node replacement surfaces to the application as connection errors + DNS propagation delays + manual reconfiguration. Retry loops, ORM connection pools, and reconnection storms amplify the event's user-visible duration.
With a proxy tier:
- Application sees a stable hostname — the proxy's address.
- In-flight connections fail (or are buffered — see concepts/query-buffering-cutover) but new connections immediately reach the new primary.
- Total perceived downtime drops from seconds-to-minutes (DNS + manual intervention) to hundreds-of-milliseconds (proxy's topology-update reaction time).
Structural costs¶
- Added network hop — every query traverses the proxy tier. See concepts/proxy-tier-latency-tax.
- The proxy tier itself becomes a SPOF unless it is horizontally scaled and load-balanced.
- Connection-pool semantics change — proxy-managed connections may behave differently than direct MySQL connections (session variables, transactions, prepared statements can all be affected).
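The last cost is the subtlest. A multiplexing proxy can only share backend connections across clients while a session is stateless; once a client sets session state, the proxy must "pin" that client to one backend connection. A minimal sketch of that pinning rule (the class and the statement-prefix heuristic are illustrative assumptions, not any product's actual parser):

```python
class PooledConnection:
    """One client's view of a proxy-pooled backend connection."""
    def __init__(self):
        self.pinned = False  # False => proxy may multiplex freely

    def execute(self, sql: str) -> None:
        # Statements that create backend session state (session variables,
        # open transactions, server-side prepared statements) force the
        # proxy to pin this client to a single backend connection.
        s = sql.strip().upper()
        if s.startswith(("SET ", "BEGIN", "START TRANSACTION", "PREPARE ")):
            self.pinned = True


multiplexed = PooledConnection()
multiplexed.execute("SELECT 1")      # stateless; still multiplexable

pinned = PooledConnection()
pinned.execute("SELECT 1")
pinned.execute("SET @region = 'eu'") # session state; proxy must pin
```

Pinned connections lose the pooling benefit, which is why direct-connection workloads that lean on session variables or long transactions can behave differently, and pool worse, behind a proxy.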
Seen in¶
- sources/2026-04-21-planetscale-planetscale-vs-amazon-aurora-replication — Brian Morrison II (PlanetScale, 2024-01-24). Canonical wiki disclosure that PlanetScale and Aurora converge on the same user-facing failover abstraction via dedicated query-proxy services, despite wildly different underlying replication substrates.
Related¶
- systems/vitess — the VTGate proxy tier.
- systems/planetscale-global-network — the PlanetScale edge-facing connectivity layer that fronts VTGate globally.
- systems/amazon-aurora — the Aurora cluster endpoint + RDS Proxy implementation.
- systems/aws-rds-proxy — AWS's managed proxy product for RDS/Aurora.
- patterns/cdn-like-database-connectivity-layer — the architectural pattern.
- patterns/zero-downtime-reparent-on-degradation — the related orchestration pattern where VTorc promotes a replica behind the proxy.
- concepts/query-buffering-cutover — the optional behaviour of holding queries at the proxy across a cutover.
- concepts/unplanned-failover-playbook — the operational playbook the proxy supports.