SYSTEM Cited by 11 sources
Vitess
What it is
Vitess (vitess.io) is an open-source database clustering system for horizontal scaling of MySQL, originally built at YouTube (~2010) to handle its MySQL workload and later donated to the CNCF. It speaks the MySQL wire protocol, shards data across many backing MySQL instances, and handles query routing, connection pooling, and operational primitives (schema changes, cluster resharding, failover) on top.
Why it shows up on this wiki
Vitess is the substrate behind PlanetScale MySQL. In the Cloudflare × PlanetScale integration (2026-04-16), "Developers can choose from two of the most popular relational databases with Postgres or Vitess MySQL" (Source: sources/2026-04-16-cloudflare-deploy-postgres-and-mysql-databases-with-planetscale-workers). The article treats Vitess as an implementation detail of PlanetScale MySQL — customers interact with PlanetScale's surface, not Vitess directly.
Minimal-viable page for now: Vitess has enough architectural substance (MySQL sharding, VTGate, VTTablet, vschema, reparenting) to deserve a deeper treatment if a systems-tier post about its internals is ingested in future. Until then this page exists to anchor cross-references and make the MySQL-side story on PlanetScale navigable.
Seen in
-
— canonical 2021-07-20 PlanetScale-chose-Vitess thesis post. Deepthi Sigireddi (PlanetScale, Vitess Maintainer & TSC member) canonicalises the four-safety-feature quad Vitess adds over vanilla MySQL — "automatic row limits, hot row protection, query consolidation and non-blocking schema changes which reduce the likelihood of high load bringing down the database. These are all features that vanilla MySQL cannot and does not provide." Single prose-level enumeration of the quad anywhere in the PlanetScale corpus; concepts/hot-row-protection and concepts/automatic-row-limit already back-cite this post as their origin quote. Also canonicalises the sharding-workflow primitive catalogue — materialization, movetables, migration, reshard — that the later PlanetScale Workflows UI product wraps. YouTube-origin + Slack-100% + JD.com triad of production flagships named inline. Canonical Vitess-expert-shortage framing: "there is a shortage of Vitess experts who can run such a system in production. … many of the people who try it out never end up going into production for this reason." Canonicalises concepts/developer-platform-over-bare-substrate as the three-phase product-narrative arc (provision-in-10-seconds → ship-schema-changes-safely → scale-via-sharding-later), with the 10-second-onboarding-on-Kubernetes target as a design constraint (the "secret sauce from PlanetScale" that gets mechanised much later in the 2024 fleet-scale operator post).
-
— canonical pedagogy-101 wiki disclosure of the three-Vitess-primitives triad. Brian Morrison II (PlanetScale, 2023-08-23) canonicalises vtctld as the third named Vitess primitive alongside VTGate and VTTablet — "The entire cluster is controlled by a vtctld instance, a management interface that our internal systems communicate with to perform administrative operations." First wiki source to name systems/vtctld with its own page despite ~30 prior references across the Vitess corpus. Also canonicalises the branch-as-independent-Vitess-cluster substrate behind PlanetScale's branching model verbatim: "Every database and branch on PlanetScale is an independent cluster"; and the vtctld-to-vtctld schema-clone mechanism: "when you create a branch, we'll spin up a new Vitess cluster for it and (using the vtctld component of the two clusters) apply the schema of the source database branch with the one you just created!" Pedagogy-101 altitude companion to Morrison II's 2022-10-21 What-is-Vitess post (that post = VTGate + VTTablet + connection-pooling; this post = all three primitives + Kubernetes + branching + backup + replica + Insights + security at product-overview altitude).
-
— canonical wiki disclosure of Vitess's planner-altitude fuzzing posture. Arvind Murty (PlanetScale / Vitess, summer 2023 internship under Andrés Taylor; published 2024-04-09) canonicalises the bespoke random-query fuzzer shipped as vitessio/vitess #13260 that differentially tests the VTGate planner against MySQL. Explicitly ruled out SQLancer on two grounds: (1) Vitess must be bug-for-bug compatible with MySQL ("Vitess ideally should perfectly mimic MySQL, quirks included"), not standards-compliant — so SQLancer's logic-bug oracle is the wrong substrate (see concepts/sqlancer-logic-bug); (2) VSchema is a first-class planner-layer axis — the sharding layout changes how Vitess plans queries, and SQLancer doesn't model sharding. Fuzzer architecture: bespoke random-query generator on EMP/DEPT tables sharded on EMPNO/DEPTNO; generated query run against both Vitess and MySQL (unsharded equivalent); byte-for-byte result + error comparison; any divergence reported as candidate bug. Expression generator respects semantic-role constraints — aggregations only in SELECT, GROUP BY, ORDER BY, HAVING; derived-table columns tracked separately. Minimal reproducers produced via Andrés Taylor's AST query simplifier (brute-force AST-node-removal delta-debugger; see Taylor's 2022 blog post). Murty's simplifier contribution (vitessio/vitess #13636) extended it to end-to-end tests by threading VSchema information through — the original simplifier was designed for unit tests with known-schema fixtures. Future work: randomise schema + VSchema; retire the testFailingQueries flag that calcified known-failure skip lists. This is the planner-altitude sibling to the evalengine fuzzer canonicalised by : evalengine fuzzer tests scalar SQL expression evaluation (AST interpreter vs bytecode VM vs MySQL C++); planner fuzzer tests whole-query planning + execution (Vitess plan-then-execute vs MySQL direct-execute). Together they canonicalise Vitess's two-altitude fuzzing posture on the wiki — the first such multi-altitude DBMS-fuzzing disclosure. Team attribution: Andrés Taylor (mentor), Harshit Gangal, Florent Poinsard, Manan Gupta (Vitess query-serving team).
-
— PlanetScale (2023-10-05). Co-canonical with the same-day Aurora post below: identical paragraph re-asserting the 5-hop Vitess data path ("application, to a load balancer, to VTGate, to VTTablet, and then finally to MySQL") and the division of labor — VTGate = "application-level query routing layer", VTTablet = "middleware between VTGate and MySQL" managing connection pooling + health checks + publishing to the topo-server. The architectural substance of the two posts on this front is identical; they address different competitors. This ingest is the trigger for promoting systems/vtgate, systems/vttablet, and concepts/vitess-topo-server to dedicated wiki pages — previously ~18 sources referenced these components by name without architectural pages. The new patterns/query-routing-proxy-with-health-aware-pool pattern distils the stateless-router + pool-owning-middleware + shared-state-store topology.
-
— PlanetScale (2023-10-05). Canonical wiki enumeration of the full Vitess request-flow hop sequence: "application → load balancer → VTGate → VTTablet → MySQL". Role attribution verbatim: VTGate = "application-level query routing layer"; VTTablet = "behaves as middleware between VTGate and MySQL" — manages connection pooling, performs MySQL health checks, publishes state to the topo-server. topo-server = role-state registry that VTGate queries to determine available tablets + their roles and reroute traffic as needed. PlanetScale's edge infrastructure sits in front of VTGate as a frontend load balancer, "terminating MySQL connections in the closest edge location" — first wiki citation of the end-to-end hop-sequence framing (edge + VTGate + VTTablet + MySQL). Complements the 2022-11 van Dijk (benchmark anchor for the architecture) + the 2023 Gangal [[sources/2026-04-21-planetscale-connection-pooling-in-vitess|connection-pooling post]] (VTTablet pool-design mechanism) with the vendor-comparison-altitude hop-sequence enumeration. Vendor-comparison against Amazon Aurora; no net-new Vitess internals beyond the hop-sequence canonicalisation.
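A minimal sketch of the role split described in this entry, under stated assumptions: tablets publish their role and health to a registry, and the router consults that registry on every request so traffic follows role changes. The names `Tablet`, `TopoServer`, `publish`, `serving`, and `route` are hypothetical and chosen for illustration; they are not Vitess APIs, and real tablet selection is considerably richer.

```python
from dataclasses import dataclass


@dataclass
class Tablet:
    alias: str
    role: str      # "primary" or "replica" (illustrative role names)
    healthy: bool


class TopoServer:
    """Hypothetical role-state registry: tablets publish, the router reads."""

    def __init__(self):
        self._tablets = {}

    def publish(self, tablet):
        # The latest published state for an alias replaces the previous one.
        self._tablets[tablet.alias] = tablet

    def serving(self, role):
        return [t for t in self._tablets.values() if t.healthy and t.role == role]


def route(topo, role):
    """Return the alias of a healthy tablet with the requested role."""
    candidates = sorted(topo.serving(role), key=lambda t: t.alias)
    if not candidates:
        raise LookupError(f"no healthy {role} tablet")
    return candidates[0].alias
```

Because `route` holds no state of its own, a role change published to the registry is picked up on the next call, which is the property the stateless-router framing relies on.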
-
— canonical benchmarked empirical anchor for VTTablet's "nearly limitless connections" scaling claim. Liz van Dijk (PlanetScale, 2022-11-01) names VTTablet connection pooling as the in-cluster tier behind PlanetScale's sustained 1,000,000 concurrent connections benchmark: "Vitess and PlanetScale offer connection pooling on the VTTablet level. This scales alongside your cluster, and also allows for connection requests to be queued up there when a sudden application scale-up starts sending queries from a very large amount of horizontally spawned processes. This keeps the underlying MySQL processes safe from a memory management standpoint, and allows you to keep adding workers as needed to scale the application." Complements the 2023 Harshit Gangal post (three-era VTTablet pool design) with the end-to-end scaled number — the mechanism post canonicalises how, this post canonicalises how much. The [[patterns/two-tier-connection-pooling|two-tier pool]] (VTTablet in-cluster + PlanetScale Global Routing Infrastructure at edge) is the specific substrate; the benchmark is 62.5× RDS MySQL's 16k ceiling (1,000,000 / 16,000). Canonical new concepts on memory economics (concepts/max-connections-ceiling, concepts/memory-overcommit-risk) and benchmark design (concepts/lambda-fanout-benchmark).
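The queueing behaviour quoted in this entry can be illustrated with a small sketch: many clients share a fixed number of backend connections, and clients beyond that number wait rather than opening more connections. `BoundedPool` and `simulate` are hypothetical names, and the semaphore is a stand-in chosen for brevity; this is not how VTTablet implements its pool.

```python
import threading
import time


class BoundedPool:
    """Hypothetical pool: at most `size` callers hold a connection at once;
    the rest block until a slot frees up."""

    def __init__(self, size):
        self._slots = threading.Semaphore(size)
        self._lock = threading.Lock()
        self.in_use = 0
        self.peak = 0

    def run(self, work):
        with self._slots:                  # callers wait here when the pool is full
            with self._lock:
                self.in_use += 1
                self.peak = max(self.peak, self.in_use)
            try:
                return work()
            finally:
                with self._lock:
                    self.in_use -= 1


def simulate(clients, pool_size):
    """Run `clients` concurrent callers; return (completed, peak connections)."""
    pool = BoundedPool(pool_size)
    done = []

    def client(i):
        def work():
            time.sleep(0.001)              # pretend to run a query
            return i
        done.append(pool.run(work))

    threads = [threading.Thread(target=client, args=(i,)) for i in range(clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(done), pool.peak
```

For example, `simulate(40, 4)` completes all 40 requests while never holding more than 4 backend connections at once, which is the memory-safety property the quote attributes to pooling at the VTTablet level.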
-
— twelfth canonical Vitess-internals disclosure on the wiki, filling the product-surface axis for Vitess's workflow family. Ben Dicken's 2024-11-07 post first-publicly enumerates the full Vitess workflow family (MoveTables, Reshard, Materialize, LookupVindex, Migrate) and exposes MoveTables through PlanetScale Workflows as a dashboard UI — a new product surface over the same vtctldclient-accessible primitives. Also first canonical wiki disclosure of the Cluster configuration UI as the self-service substrate for sharded-keyspace creation + custom VSchema + routing rules. The two UIs cover the full sharding lifecycle on a PlanetScale database (topology + data motion) previously gated behind Enterprise plans or vtctldclient CLI expertise.
-
— Sam Lambert (PlanetScale CEO, 2022-08-02) anchors the "we built on a mature substrate" argument in Vitess's named hyperscaler adopters: "Vitess has been adopted by GitHub, Slack, Etsy, Roblox, and many more. PlanetScale are also the maintainers of Vitess." First wiki citation from the 2022-era PlanetScale CEO voice of the canonical four-name adopter list; the argument shape is "pick an OSS substrate already pressure-tested by webscalers, then contribute as maintainer." Post is marketing-voice and doesn't canonicalise any Vitess internals — its primary wiki-durable contribution is the deploy-safety warn-on-drop pattern which is implemented on top of Vitess's online-DDL / deploy-request flow.
-
— canonical wiki disclosure of the VTGate planner's phase structure. Andrés Taylor (PlanetScale / Vitess core, 2024-07-22) canonicalises [[concepts/planner-phase-ordering|planner phase ordering]] + [[concepts/fixed-point-tree-rewriting|fixed-point tree rewriting]] as the planner's load-bearing architecture: "We have several phases that run sequentially. After completing a phase, we run the push-down rewriters, then move to the next phase, and so on. … This involves repeatedly rewriting the tree until no further changes occur during a full pass of the tree, a state known as the fixed-point." Two phases named explicitly: initial + split aggregation. Bug retrospective: a user query SELECT sum(user.type) FROM user JOIN user_extra ON user.team_id = user_extra.id GROUP BY user_extra.id ORDER BY user_extra.id OOMed VTGate because the "push ordering under aggregation" rewriter fired in the initial phase, wedging Ordering between Aggregator and ApplyJoin — "Ordering can only be pushed down to the left hand side. Ordering is blocking the aggregator from being pushed down, which means we have to fetch all that data, and sort it to do the aggregation." Fix = gate the ordering rewriter behind the split-aggregation phase so [[patterns/aggregation-pushdown-under-join|aggregation-pushdown-under-join]] fires first. Canonical new pattern: [[patterns/phase-gated-planner-rewriter]] — the meta-recipe for resolving rewriter interference by phase reordering rather than algorithm modification. "This doesn't stop the 'ordering under aggregation' rewriter from doing its job, it just has to wait a bit before doing it." Shipped as vitessio/vitess #16278. Twelfth canonical Vitess-internals disclosure on the wiki — fills the planner-phase-architecture + OOM-as-pushdown-failure-symptom axis. Sequel to the 2022-06-24 sibling: same author, same mechanism, two years of production experience later. Canonical framing: "complex planners are not flat rewriter soup; they are phase-ordered fixed-point loops, and bugs live at the phase boundaries."
-
— eleventh canonical Vitess-internals disclosure on the wiki (after evalengine, data motion, query routing, throttler, release, fork management, authz + schema revert, propagation, capstone, and backup) — fills the VTGate query planner + aggregation-pushdown-under-join axis. Andrés Taylor (PlanetScale / Vitess core, 2022-06-24) canonicalises: (a) the canonical framing that VTGate is a distributed query engine, not a dumb proxy — "Vitess is not just a dumb proxy layer though — it can also run some of the operations instead of sending them on"; (b) the VTGate query planner's pushdown discipline — "the planner tries pushing as much work down to MySQL as possible"; (c) the local-global aggregation decomposition primitive Vitess has used since early on (per-shard count(*) + VTGate SUM); (d) the new push-aggregation-under-join rewrite shipped in vitessio/vitess #9643, adapted from Galindo-Legaria & Joshi's SIGMOD 2001 paper Orthogonal Optimization of Subqueries and Aggregation; (e) nested-loop join as the canonical VTGate join shape when shard keys don't align; (f) the Project operator's use of evalengine for the SUM(order_line.amount) * count(*) multiplier arithmetic; (g) the research-to-production meta-pattern — "Someone out there has done a ton of work on something closely related to what we are doing, and all we have to do is adapt the algorithm to our circumstances." 21-year gap from SIGMOD paper to Vitess production; the paper's value moved from marginal (single-node) to load-bearing (cross-shard) as the economic regime shifted. Canonical new pages: 4 concepts (concepts/local-global-aggregation-decomposition, concepts/push-aggregation-under-join, concepts/vtgate-query-planner, concepts/nested-loop-join) + 3 patterns (patterns/local-global-aggregation-split, patterns/aggregation-pushdown-under-join, patterns/research-to-production-algorithm-adoption).
-
sources/2026-04-21-planetscale-faster-backups-with-sharding — canonical wiki disclosure of Vitess's backup architecture and its shard-parallel property. Ben Dicken canonicalises VTBackup as the Vitess program that orchestrates per-shard backups on a dedicated ephemeral MySQL instance — not on the production primary or an existing replica. The seven-step choreography (restore-previous → spin-up-MySQL → catchup-from-primary-VTGate → new-full-backup) is composed with PlanetScale Singularity (the ephemeral-compute provisioner) and the Vitess builtin backup engine into PlanetScale's production backup pipeline. Tenth canonical Vitess-internals disclosure on the wiki (after evalengine, data motion, query routing, throttler, release, fork management, authz + schema revert, propagation, and capstone) — fills the backup / restore operational-primitive axis. Measured shard-parallel scaling: 1 shard × 161 GB = 30 min 40 s; 32 shards × ~625 GB = 1 h 39 min 4 s (~6.7 GB/s aggregate, ~210 MB/s per shard); 256 shards × ~900 GB = 3 h 37 min 11 s (~35 GB/s aggregate, ~137 MB/s per shard). Per-shard throughput ~constant; aggregate scales linearly with shard count. Canonical new patterns: [[patterns/dedicated-backup-instance-with-catchup-replication]] (the seven-step choreography) + patterns/shard-parallel-backup-and-restore (backup and restore both parallelise per-shard). Canonical new concepts: concepts/shard-parallel-backup (the load-bearing property), [[concepts/primary-vs-replica-as-replication-source]] (PlanetScale chooses primary-as-source; delta is small post-first-backup), concepts/point-in-time-recovery (Vitess PITR uses backups + binlog tail), [[concepts/replica-creation-from-backup]] (non-obvious backup role under HA), concepts/backup-encryption-at-rest (PlanetScale invariant), concepts/soft-delete-vs-hard-delete (backups-as-escape-hatch). Restore inherits the same shard-parallel property — multi-day unsharded restores become hours-long sharded restores.
-
sources/2026-04-21-planetscale-consensus-algorithms-at-scale-part-8-closing-thoughts — canonical capstone framing of Vitess as the worked composition of Sugu Sougoumarane's Consensus algorithms at scale framework. Part 8 consolidates the eight-part series into two architectural recommendations — [[patterns/pluggable-durability-rules|pluggable durability]] and lock-based over lock-free at scale — and names Vitess as the canonical production instance of both: "In Vitess, we make full use of the above options and flexibilities. For example, durability rules are a plugin for vtorc. The current plugin API is already more powerful than other existing implementations." The four-way mapping from the Part-8 four-advantage lock-based argument to Vitess features: (1) Vitess Operator's graceful-failover mechanism for software deploys = [[patterns/graceful-leader-demotion]]; (2) tablet-add/remove via the Vitess Operator = node membership coordination; (3) direct-to-leader routing for consistent reads = [[concepts/consistent-read]]; (4) VTOrc's inherited Orchestrator anti-flapping = concepts/anti-flapping. All four advantages instantiated in one composite system. Ninth canonical Vitess-internals disclosure on the wiki (after evalengine, data motion, query routing, throttler, fork management, backup / observability, authz, online-DDL + schema revert, and Part-7 propagation) — fills the capstone-architectural-recommendations axis. VTOrc full-auto-pilot roadmap disclosed: "There are still a few corner cases that may require human intervention. We intend to enhance vtorc to also remedy those situations. This will put Vitess on full auto-pilot."
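To make the pluggable-durability idea in this entry concrete, here is a minimal sketch in which the durability rule is data looked up by name rather than logic baked into the failover code. Everything here is an assumption for illustration: `Node`, `DurabilityPolicy`, `register`, `is_durable`, and the policy names are hypothetical and are not claimed to match the vtorc plugin API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Node:
    name: str
    cell: str


@dataclass(frozen=True)
class DurabilityPolicy:
    required_acks: int                      # acknowledgements needed per request
    can_ack: Callable[[Node, Node], bool]   # (primary, replica) -> counts as an acker?


POLICIES: Dict[str, DurabilityPolicy] = {}


def register(name: str, policy: DurabilityPolicy) -> None:
    """Plug a policy in under a name; callers select it without code changes."""
    POLICIES[name] = policy


# Illustrative policies only.
register("none", DurabilityPolicy(0, lambda primary, replica: False))
register("any_replica", DurabilityPolicy(1, lambda primary, replica: True))
register("cross_cell", DurabilityPolicy(1, lambda primary, replica: replica.cell != primary.cell))


def is_durable(policy_name: str, primary: Node, ackers: List[Node]) -> bool:
    """A request is durable once enough eligible replicas have acknowledged it."""
    policy = POLICIES[policy_name]
    eligible = [r for r in ackers if policy.can_ack(primary, r)]
    return len(eligible) >= policy.required_acks
```

The design point is that the code deciding whether a request is durable never changes; only the registered policy does, which is what lets an operator trade latency against failure-domain coverage per deployment.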
-
sources/2026-04-21-planetscale-consensus-algorithms-at-scale-part-7-propagating-requests — canonical wiki disclosure of Vitess's propagation stack and its architectural lineage. Sugu Sougoumarane canonicalises request propagation as the load-bearing final concern of consensus ("We have saved the most difficult part for last") and names Vitess's elector, VTOrc, as a customised fork of Orchestrator — an architectural-lineage disclosure the prior series parts hadn't made explicit. Vitess inherits Orchestrator's [[concepts/anti-flapping|anti-flapping]] rules that compensate for MySQL's faithful-GTID-propagation violating the strict per-request-new-version rule; this is the canonical production instance of the patterns/external-metadata-for-conflict-resolution pattern at large scale. Sugu adds: "we also intend to tighten some of these corner cases to minimize the need for humans to intervene if complex failures ever happen to occur." Eighth canonical Vitess-internals disclosure on the wiki (after evalengine, VReplication / VDiff / MoveTables, Consistent Lookup Vindex, Throttler, Vitess 21 release, VStream CDC, and Part 4 establishment/revocation) — fills the propagation + anti-flapping lineage axis of Vitess's leader-change stack. Graceful-propagation-before-demotion extends the Part-4 graceful-demotion pattern on PRS with a propagation-completion invariant — leader waits for all outstanding requests to reach the new leader's required followers before stepping down, eliminating the propagation race entirely for planned changes.
-
sources/2026-04-21-planetscale-consensus-algorithms-at-scale-part-4-establishment-and-revocation — canonical wiki disclosure of Vitess's leader-election operational primitives and the algorithmic rationale behind them. Sugu Sougoumarane (Vitess co-creator) names two user-facing shard-level commands — PlannedReparentShard (PRS) for planned / graceful leadership transitions and EmergencyReparentShard (ERS) for unplanned / fencing failover — as the Vitess instantiation of the broader revoke-from-establish pattern. PRS mechanism (the graceful path): vttablet enters lameduck mode — "allows in-flight transactions to complete, but rejects any new ones" — while vtgate simultaneously buffers new transactions via query buffering; once PRS completes the buffered transactions flush to the new primary "and the system resumes without serving any errors to the application." ERS mechanism (the emergency path): used when the primary is unreachable; revocation is achieved by fencing the followers rather than asking the primary to step down. Canonical wiki application of patterns/graceful-leader-demotion on the PRS side. Seventh canonical Vitess-internals disclosure on the wiki after evalengine (2025-04-05), VReplication / VDiff / MoveTables (2026-02-16), Consistent Lookup Vindex (2026-04-21), Throttler trilogy (2026-04-21), Vitess 21 release notes (2026-04-21), and VStream CDC (2026-04-21) — fills the leader election / reparenting axis. Post is theoretical / conceptual rather than a production retrospective; internal PRS / ERS mechanism details (topology-service interactions, candidate-selection logic, errant-GTID handling, buffer sizing, drain timeouts) are elided in favour of the higher-level revoke / establish framing.
-
— canonical wiki first-citation of Vitess's consolidator, the query-routing-tier primitive that merges multiple simultaneously-arriving identical SELECT queries into a single upstream MySQL execution and fans the result back to every waiting caller. Jarod Reyes (PlanetScale, 2021-09-30) names the primitive and its motivation without disclosing internals: "Vitess also makes sure that identical requests are automatically served to multiple clients simultaneously through a single query. … if 3 million people go to your YouTube video at once, Vitess will notice that multiple clients are simultaneously (or nearly simultaneously) attempting the same query and serve them all from the same connection." Canonicalised on the wiki as concepts/query-consolidation + patterns/consolidate-identical-inflight-queries. The primitive prevents the specific thundering-herd shape where a slow hot-row query cascades into pool exhaustion — canonical YouTube-viral-video framing. Post also references the 1M-concurrent-connections benchmark as empirical anchor for Vitess's "nearly limitless" connection scaling claim, vs RDS MySQL's 16,000-connection ceiling. Mechanism specifics (consolidation window, hash semantics, scope boundary, correctness invariant) remain to be canonicalised by a subsequent Vitess-internals ingest — the Reyes post gives existence + motivation, not spec.
-
— canonical wiki attribution of Vitess unmanaged tablets as the composition primitive under PlanetScale Database Imports (Phani Raju, 2021). First wiki page naming the Vitess feature that lets Vitess attach to an externally-managed MySQL without owning its process lifecycle — the mechanism by which workflow machinery (VReplication, MoveTables, routing rules) operates against foreign MySQL endpoints. Also canonicalises the early database-as-data-router cutover shape (see patterns/database-as-data-router) where the destination transparently proxies writes back to the source during bidirectional validation. The 2021 post is the earliest public disclosure of the import design; Matt Lord's 2026 deep-dive adds the mechanism details and evolves cutover to an atomic SwitchTraffic.
-
sources/2026-04-16-cloudflare-deploy-postgres-and-mysql-databases-with-planetscale-workers — named as the MySQL engine under PlanetScale's "Vitess MySQL" offering in the Cloudflare integration.
-
— Vitess composes with PlanetScale's transactional SPFresh-inside-InnoDB vector index to support sharded transactional vector indexes: "Together with Vitess, PlanetScale's sharding layer, this allows the construction and efficient querying of huge vector indexes that are fully integrated with all the relational data in your database and can be used with JOINs and WHERE clauses while the underlying vector data is continuously updated." First wiki datum on Vitess sharding an ANN index rather than just B+tree rows.
-
— canonical disclosure of Vitess's SQL expression evaluation engine (evalengine) living inside vtgate. Vitess must evaluate scalar SQL sub-expressions locally whenever they operate on cross-shard aggregate results — HAVING, GROUP BY-clause arithmetic on aggregates, filter predicates over cross-shard joins. Vicent Martí retrospective on replacing the original AST interpreter with a bytecode-less virtual machine — compile each SQL sub-expression's AST into a []func(*VirtualMachine) int slice of Go closures (patterns/callback-slice-vm-go), driven by planner-derived static type specialization using the MySQL information schema. Performance: geomean −48.60% sec/op vs the original AST interpreter, faster than MySQL's C++ implementation on 4/5 benchmarks, zero memory allocations on 4/5 benchmarks. Original AST interpreter retained permanently as deoptimization fallback (for value-dependent type promotions like -BIGINT_MIN → DECIMAL) + one-shot evaluator (for single-pass uses like constant folding) + fuzz-oracle sibling (Vitess's fuzzer compares the VM, the AST interpreter, and MySQL's C++ engine against each other — it has routinely surfaced bugs in MySQL itself that Vitess upstreams). First canonical wiki disclosure of a Vitess sub-engine.
-
— Vitess explicitly rejected for Postgres. PlanetScale's launch announcement for PlanetScale for Postgres canonicalises the claim that Vitess's design is MySQL-specific by construction and cannot simply be ported across engines: "Vitess' achievements are enabled by leveraging MySQL's strengths and engineering around its weaknesses. To achieve Vitess' power for Postgres we are architecting from first principles." Load-bearing wiki datum that a sharding layer's correctness/performance derives from engine-specific properties (replication protocol, binlog / WAL format, MVCC shape, DDL characteristics). PlanetScale's Postgres sharding successor is the separate Neki system (neki.dev, waitlist-only). Establishes the wiki's canonical narrative of Vitess: the architectural leverage Vitess provides is MySQL-shaped and doesn't port; per-engine-from-scratch sharding is the design stance (canonicalised as patterns/architect-sharding-from-first-principles-per-engine).
-
— canonical wiki deep-dive on Vitess's data-motion subsystems. Matt Lord (Vitess core maintainer) documents the three primitives Vitess uses for every zero-downtime migration, reshard, and online schema change: VReplication (per-stream snapshot + binlog-catch-up + continuous replication with per-stream GTID tracking persisted in sidecar vreplication/copy_state tables); VDiff (zero-downtime consistency verification via source + per-target-shard consistent snapshots at matching GTID positions with concurrent full-table scans); MoveTables (the user-facing workflow wrapper composing VReplication + VDiff + SwitchTraffic cutover with query buffering + reverse-VReplication-workflow creation + ReverseTraffic + Complete lifecycle). First canonical wiki description of Vitess's role in data motion — prior Vitess ingests covered expression evaluation (evalengine) and vector-index composition (SPFresh-inside-InnoDB-sharded-by-Vitess). Canonical wiki instance of VTGate as the query-buffering proxy at cutover and of topology-server-mediated workflow locks for cross-keyspace atomicity. Sibling to the 2025-04-05 evalengine post on the Vitess-internals axis — the two ingests together bracket the VReplication (data motion) and evalengine (expression evaluation) halves of Vitess's non-routing runtime.
-
— canonical wiki deep-dive on Vitess's query-routing + transactional-write layer: Vindexes + the Consistent Lookup Vindex mechanism. Harshit Gangal + Deepthi Sigireddi (Vitess core maintainers) walk through how Vitess maps rows to shards via Vindex column → keyspace_id tables (see concepts/keyspace-id), and how the Consistent Lookup Vindex avoids 2PC on every DML by ordering commits across three MySQL connections (Pre/Main/Post) with rollback in the same order — canonical wiki instance of patterns/ordered-commit-without-2pc. The authoritative-user-table invariant ("The lookup Vindex table may be inconsistent with the user table but the results returned for the query remained consistent with the user table") is what makes the weaker-than-2PC guarantee sufficient for user-facing queries. Canonical wiki introduction of orphan lookup rows (lazily reclaimed on the next colliding insert via a for update + authoritative-table verification + repoint sequence), the identity-update no-op optimisation (to avoid a DML deadlocking against itself across its own Post/Pre connections), and the two-statements-in-one-transaction deadlock limitation that identity-update no-op doesn't cover. Third canonical Vitess-internals disclosure on the wiki after the 2025-04-05 evalengine post + the 2026-02-16 VReplication / VDiff / MoveTables post — together these three bracket the query-routing, data-motion, and expression-evaluation subsystems of Vitess's non-storage runtime.
-
sources/2026-04-21-planetscale-anatomy-of-a-throttler-part-2 — canonical wiki disclosure of Vitess's tablet throttler architecture. Shlomi Noach (Vitess maintainer) documents the topology: one throttler per vttablet (which maps 1:1 to a MySQL database server); per-host throttlers collect local host + MySQL metrics; the shard-primary's throttler aggregates every replica throttler's metrics to represent the "shard" throttler. Canonical wiki instance of per-host + shard-primary rollup hierarchy and of the host-scope vs shard-scope distinction that maps workloads to the right throttler: "Thus, a massive write to the primary is normally throttled by replication lag, a metric collected from all serving replicas. Clients consult with the primary throttler … In contrast, some operations only require a massive read, which can take place on a specific replica … The client can therefore suffice in checking the throttler on the specific replicas." Deliberate no cross-shard throttler communication — fan-out bounded by shard topology. Also the wiki's canonical source for: replication-lag heartbeats (the pt-heartbeat-style timestamp-injection technique that dominates MySQL lag measurement, and its binlog-cost tradeoff); throttler hibernation (slow or stop metric collection + heartbeat injection during idle periods, re-ignite on first client request, first few checks reject on stale data, client retry is the compensation); fail-open vs fail-closed HA semantics for the client side; and the layered polling-interval staleness bound for agent-mediated topologies (1 Hz agent + 1 Hz throttler → up to 2 s stale vs 1 s for direct access). Fourth canonical Vitess-internals disclosure on the wiki alongside evalengine (expression evaluation), VReplication / VDiff / MoveTables (data motion), and Consistent Lookup Vindex (query routing + transactional writes) — the throttler post opens a new axis: load admission control as the fourth Vitess non-storage subsystem. Noach frames Vitess's design stance explicitly: client retry is load-bearing throughout (free-pass windows after successful checks, cold-start windows after hibernation, bounded wait under throttler-unavailable).
-
— Vitess 21 release announcement (PlanetScale Vitess Engineering Team, same publication day as the Consistent Lookup Vindex + Throttler-series posts). Canonicalises five architectural shipping-points across the release: (1) Atomic distributed transactions reintroduced with deeper integration into VReplication + Online DDL + MoveTables + Reshard — the strong-atomicity endpoint of the cross-shard-write design space, complementing the same-day-shipping ordered-commit Consistent Lookup Vindex (weaker-but-cheaper endpoint); (2) Experimental recursive CTEs landing in the evalengine SQL surface — first wiki disclosure; (3) Reference-table materialization as a first-class Materialize primitive via VReplication — new canonical patterns/reference-table-materialization-via-vreplication pattern; (4) Dynamic VReplication workflow configuration — runtime knobs moved off VTTablet process flags into the workflow control plane, canonical wiki instance of control-plane / data-plane separation at the config-surface altitude; (5) Multi-metric throttler shipping with v20 wire-format backward compat (v22 removes compat) — companion datum to the Anatomy of a Throttler series' design-space exposition. Additional new Vitess subsystems canonicalised on the wiki: mysqlshell logical backup engine (contributed by Slack Engineering — first external-company-contributed engine into Vitess's backup subsystem, materialised via the new patterns/logical-backup-engine-plug-in pattern); vexplain observability surface with new vexplain trace (JSON execution trace for distributed query analysis) + vexplain keys (column-usage analysis for indexing / sharding candidates, usable against standalone MySQL too). Vitess Operator 2.14 adds VTGate HPA autoscaling, Kubernetes 1.31 support, and per-keyspace Docker image selection — first wiki datum on VTGate horizontal autoscaling under a Kubernetes HPA. VTOrc gains an errant-GTID count metric as a precursor-of-reparent visibility primitive. Fifth canonical Vitess-internals disclosure on the wiki — this one spans multiple subsystems (cross-shard writes, data motion, load admission, backup, observability, operator tooling) whereas prior disclosures were single-axis deep-dives.
-
— Vitess's OSS / private-fork management story. Manan Gupta retrospective on how PlanetScale keeps its private Vitess fork aligned with upstream OSS. Not a Vitess-internals disclosure — Vitess is the subject of the fork, not the infrastructure. Three-stage evolution: (1) a weekly GitHub Action cherry-picking the whole private diff onto OSS main; (2) git-replay, an internal batch tool with conflict-resolution memoisation; (3) the Vitess cherry-pick bot with a GitHub Actions cron + PlanetScale-DB state. Formalises the branch-pair mirror topology OSS main ↔ private upstream, OSS release-x.0 ↔ private latest-x.0 (concepts/fork-upstream-sync). Canonicalises four new fork-sync patterns: patterns/automated-upstream-cherry-pick-bot; draft PR on conflict with do not merge + Conflict labels + a git status comment tagging the original author (no pipeline stall); label-triggered backport via Backport to: latest-x.0 labels (author-owns-decision + silent-omission failure mode); patterns/weekly-reconciliation-check with two explicit audits (upstream-in-sync-with-OSS and latest-branches-consistent) posted to a dedicated GitHub issue for human triage (concepts/weekly-integrity-reconciliation). Reported track record "a year and six months" in production at time of writing. -
— Phani Raju (PlanetScale, July 2022) on how PlanetScale shipped per-password roles on top of Vitess's native table-ACL mechanism. The post is the canonical wiki disclosure of Vitess's static-JSON-config authz shape (table_groups with table_names_or_prefixes + readers / writers / admins user lists, reloaded from disk via --table-acl-config-reload-interval) and of the table-group ACL concept it implements. Seventh distinct Vitess-internals axis on the wiki (after expression evaluation, data motion, query routing / transactional writes, load admission / throttler, fork management, and the new backup / observability tooling from Vitess 21). Architecturally the post is about side-stepping rather than extending the native mechanism: PlanetScale's managed-multi-tenant shape breaks the table-ACL's three working assumptions (pre-defined user set, small number of ACL files, maintenance-scheduled updates), so they hard-code three synthetic usernames into a universal ACL config shipped identically to every vttablet, store per-customer passwords + roles in an external credential store, and rewrite the security principal on the fly at the user query frontend before the query reaches Vitess. The full shape is canonicalised as the patterns/external-credential-store-with-principal-rewrite pattern. First wiki datum on Vitess's authz subsystem. -
sources/2026-04-21-planetscale-behind-the-scenes-how-schema-reverts-work — eighth distinct Vitess-internals axis on the wiki (after expression evaluation, data motion, query routing / transactional writes, load admission / throttler, fork management, backup / observability tooling from Vitess 21, and authz). Guevara + Noach frame Vitess's VReplication-driven online DDL as architecturally distinctive on five design properties — copy-and-changelog progress both tracked (not just backfill); per-transaction GTID mapping; GTID-set-driven interleaving between copy and change-log phases; transactional sidecar-state coupling with the destination write; and crucially "Unlike any other schema change solution, Vitess does not terminate upon migration completion." The non-termination property is the load-bearing enabler for instant schema revert via inverse replication: after cut-over the shadow table and its VReplication stream stay alive, the stream is re-primed in the inverse direction, and the old-schema table becomes a hot inverse shadow so a revert is a second freeze-point swap of two already-in-sync tables, not a second data copy. New canonical concepts concepts/shadow-table, concepts/cutover-freeze-point, concepts/pre-staged-inverse-replication and patterns patterns/shadow-table-online-schema-change, patterns/instant-schema-revert-via-inverse-replication originate from this post. Canonicalises VReplication as the substrate for both online schema changes and their inverse (schema reverts), extending the wiki's existing canonicalisation of VReplication as the substrate for data-motion cutovers. The framing of revert as a "revolving door" mirrors the patterns/reverse-replication-for-rollback framing at the data-motion scale — one architectural principle ("keep the inverse replication alive past cut-over so nothing is a one-way door") at two scales.
-
— canonical wiki disclosure of Vitess's public CDC entrypoint: the VStream gRPC API exposed by every VTGate. Matt Lord (Vitess core maintainer) walks through the two-RPC layering — tablet-level VStream used by VReplication internally, VTGate-level VStream fanned across all shards of a keyspace for external CDC drivers — and shows via a two-shard customer keyspace worked example the full event vocabulary (FIELD, ROW, VGTID, BEGIN, COMMIT, COPY_COMPLETED) emitted by a running VStream. First canonical wiki framing of Vitess as the sharding layer that owns the change stream — unified keyspace-wide change stream where VGTID is the one progress token the consumer persists. The post names four driver ecosystems composing on the VStream API (Debezium, Airbyte source, Fivetran source, PlanetScale Connect) — canonical wiki instance of the new patterns/cdc-driver-ecosystem pattern. Closing guidance canonicalised: "use a Vitess variant of the connector/driver rather than the MySQL one" — engine-native CDC tooling is single-shard-blind on sharded Vitess clusters. Sixth canonical Vitess-internals disclosure on the wiki after evalengine, VReplication / VDiff / MoveTables, Consistent Lookup Vindex, Throttler, and Vitess 21 release notes — this one fills in the CDC / data-pipeline axis of the public API surface. -
— Brian Morrison II (PlanetScale, 2022-10-21) publishes the pedagogy-101 on-ramp for Vitess: the resiliency / scalability / performance buzzword triad mapped onto the three load-bearing primitives. Resiliency = "running multiple instances of MySQL (on one or more servers) and uses a lightweight proxy, known as VTGate, to intelligently route queries to the proper MySQL instance" with automatic failover ("Vitess can also automatically detect when a MySQL instance goes offline and determine the best candidate to take its place as the primary MySQL process to serve queries for a given table"). Scalability = transparent horizontal sharding ("It can split tables up across multiple MySQL instances to balance the load across multiple servers. When a query is received by the VTGate, the system will automatically determine which MySQL instances a row or set of rows lives in, will adjust the query to simultaneously grab the rows from these instances, and return the data just as if you were querying data from a single database. All of this is completely transparent to the developer — and perhaps more importantly, the user!"). Performance = two-tier connection-pool architecture ("Vitess takes the lightweight connections established by each client to VTGate and maps them to a smaller pool of MySQL connections managed by VTTablet. This process in turn helps to avoid overloading the individual MySQL processes, resulting in lower resource utilization since only VTTablet needs to connect to the underlying MySQL process"). Names the Go + gRPC implementation substrate ("The various Vitess components are written with Go and internally communicate with one another over gRPC. With the concurrency features built into the Go language, Vitess is able to easily handle thousands of clients simultaneously") and the 2010-YouTube origin + contributor roster ("contributions from PlanetScale, Google, GitHub, Slack, Square, Stripe, and several more data-heavy companies"). 
Beginner-audience scope disposition: no architecture diagrams, no production numbers, no benchmarks; canonical first-principles statement that the quantitative sibling posts measure and elaborate (1M-connections, connection-pooling-in-vitess, scaling-hundreds-of-thousands-K8s). Brian Morrison II's earliest-by-date wiki-represented Vitess piece; predates his 2023-11-20 sharding-benefits and 2024-03-19 UUID-PK deep-dives. Scatter-gather costs completely elided, failover mechanism hand-waved, VSchema/VIndex absent, "thousands of clients" claim is three orders of magnitude below the 1M ceiling canonicalised elsewhere — this is the on-ramp, not the architecture disclosure.
- — Canonical shortest-form framing of Vitess as the tier-1 clustering substrate in PlanetScale's data-safety envelope. Sam Lambert (PlanetScale CEO, 2023-06-28) opens the data-safety post with "Whenever you create a database on PlanetScale you are actually creating a complete Vitess cluster. Vitess is an open-source database clustering system that enhances the scalability and manageability of MySQL." Adds one new operational datum: "In the time it takes you to read this blog post, Vitess clusters will have served 10s of millions of users and 100s of millions of queries across 100s of petabytes of data" — 100s of petabytes aggregate Vitess-managed data across Slack / Hubspot / Etsy as of 2023-06-28, the largest aggregate-scale datum on the Vitess page. Names Slack, Hubspot, Etsy as "primary datastore" users — i.e., load-bearing for those companies, not peripheral caches. Pairs with concepts/storage-engine-maturity-as-data-risk framing later in the same post: "Vitess, which has served some of the largest sites on the planet for over a decade" — the 10+-years-of-production argument for trusting Vitess's code paths.
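The throttler entry earlier in this list describes a per-host plus shard-primary rollup and a layered polling-interval staleness bound. A minimal Python sketch of both ideas follows. The class names (`HostThrottler`, `ShardThrottler`), the `check` method, and the 5-second threshold are hypothetical illustrations rather than Vitess APIs, and using the maximum replica lag as the shard-level metric is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class HostThrottler:
    """Hypothetical per-host throttler holding one locally collected metric."""
    host: str
    replication_lag_s: float  # local replication lag, in seconds

    def check(self, threshold_s: float) -> bool:
        # Host scope: a heavy read on this replica consults only this host.
        return self.replication_lag_s <= threshold_s


@dataclass
class ShardThrottler:
    """Hypothetical shard-scope throttler living on the shard primary."""
    replicas: list = field(default_factory=list)

    def shard_lag_s(self) -> float:
        # Assumption: the shard metric is the worst lag across serving replicas.
        return max((r.replication_lag_s for r in self.replicas), default=0.0)

    def check(self, threshold_s: float) -> bool:
        # Shard scope: a heavy write to the primary consults the rollup.
        return self.shard_lag_s() <= threshold_s


def worst_case_staleness_s(*polling_intervals_s: float) -> float:
    """Each polling layer can add up to one full interval of staleness."""
    return sum(polling_intervals_s)


replicas = [HostThrottler("replica-1", 0.4), HostThrottler("replica-2", 7.5)]
shard = ShardThrottler(replicas)

print(shard.check(threshold_s=5.0))        # False: replica-2 lags past the threshold
print(replicas[0].check(threshold_s=5.0))  # True: replica-1 itself is healthy
print(worst_case_staleness_s(1.0, 1.0))    # 2.0: 1 Hz agent + 1 Hz throttler
print(worst_case_staleness_s(1.0))         # 1.0: direct access
```

The sketch has no cross-shard path: each `ShardThrottler` only ever sees its own replicas, which mirrors the bounded fan-out described in the entry.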
Related¶
- systems/mysql
- systems/planetscale
- systems/planetscale-for-postgres
- systems/neki
- systems/postgresql
- systems/innodb
- systems/spfresh
- systems/vitess-evalengine
- systems/vitess-vreplication
- systems/vitess-vdiff
- systems/vitess-movetables
- systems/vitess-mysqlshell-backup
- systems/vitess-vexplain
- systems/kubernetes
- patterns/vector-index-inside-storage-engine
- patterns/callback-slice-vm-go
- patterns/static-type-specialized-bytecode
- patterns/vm-ast-dual-interpreter-fallback
- patterns/fuzz-ast-vs-vm-oracle
- patterns/architect-sharding-from-first-principles-per-engine
- patterns/snapshot-plus-catchup-replication
- patterns/vdiff-verify-before-cutover
- patterns/routing-rule-swap-cutover
- patterns/reverse-replication-for-rollback
- patterns/read-replica-as-migration-source
- patterns/ordered-commit-without-2pc
- patterns/singular-vs-distributed-throttler
- patterns/host-agent-metrics-api
- patterns/throttler-per-shard-hierarchy
- patterns/idle-state-throttler-hibernation
- concepts/bytecode-virtual-machine
- concepts/consistent-non-locking-snapshot
- concepts/gtid-position
- concepts/binlog-replication
- concepts/query-buffering-cutover
- concepts/reverse-replication-workflow
- concepts/schema-routing-rules
- concepts/fault-tolerant-long-running-workflow
- concepts/online-database-import
- concepts/vindex
- concepts/consistent-lookup-vindex
- concepts/keyspace-id
- concepts/orphan-lookup-row
- concepts/throttler-fail-open-vs-fail-closed
- concepts/throttler-metric-scope
- concepts/throttler-hibernation
- concepts/replication-heartbeat
- concepts/metric-staleness-from-polling-layers
- concepts/fork-upstream-sync
- concepts/conflict-resolution-memoization
- concepts/weekly-integrity-reconciliation
- concepts/shadow-table
- concepts/cutover-freeze-point
- concepts/pre-staged-inverse-replication
- concepts/online-ddl
- systems/vitess-cherry-pick-bot
- systems/git-replay
- systems/github-actions
- systems/git
- patterns/automated-upstream-cherry-pick-bot
- patterns/draft-pr-for-conflicts
- patterns/label-triggered-backport
- patterns/stateful-github-actions-cron
- patterns/weekly-reconciliation-check
- patterns/shadow-table-online-schema-change
- patterns/instant-schema-revert-via-inverse-replication
- companies/cloudflare
- systems/vitess-table-acl
- systems/planetscale-user-query-frontend
- concepts/table-group-acl
- concepts/on-the-fly-security-principal-rewrite
- concepts/external-credential-store
- patterns/external-credential-store-with-principal-rewrite
- systems/vitess-vstream
- systems/planetscale-connect
- systems/debezium
- concepts/vgtid
- concepts/unified-change-stream-across-shards
- concepts/change-data-capture
- concepts/oltp-vs-olap
- patterns/cdc-driver-ecosystem
- concepts/foreign-key-constraint
- concepts/innodb-silent-cascade-in-binlog
- concepts/vitess-foreign-key-enforcement
- concepts/innodb-internal-operations-table
- concepts/innodb-parent-table-rename-pinning
- concepts/cyclic-foreign-key-prohibition
- patterns/application-level-cascade-orchestration
- patterns/nowait-lock-for-cascade-select
- systems/planetscale-mysql-server-fork
Seen in: consensus commit-path framing¶
- sources/2026-04-21-planetscale-consensus-algorithms-at-scale-part-6-completing-requests — Sugu Sougoumarane presents the two-phase commit-path protocol (tentative → durable → complete) from the perspective of a Vitess-class operational system. The framework motivates why Vitess-on-MySQL deployments manage semi-sync split-brain operationally: MySQL semi-sync lacks the two-phase shape, so Vitess uses reparenting (PRS / ERS), vttablet lameduck drain, and vtgate query buffering to bound the hazard window. Cross-instalment payoff: the lock-based election + leader lease recommended in part 5 composes with early-ack on durability from part 6 to give both cheap writes and cheap leader-local reads.
Related (consensus series)¶
- concepts/two-phase-completion-protocol / concepts/tentative-request / concepts/durable-request / concepts/request-cancellation — the commit-path vocabulary.
- patterns/two-phase-tentative-then-complete / patterns/early-ack-on-durability / patterns/skip-completion-for-late-followers — the pattern family.
- concepts/mysql-semi-sync-split-brain — the hazard Vitess manages around.
- concepts/quorum-read — the read-path alternative Vitess sidesteps via the lease-backed stable leader.
Seen in: Slack Unified Grid re-architecture precondition¶
- sources/2024-08-26-slack-unified-grid-how-we-re-architected-slack-for-our-largest-customers
— first wiki-canonical non-PlanetScale Vitess-user disclosure.
Slack's 2024-08-26 Unified Grid retrospective names the
Vitess migration as one of two architectural preconditions
that made the workspace→org-wide re-architecture tractable:
"With the Vitess migration, we began sharding data along
axes other than workspace or org ID, meaning that the
workspace or org was no longer required to route queries to
the appropriate database shard for our most important tables."
The messages table is the disclosed example — sharded by channel ID rather than workspace ID, so message queries no longer needed the workspace-routing context the session tokens carried. Canonical substrate-precondition framing for concepts/workspace-scoped-to-org-wide-migration: the re-axis of session-scope is cheaper to do after the hot tables have been re-sharded off the old axis. See also the prior Slack engineering post Scaling Datastores at Slack with Vitess cited in the Unified Grid post — candidate for later standalone wiki ingest with mechanism depth on the shard-key decisions for other tables.
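To illustrate why the choice of shard key changes what a query needs in order to be routed, here is a minimal Python sketch. It does not reproduce how Vitess or Slack compute keyspace IDs: the SHA-256 based `shard_for` helper, the four-shard layout, and the sample IDs are all assumptions made for illustration.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a sharding-key value to a shard index (illustrative hash only)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route_message_query(channel_id: str) -> int:
    # With messages sharded by channel ID, the channel alone picks the shard;
    # no workspace or org ID is needed in the routing context.
    return shard_for(channel_id)


# Two hypothetical workspaces reading the same channel land on the same shard,
# because the workspace is not part of the routing decision.
requests = [
    {"workspace_id": "W1", "channel_id": "C42"},
    {"workspace_id": "W2", "channel_id": "C42"},
]
shards = {route_message_query(r["channel_id"]) for r in requests}
print(len(shards))  # 1
```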
Seen in: production-user scale datapoints¶
- — Lucy Burns + Taylor Barnett (PlanetScale, 2023-08-31) triples the wiki's Vitess production-user datapoint ledger with three named non-PlanetScale deployments at webscale: (1) JD.com — 35 million QPS during Singles Day, the largest Vitess production number disclosed in wiki sources to date; (2) Slack — full Vitess migration before the WFH 2020 traffic influx, canonicalising the pandemic-traffic stress test as a Vitess-resilience milestone (predecessor to the shard-key re-axis documented in the Unified Grid post); (3) Square — Cash app on Vitess, adding regulated-payments workloads to the Vitess production taxonomy. Hubspot is named without detail as a fourth Vitess user. Also canonicalises 2010 as the Vitess origin year at YouTube — the missing date for this page's YouTube-origin framing. And names the PlanetScale positioning verbatim: "running Vitess at scale still requires a whole engineering team with the right experience. Not all organizations have the depth that Slack and Square do" — Slack and Square as the reference-class organisations that do have the depth to self-host Vitess, i.e. the cutoff against which the managed PlanetScale offering is positioned. MyFitnessPal is quoted saying "We wanted PlanetScale and Vitess to bring to MyFitnessPal what Kubernetes brought to application delivery and deployment. Databases are hard. We would rather PlanetScale manage them" — the Kubernetes-for-databases analogy framed by a customer.
- Also canonicalises the pre-Vitess-era historical framing via patterns/application-level-sharding: Pinterest (2012, Marty Weiner, "We had several NoSQL technologies, all of which eventually broke catastrophically") and Etsy (two-way primary-key → shard_id lookup, shard-packing onto hosts) as the two canonical application-level-sharding precedents. Both later migrated some workloads to Vitess — the self-correction from within the reference-class precedents, establishing application-level sharding as a legacy pattern relative to the substrate-layer alternative Vitess provides.
Seen in: 1M QPS benchmark demonstration¶
- — Jonah Berquist (PlanetScale, 2022-09-01) canonicalises
horizontal sharding's linear QPS scaling on a Vitess-on-MySQL
cluster with three disclosed datapoints against a Percona
sysbench-tpcc workload: 16 shards = 420k QPS, 32 shards = 840k QPS, 40 shards = 1M+ QPS sustained over 5 minutes. The 16 → 32 step is a clean 2× on both axes — canonical wiki evidence for concepts/linear-shard-count-throughput-scaling. The 40-shard datum also demonstrates that Vitess shard counts are not restricted to powers of 2 ("while we like powers of 2, this isn't a limitation, and we can use other shard counts") — the configuration was sized specifically to hit the 1M-QPS goal (32 * 1M / 840k ≈ 38, rounded up to 40).
Within a single configuration, Berquist names the saturation signal canonically: for the 16-shard run, "the QPS increase was greater between 1024 threads and 2048 threads than it was between 2048 threads and 4096 threads" + VTGate p99 "spiking toward the end" while QPS was still climbing. This is the first wiki instance of the concepts/latency-rises-before-throughput-ceiling diagnostic — p99 rises faster than p50 while QPS is still growing, firing the "add shards" signal earlier than waiting for the absolute QPS plateau. VTGate's role in the benchmark: the source of the p50 + p99 latency measurements, because its routing-only position in the data path gives the canonical server-side latency view.
Caveats Berquist discloses: single-tenant enterprise-sized
deployment with non-default query + transaction timeout
tuning — the 1M-QPS figure is a substrate-capability
demonstration, not a shared-tenant baseline. The benchmark
doesn't disclose underlying per-shard MySQL instance sizing
or the sysbench-tpcc transaction mix; it's positioned as "not
a rigorous academic benchmark" with a promise of a later
academic-partnership publication. Pairs with the ceiling and
connection-count-axis sources as the three bracket points
for Vitess-at-scale numbers in the wiki.
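The sizing arithmetic quoted above (32 * 1M / 840k ≈ 38) can be reproduced with a short Python sketch. The helper names `per_shard_qps` and `shards_needed` are hypothetical, and the calculation assumes throughput stays linear in shard count, which the post only demonstrates for the 16 → 32 step.

```python
import math

# Datapoints disclosed in the post: shard count -> sustained QPS.
observed = {16: 420_000, 32: 840_000}


def per_shard_qps(shards: int, qps: int) -> float:
    """Average throughput contributed by each shard."""
    return qps / shards


def shards_needed(target_qps: int, ref_shards: int, ref_qps: int) -> int:
    """Smallest shard count reaching target_qps under linear scaling."""
    return math.ceil(target_qps * ref_shards / ref_qps)


rates = {n: per_shard_qps(n, q) for n, q in observed.items()}
print(rates)                                  # both runs: 26250.0 QPS per shard
print(1_000_000 * 32 / 840_000)               # roughly 38.1
print(shards_needed(1_000_000, 32, 840_000))  # 39
```

The strict minimum under this model is 39 shards; the post's choice of 40 leaves a little headroom above that.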
Seen in: TAOBench social-graph benchmark¶
- — Liz van Dijk (PlanetScale, 2022-09-08) canonicalises
a social-graph-shaped benchmark complement to the
sysbench-tpcc 1M-QPS post. TAOBench — Audrey Cheng (UC Berkeley) + Meta engineers' VLDB 2022 benchmark — synthesises Meta's production TAO workload into a runnable objects + edges schema (concepts/social-graph-objects-and-edges) that any relational DB can be benchmarked against. The PlanetScale run uses a 48-CPU-core resource cap (set by Cheng's team), allocated as 44 cores to the Vitess query path (VTGate + VTTablet + MySQL) and 4 cores to multi-tenant serverless overhead (edge load balancers) — canonicalises concepts/constrained-resource-benchmark and patterns/constrained-resource-benchmark-for-shared-tenant-capability-disclosure.
Load-bearing architectural framing: the benchmark's
objects + edges schema explicitly stresses
hot-row and
thundering-herd behaviour
— distinct from sysbench-tpcc's shard-key-aligned access
pattern, which has no hot rows by construction. TAOBench
is the first wiki-canonicalised benchmark that measures
substrate behaviour under viral-content skew by design.
PlanetScale's published takeaway is explicitly
graceful saturation — "sustained stability of
PlanetScale clusters under even the most extreme resource
pressure" — not peak throughput. Pairs with the
1M-QPS sysbench-tpcc post as the two complementary
axes of PlanetScale's 2022 Vitess benchmarking:
OLTP-shaped shard-linear scaling + social-graph-shaped
substrate-maturity-at-the-ceiling.
Seen in: Gen4 query-planner architectural disclosure¶
-
— Andrés Taylor (PlanetScale / Vitess core, 2023-06-01) canonicalises the architectural rewrite of the VTGate query planner from its old monolithic "plan-the-whole-horizon-in-one-go" model to the new step-by-step runnable-plan pipeline. Three new wiki primitives are canonicalised here:
-
Runnable-plan-at-every-step as an invariant — "every step in the optimization pipeline results in a runnable plan". Replaces the old model's opaque call-stack intermediate state. Unlocks phase-by-phase plan inspection + differential testing of the planner itself (run unoptimised plan + optimised plan, compare results, any divergence is a planner bug). -
-
Horizon operator — a placeholder bundling SELECT / ORDER BY / GROUP BY / LIMIT / aggregations that the planner tries to push wholesale to MySQL before expanding into constituent Ordering / Projection / Aggregator / Filter / Limit operators. Makes the "can we push this entire post-FROM fragment to one shard?" decision structurally explicit + preserves the runnable-plan invariant for pre-expansion plans. -
Offset Planning — a new pipeline stage between horizon planning and executable-plan emission, resolving symbolic column references to positional row offsets for the execution engine.
The full new-model pipeline: Parse → Determining Join Order →
Horizon Planning (fixed-point-looped) → Offset Planning →
Executable Plan. This 2023-06-01 post is the
pre-phase-ordering architectural disclosure — the 2024-07-22 sibling
post about phase ordering
builds its phase discipline inside horizon planning on top
of the pipeline this post establishes. Canonical wiki source
for patterns/runnable-plan-pipeline. Worked example
(SELECT u.foo, ue.bar FROM user u JOIN user_extra ue ON
u.uid = ue.uid ORDER BY u.baz) walks through four plan-tree
snapshots ending with a
nested-loop join — LHS query
SELECT u.foo, u.uid, u.baz, weight_string(u.baz) FROM user
AS u ORDER BY u.baz ASC, RHS query SELECT ue.bar FROM
user_extra AS ue WHERE ue.uid = :u_uid bound per row.
Expressiveness capability-unlock: the new planner supports
arbitrary expressions in ORDER BY / GROUP BY /
aggregations, evaluated via
evalengine at VTGate when MySQL can't push them down.
-
— 2021 feature launch: Vitess as the query-telemetry capture layer. David Graham announces that all PlanetScale branches track per-query execution statistics (count, rows returned, duration) with zero overhead on the MySQL server, because Vitess sits on the query path and can observe every query without needing server-side performance_schema work. "By using Vitess, which powers PlanetScale databases, to track how many times a query runs, how many rows it returned, and how long each query takes to complete, we provide a complete view of query traffic on the database." This is the foundational primitive underneath what later became PlanetScale Insights: query-digest fingerprinting, SQLCommenter-tagged observability, AI-powered index suggestions, and Traffic Control's resource budgets all build on this Vitess-layer telemetry. Canonical for query-statistics telemetry and the 100 ms slow-query threshold. The post doubles as a worked case study: Vitess's query-statistics feature surfaced a 719 ms backup-deletion query inside PlanetScale's own fleet, which was resolved by reordering the columns in a composite index (see concepts/composite-index-column-order) — dropping the query to under 20 ms (98 % reduction). -
— canonical wiki disclosure of a production Temporal-on-Vitess VSchema and the first wiki source naming xxhash as the concrete Primary Vindex function in a production Vitess configuration ("xxhash": {"type": "xxhash"}). Savannah Longoria (PlanetScale, 2022-12-14) walks the full VSchema shape for a two-keyspace split (13 sharded + 14 unsharded Temporal tables) and the migration path from one-unsharded-keyspace via MoveTables + SwitchTraffic — canonical production instance of Vitess's unsharded-to-sharded migration on a write-intensive Temporal workload. Confirms that Vitess's shard count stays mutable (via Reshard) even when composed with Temporal's immutable numHistoryShards constraint — the two sharding knobs are independent. First wiki disclosure that PlanetScale uses Temporal internally to automate Vitess release workflows. Empirical anchor: one customer's Temporal databases sustained 40k–200k QPS fluctuation across Black Friday / Cyber Monday 2022 with peaks to 180k, "no interruptions".
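The nested-loop join that closes the Gen4 planner worked example earlier in this list can be sketched in Python. This is an illustration of the join shape only, not VTGate's execution engine: the in-memory tables, their sample rows, and the function names are hypothetical, and the `weight_string(u.baz)` column from the LHS query is omitted because plain Python sorting stands in for collation-aware ordering.

```python
# Hypothetical in-memory stand-ins for the two tables, which in a sharded
# keyspace may live on different shards.
USER = [
    {"uid": 1, "foo": "a", "baz": 30},
    {"uid": 2, "foo": "b", "baz": 10},
    {"uid": 3, "foo": "c", "baz": 20},
]
USER_EXTRA = [
    {"uid": 1, "bar": "x"},
    {"uid": 2, "bar": "y"},
    {"uid": 2, "bar": "z"},
]


def run_lhs():
    """LHS: SELECT u.foo, u.uid, u.baz FROM user AS u ORDER BY u.baz ASC."""
    rows = [(u["foo"], u["uid"], u["baz"]) for u in USER]
    return sorted(rows, key=lambda r: r[2])


def run_rhs(u_uid):
    """RHS: SELECT ue.bar FROM user_extra AS ue WHERE ue.uid = :u_uid."""
    return [(ue["bar"],) for ue in USER_EXTRA if ue["uid"] == u_uid]


def nested_loop_join():
    out = []
    for foo, uid, _baz in run_lhs():   # LHS ordering is preserved in the output
        for (bar,) in run_rhs(uid):    # RHS runs once per LHS row, bound to its uid
            out.append((foo, bar))     # final projection: u.foo, ue.bar
    return out


print(nested_loop_join())  # [('b', 'y'), ('b', 'z'), ('a', 'x')]
```

Because the LHS is already sorted by `baz`, the join output needs no extra sort step, which is why the ORDER BY can be pushed into the LHS query. The user with `uid` 3 has no matching `user_extra` row and drops out, as expected for an inner join.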
Seen in: Foreign-key constraint support (~1-year engineering retrospective)¶
- — thirteenth canonical Vitess-internals disclosure on the
wiki. Shlomi Noach + Manan Gupta (2023-12-05) retrospective
on the ~1-year engineering investment to ship
foreign-key-constraint
support in Vitess while preserving Online
DDL, gated deployments, and online imports. Canonicalises:
(a) InnoDB's
silent-cascade-in-binlog property as the load-bearing
architectural constraint that forced Vitess to own FK logic
above the storage engine; (b)
Vitess's
application-level FK enforcement as the new subsystem —
VTGate plans cascade operations explicitly with
SELECT-FOR-UPDATE parent → DELETE child first → DELETE parent
recursion, so every child-side cascade lands in the binlog;
(c) ON UPDATE CASCADE requires FOREIGN_KEY_CHECKS=0 + Vitess-side validation of grandchild RESTRICTs the silence masks; (d) two MySQL-fork patches — rename_table_preserve_foreign_key + internal operations tables — bound the shadow-table / FK interaction surface so shadow-table Online DDL works on FK-carrying tables; (e) VReplication Database Imports forks snapshot+tail sequencing (single-bulk-snapshot + tail, PITR-style) for FK tables with cascading actions because the external source's binlog has the InnoDB cascade hole; (f) schemadiff becomes FK-aware — the deploy-path computation understands FK ordering constraints (create parent before child; index parent before adding FK on child) and migration concurrency blocks; (g) reverts with FK schemas allowed-with-orphan-row-warning rather than forbidden; (h) a new differential fuzzer altitude (patterns/mysql-compatible-differential-fuzzing) — this one targets FK-DML semantics, pitting Vitess against standalone MySQL under random queries on a 20-table schema with mixed FK relationships, with an e2e sidecar that strips FKs from replicas to use the cascade-in-binlog gap as a drift oracle (FK-stripped replicas drift iff Vitess lets InnoDB do any cascading). Four representative fuzzer bugs walked in the post (no-op CASCADE + RESTRICT validation; -0 FLOAT type coercion; missing gap-locks on REPLACE INTO-triggered cascade-select fixed via NOWAIT; cyclic-FK prohibition). Current scope: single-shard only as of 2023-12-05, with shard-scoped-in-multi-shard and cross-shard on the roadmap — the Vitess-owns-FK architectural choice is partially motivated by the cross-shard future. "Non-zero performance impact" framing; no specific numbers disclosed.
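The "SELECT-FOR-UPDATE parent → DELETE child first → DELETE parent" recursion from item (b) can be sketched as a small Python planner. It is a simplified illustration, not Vitess's planner: the three-table schema, the `plan_cascade_delete` helper, and the placeholder WHERE clauses are hypothetical, and the output is pseudo-SQL rather than executable statements.

```python
# Hypothetical schema: parent table -> child tables with ON DELETE CASCADE.
CASCADE_CHILDREN = {
    "customer": ["orders"],
    "orders": ["order_items"],
    "order_items": [],
}


def plan_cascade_delete(table, where="<matching rows>"):
    """Return the ordered pseudo-SQL steps for deleting rows from `table`."""
    # Lock and read the parent rows first so their keys can drive the children.
    steps = [f"SELECT ... FROM {table} WHERE {where} FOR UPDATE"]
    for child in CASCADE_CHILDREN.get(table, []):
        # Children go first, recursively, so every cascade is explicit DML
        # that reaches the binlog instead of a silent storage-engine cascade.
        steps += plan_cascade_delete(child, where=f"<fk references {table}>")
    steps.append(f"DELETE FROM {table} WHERE {where}")
    return steps


for stmt in plan_cascade_delete("customer"):
    print(stmt)
```

In this sketch the deletes are emitted leaf-first (`order_items`, then `orders`, then `customer`), so no step depends on the storage engine cascading on its own. A cyclic FK graph would make the recursion loop forever, which gives some intuition for the cyclic-FK prohibition mentioned above.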