CONCEPT
VTGate query planner¶
Definition¶
The VTGate query planner is the component inside Vitess's proxy tier (vtgate) that takes an incoming MySQL-protocol SQL statement and produces an execution plan — a tree of operators, some of which are pushed down to one or more backing MySQL shards (via Route operators) and some of which execute at VTGate (joins, projections, aggregations, sorts, filters that can't be pushed down).
The planner's core job is twofold:
- Push down as much as possible to MySQL, because MySQL is faster than VTGate at every relational operator MySQL supports. "We want to delegate as much work as possible to MySQL — it is much faster than Vitess at doing all the database operations, including grouping and aggregation" (Source: ).
- Execute at VTGate what cannot be pushed down — cross-shard joins, global aggregations, cross-keyspace operations — using VTGate's own operator primitives.
This is the architectural inversion from a "connection pooler / query router" framing of VTGate to a distributed query engine framing.
Operator taxonomy¶
The planner emits plan trees built from a fixed operator vocabulary:
- Route — the "push to MySQL" primitive. Everything under a
Routein the plan tree is sent as a single query to one or more MySQL shards. The route's shape is determined by Vindex resolution: - Single-shard route (shard key predicate fully resolves to one keyspace_id).
- Scatter route (scatter-gather — predicate doesn't include shard key).
- Multi-shard route (predicate resolves to a subset of shards via a lookup Vindex).
- NestedLoopJoin — the iterative-lookup join primitive. For each row from the LHS sub-plan, issue the RHS sub-plan bound to the join key.
- Project — compute arithmetic / scalar expressions on top of a sub-plan's rows, using evalengine for expression evaluation.
- OrderedAggregate / HashAggregate — VTGate-side group-by + aggregation, for the "global" step of local-global aggregation.
- Sort / MemorySort — VTGate-side ordering when MySQL's per-shard ordering isn't sufficient.
- Filter — VTGate-side row filtering (e.g.
HAVINGclauses on aggregated results). - Limit — VTGate-side row limiting (when applied across scatter results).
The Gen4 planner¶
Vitess's current planner is Gen4, a complete rewrite that replaced the earlier v3 planner. Gen4 was introduced around Vitess 10-11 and has been incrementally stabilised across subsequent releases (Vitess 21 release notes — Source: ). Gen4 supports:
- More SQL surface area (recursive CTEs — experimental in Vitess 21).
- Better cross-shard join planning (including the push-aggregation-under-join rewrite from Galindo-Legaria & Joshi, shipped as vitessio/vitess #9643).
- evalengine integration for constant folding, type-aware expression evaluation, and the Project operator's arithmetic.
The pushdown discipline¶
Every planner decision is a yes/no on push-down:
- Shard key predicate present → push down fully; Route to one shard.
- No shard key predicate, but operation is per-row (projection, filter, scalar arithmetic) → push down the scatter; let each shard do the per-row work in parallel.
- Operation requires coordination across shards (join across non-colocated shards, global aggregation, cross-shard sort) → split the operation; push down what's decomposable (local aggregation, per-shard sort); execute the cross-shard coordination step at VTGate.
- Operation requires information VTGate has but MySQL doesn't (Vindex routing decisions, cross-keyspace joins) → execute at VTGate.
Key planner rewrites¶
- Local-global aggregation decomposition — split
COUNT/SUM/MIN/MAX/AVGinto per-shard local + VTGate global. - Push aggregation under join — when
GROUP BY+ aggregate is over a cross-shard join, pre-aggregate each side and carry a multiplier column to preserve algebraic equivalence. - Consistent Lookup Vindex rewrites — for DMLs on tables with secondary Vindexes, emit the three-connection
Pre/Main/Postprotocol (Source: ). - Predicate pushdown — move
WHEREclauses as far down the plan tree as possible. - Projection pushdown — select only the columns each sub-plan needs; avoid shipping columns to VTGate that'll be projected away.
Relation to MySQL's own planner¶
VTGate's planner and MySQL's planner operate at different levels:
- MySQL's planner optimises within a single shard's data — index selection, join order,
WHEREclause evaluation order. - VTGate's planner optimises across shards — which operators go where, how to split aggregations, whether to push aggregation under a join.
A query sent to VTGate is planned twice: once by VTGate (to decide the cross-shard plan tree) and once by MySQL (to decide the per-shard execution of each Route's query). The two planners are decoupled and don't share statistics — VTGate presumably uses schema information + heuristic rules, while MySQL uses its cost-based optimiser on shard-local statistics.
Why cross-shard query planning is harder than single-node¶
- Cost model is non-uniform: local operations are microseconds; cross-shard network hops are milliseconds; the ratio is ~1000×. Planner decisions about where to execute operators dwarf local-optimiser decisions.
- Statistics are fragmented: each shard has its own statistics; no global statistics exist without additional machinery.
- Data movement dominates: planning is often about minimising data shipped between shards and VTGate, not about CPU minimisation.
- Correctness under pushdown is non-trivial: as the aggregation-pushdown case shows, naive pushdown produces wrong answers; each rewrite requires algebraic justification.
Seen in¶
- — canonical wiki introduction. Andres Taylor (2022-06-24) canonicalises the planner's role in splitting aggregation across shards, its pushdown discipline ("the planner tries pushing as much work down to MySQL as possible"), and the concrete aggregation-pushdown-under-join rewrite.
- — canonicalises the planner's role in emitting the three-connection
Pre/Main/Postprotocol for DMLs on tables with secondary Vindexes. - — Vitess 21 extends the planner with experimental recursive CTE support + deeper atomic distributed transaction integration.
- — canonicalises the evalengine that the planner delegates expression evaluation to for Project / Filter / Having operators at VTGate.
-
— canonical wiki disclosure of the planner's phase structure. Andres Taylor (2024-07-22) canonicalises planner phase ordering + fixed-point tree rewriting as the load-bearing planner architecture: "We have several phases that run sequentially. After completing a phase, we run the push-down rewriters, then move to the next phase, and so on." Names two phases explicitly — initial + split aggregation. Bug retrospective: OOM in VTGate caused by "push ordering under aggregation" firing in the initial phase, wedging
OrderingbetweenAggregatorandApplyJoin, blocking aggregation pushdown under join. Fix: gate the ordering rewriter behind the split-aggregation phase via patterns/phase-gated-planner-rewriter. Shipped as vitessio/vitess #16278. Sequel to the 2022-06-24 aggregation post — same mechanism, two years of production experience later. -
— canonical wiki disclosure of planner-level random-query differential fuzzing. Arvind Murty's 2023 summer-intern retrospective (published 2024-04-09) under Andrés Taylor's guidance canonicalises the planner as a correctness-probing target, not just an optimisation target. Fuzzer shape: bespoke random-query generator on
EMP/DEPTtables sharded onEMPNO/DEPTNO; generated query run against both Vitess and MySQL; byte-for-byte comparison; any divergence is a planner bug (or rarely a MySQL bug). Explicitly ruled out SQLancer on two grounds: logic-bug oracle is wrong substrate for MySQL-compat engines ("Vitess ideally should perfectly mimic MySQL, quirks included") and VSchema not modelled. Canonicalises patterns/mysql-compatible-differential-fuzzing at planner altitude (distinct from the expression-altitude patterns/fuzz-ast-vs-vm-oracle fuzzer inside evalengine). Minimal fuzzer-output queries are produced by Andrés Taylor's AST-based query simplifier — a brute-force delta-debugging loop Murty extended for end-to-end tests via VSchema threading (vitessio/vitess #13636). Planner fuzzer shipped as vitessio/vitess #13260. Together with the evalengine fuzzer, this post canonicalises Vitess's two-altitude fuzzing posture on the wiki: expression fuzzer tests scalar-SQL semantics; planner fuzzer tests whole-query planning + execution. -
— canonical wiki disclosure of the planner's architectural rewrite ("old model" vs "new model"). Andrés Taylor (2023-06-01) canonicalises three load-bearing new-model properties: (1) the runnable-plan-at-every-step invariant — "every step in the optimization pipeline results in a runnable plan" — which replaces the old model's call-stack-resident intermediate state; (2) the Horizon operator as a placeholder bundling
SELECT/ORDER BY/GROUP BY/LIMIT/ aggregations that the planner tries to push wholesale to MySQL before expanding; (3) Offset Planning as a new pipeline stage between horizon planning and executable-plan emission. Also canonicalises the tree-transformation arity ("each step is also simpler — it's a tree transformation taking two operators as input") and the arbitrary-expression capability unlock inORDER BY/GROUP BY/ aggregations enabled by evalengine integration. This is the pre-phase-ordering architectural disclosure — the 2024-07-22 post about phase ordering builds on top of the pipeline this 2023-06-01 post describes. Worked example:SELECT u.foo, ue.bar FROM user u JOIN user_extra ue ON u.uid = ue.uid ORDER BY u.bazthrough four progressive plan-tree snapshots, ending with a nested-loop join emittingSELECT u.foo, u.uid, u.baz, weight_string(u.baz) FROM user AS u ORDER BY u.baz ASCon the LHS + a per-rowSELECT ue.bar FROM user_extra AS ue WHERE ue.uid = :u_uidon the RHS. Canonical source for patterns/runnable-plan-pipeline.
Related¶
- concepts/local-global-aggregation-decomposition
- concepts/push-aggregation-under-join
- concepts/cross-shard-query
- concepts/scatter-gather-query
- concepts/vindex
- concepts/keyspace-id
- concepts/nested-loop-join
- systems/vitess
- systems/vitess-evalengine
- systems/mysql
- concepts/planner-phase-ordering
- concepts/fixed-point-tree-rewriting
- concepts/query-simplifier
- concepts/sqlancer-logic-bug
- patterns/phase-gated-planner-rewriter
- patterns/mysql-compatible-differential-fuzzing
- patterns/runnable-plan-pipeline
- concepts/horizon-operator
- concepts/runnable-plan-at-every-step
- concepts/offset-planning
- systems/sqlancer