CONCEPT Cited by 4 sources
VTGate query planner¶
Definition¶
The VTGate query planner is the component inside Vitess's proxy tier (vtgate) that takes an incoming MySQL-protocol SQL statement and produces an execution plan — a tree of operators, some of which are pushed down to one or more backing MySQL shards (via Route operators) and some of which execute at VTGate (joins, projections, aggregations, sorts, filters that can't be pushed down).
The planner's core job is twofold:
- Push down as much as possible to MySQL, because MySQL is faster than VTGate at every relational operator MySQL supports. "We want to delegate as much work as possible to MySQL — it is much faster than Vitess at doing all the database operations, including grouping and aggregation" (Source: sources/2026-04-21-planetscale-grouping-and-aggregations-on-vitess).
- Execute at VTGate what cannot be pushed down — cross-shard joins, global aggregations, cross-keyspace operations — using VTGate's own operator primitives.
This is the architectural inversion from a "connection pooler / query router" framing of VTGate to a distributed query engine framing.
Operator taxonomy¶
The planner emits plan trees built from a fixed operator vocabulary:
- Route — the "push to MySQL" primitive. Everything under a
Routein the plan tree is sent as a single query to one or more MySQL shards. The route's shape is determined by Vindex resolution: - Single-shard route (shard key predicate fully resolves to one keyspace_id).
- Scatter route (scatter-gather — predicate doesn't include shard key).
- Multi-shard route (predicate resolves to a subset of shards via a lookup Vindex).
- NestedLoopJoin — the iterative-lookup join primitive. For each row from the LHS sub-plan, issue the RHS sub-plan bound to the join key.
- Project — compute arithmetic / scalar expressions on top of a sub-plan's rows, using evalengine for expression evaluation.
- OrderedAggregate / HashAggregate — VTGate-side group-by + aggregation, for the "global" step of local-global aggregation.
- Sort / MemorySort — VTGate-side ordering when MySQL's per-shard ordering isn't sufficient.
- Filter — VTGate-side row filtering (e.g.
HAVINGclauses on aggregated results). - Limit — VTGate-side row limiting (when applied across scatter results).
The Gen4 planner¶
Vitess's current planner is Gen4, a complete rewrite that replaced the earlier v3 planner. Gen4 was introduced around Vitess 10-11 and has been incrementally stabilised across subsequent releases (Vitess 21 release notes — Source: sources/2026-04-21-planetscale-announcing-vitess-21). Gen4 supports:
- More SQL surface area (recursive CTEs — experimental in Vitess 21).
- Better cross-shard join planning (including the push-aggregation-under-join rewrite from Galindo-Legaria & Joshi, shipped as vitessio/vitess #9643).
- evalengine integration for constant folding, type-aware expression evaluation, and the Project operator's arithmetic.
The pushdown discipline¶
Every planner decision is a yes/no on push-down:
- Shard key predicate present → push down fully; Route to one shard.
- No shard key predicate, but operation is per-row (projection, filter, scalar arithmetic) → push down the scatter; let each shard do the per-row work in parallel.
- Operation requires coordination across shards (join across non-colocated shards, global aggregation, cross-shard sort) → split the operation; push down what's decomposable (local aggregation, per-shard sort); execute the cross-shard coordination step at VTGate.
- Operation requires information VTGate has but MySQL doesn't (Vindex routing decisions, cross-keyspace joins) → execute at VTGate.
Key planner rewrites¶
- Local-global aggregation decomposition — split
COUNT/SUM/MIN/MAX/AVGinto per-shard local + VTGate global. - Push aggregation under join — when
GROUP BY+ aggregate is over a cross-shard join, pre-aggregate each side and carry a multiplier column to preserve algebraic equivalence. - Consistent Lookup Vindex rewrites — for DMLs on tables with secondary Vindexes, emit the three-connection
Pre/Main/Postprotocol (Source: sources/2026-04-21-planetscale-achieving-data-consistency-with-the-consistent-lookup-vindex). - Predicate pushdown — move
WHEREclauses as far down the plan tree as possible. - Projection pushdown — select only the columns each sub-plan needs; avoid shipping columns to VTGate that'll be projected away.
Relation to MySQL's own planner¶
VTGate's planner and MySQL's planner operate at different levels:
- MySQL's planner optimises within a single shard's data — index selection, join order,
WHEREclause evaluation order. - VTGate's planner optimises across shards — which operators go where, how to split aggregations, whether to push aggregation under a join.
A query sent to VTGate is planned twice: once by VTGate (to decide the cross-shard plan tree) and once by MySQL (to decide the per-shard execution of each Route's query). The two planners are decoupled and don't share statistics — VTGate presumably uses schema information + heuristic rules, while MySQL uses its cost-based optimiser on shard-local statistics.
Why cross-shard query planning is harder than single-node¶
- Cost model is non-uniform: local operations are microseconds; cross-shard network hops are milliseconds; the ratio is ~1000×. Planner decisions about where to execute operators dwarf local-optimiser decisions.
- Statistics are fragmented: each shard has its own statistics; no global statistics exist without additional machinery.
- Data movement dominates: planning is often about minimising data shipped between shards and VTGate, not about CPU minimisation.
- Correctness under pushdown is non-trivial: as the aggregation-pushdown case shows, naive pushdown produces wrong answers; each rewrite requires algebraic justification.
Seen in¶
- sources/2026-04-21-planetscale-grouping-and-aggregations-on-vitess — canonical wiki introduction. Andres Taylor (2022-06-24) canonicalises the planner's role in splitting aggregation across shards, its pushdown discipline ("the planner tries pushing as much work down to MySQL as possible"), and the concrete aggregation-pushdown-under-join rewrite.
- sources/2026-04-21-planetscale-achieving-data-consistency-with-the-consistent-lookup-vindex — canonicalises the planner's role in emitting the three-connection
Pre/Main/Postprotocol for DMLs on tables with secondary Vindexes. - sources/2026-04-21-planetscale-announcing-vitess-21 — Vitess 21 extends the planner with experimental recursive CTE support + deeper atomic distributed transaction integration.
- sources/2025-04-05-planetscale-faster-interpreters-in-go-catching-up-with-cpp — canonicalises the evalengine that the planner delegates expression evaluation to for Project / Filter / Having operators at VTGate.