PATTERN Cited by 1 source
Runnable-plan pipeline¶
Intent¶
Design a query-planner optimisation pipeline such that every pipeline stage — and every rewriter step within a stage — produces an executable plan. No "opaque in-flight state" on the call stack or in temporary structs; no "we're halfway through a rewrite, come back later". The runnable-plan invariant is maintained from the first AST-to-operator conversion all the way to the final executable-plan emission.
Shipped in Vitess's Gen4 query planner; canonicalised in Andrés Taylor's 2023-06-01 post (Source: sources/2026-04-21-planetscale-optimizing-query-planning-in-vitess-a-step-by-step-approach).
Context¶
Query planners traditionally accrue complexity in two places:
- Monolithic passes that rewrite large subtrees in one go, with intermediate state in local variables / call stack / temporary structs. Hard to inspect, hard to test, hard to extend because each new optimisation must reason about where in the monolithic pass to insert.
- Ad-hoc rule interleaving where rewriters are ordered by "what worked for the last bug we had", with no systematic phase boundaries. Adding a rewriter risks changing behaviour of existing plans in non-obvious ways.
Both failure modes make the planner hostile to change: new SQL-feature support, new pushdown opportunities, new cost-model tweaks all risk regressing existing queries in ways that only surface in production.
Solution¶
Invariant: every rewriter has type Tree → Tree where Tree is the
full operator-tree representation. No rewriter has access to global state;
no rewriter leaves the tree in a non-runnable state partway through its
execution. The pipeline composes rewriters using two primitives:
- Fixed-point loop within a phase — apply the phase's rewriters to the tree repeatedly until a full pass produces no changes.
- Phase ordering across phases — divide rewriters into phases; run phases sequentially; each phase loops to a fixed point before the next begins.
Between phase boundaries (and between rewriter iterations within a phase), the intermediate plan is inspectable and executable. Taylor's verbatim articulation: "every step in the optimization pipeline results in a runnable plan. This means that developers working on the planner can inspect the plan at any stage, allowing for a better understanding of the optimization process at each step."
Additional ingredient — placeholder operators. To preserve the
runnable-plan invariant for deferred planning decisions, the pipeline
uses placeholder operators that encapsulate "work not yet decomposed"
but are themselves runnable. Canonical example: the
Horizon operator, which bundles SELECT +
ORDER BY + GROUP BY + LIMIT + aggregations into one node that can
either be pushed wholesale to MySQL or expanded into its constituent
operators. Both pre- and post-expansion trees are runnable.
Additional ingredient — local tree transformations. Each rewriter takes two adjacent operators (parent + child) as input and produces a subtree that replaces the pair. Taylor: "Each step is also simpler — it's a tree transformation taking two operators as input, one being the input of the other, and producing a new subtree that replaces the two inputs." This narrow arity makes rewriter composition tractable: each rewriter's precondition is local, its effect is local, and conflicts between rewriters manifest as two rewriters firing on overlapping subtrees — easy to detect and resolve via phase gating.
Structure¶
┌─ Parse ───────────┐
│ SQL → AST │
└──────┬────────────┘
│
▼
┌─ Join ordering ───┐
│ Enumerate + cost │ ← (externally costed; not a rewriter pass)
└──────┬────────────┘
│
▼
┌─ Horizon planning ◀─┐ ← each iteration: runnable plan
│ Rewriters: │
│ • push-Horizon │
│ • expand-Horizon │
│ • push-Projection │
│ • push-Ordering │
│ • split-agg │
│ • push-agg-join │ (fixed-point loop)
│ • ... │
└──────┬──────────────┘
│
▼
┌─ Offset planning ─┐ ← resolve symbolic refs → positional offsets
└──────┬────────────┘
│
▼
┌─ Executable plan ─┐
└───────────────────┘
Every arrow crosses a runnable-plan boundary.
Consequences¶
Benefits¶
- Inspectable phase-by-phase evolution. Developers can print the tree
at any boundary; the article shows this in action with four progressive
snapshots of a single query (
SELECT u.foo, ue.bar FROM user u JOIN user_extra ue ON u.uid = ue.uid ORDER BY u.baz). - Differential testing of the planner itself. Run the query on the unrewritten plan; run it on the rewritten plan; rows must match. Any divergence is a planner bug. Taylor: "the possibility of running both the unoptimized plan and the optimized version and comparing their results."
- Incremental feature addition. New SQL features / pushdown opportunities are added as new rewriters in the appropriate phase. Pipeline structure doesn't change.
- Expressiveness scales naturally. Because rewriters are local and
composable, arbitrary expressions in
ORDER BY/GROUP BY/ aggregations "just work" — the expression evaluator (evalengine) handles them at VTGate if MySQL can't. Taylor: "The new query planning model allows for arbitrary expressions to be used for ordering, grouping, and aggregations." - Rewriter-interference bugs are fixable with phase ordering, not
invasive restructuring. When two rewriters interfere (canonical case:
push-ordering-under-aggregation firing before split-aggregation wedges
OrderingbetweenAggregatorandApplyJoin, blocking join-side pushdown), the fix is to gate one rewriter behind a later phase — see patterns/phase-gated-planner-rewriter.
Costs¶
- Operator-tree type complexity. The tree must support every operator (including placeholders like Horizon), and serialisation / inspection must handle every case. Calcite, Catalyst, and Gen4 all pay this cost — hundreds of operator classes over time.
- Rewriter count growth. Narrow local-rewriter arity means N rewriters instead of 1 monolithic pass. For Vitess, this is dozens of rewriters; the phase list organises them but doesn't shrink them.
- Fixed-point convergence discipline. Each rewriter must be monotonic — it must make progress on some well-founded metric so the fixed-point loop terminates. Rewriters that can undo each other's work loop forever; authors must verify termination manually (see concepts/fixed-point-tree-rewriting).
- Debugging fixed-point non-termination is harder than debugging a straight-line pass; the symptom is a hang, not a wrong answer.
Known uses¶
- Vitess Gen4 planner (2023-present) — canonical wiki reference. Horizon operator + fixed-point horizon-planning loop + offset-planning stage. Source: sources/2026-04-21-planetscale-optimizing-query-planning-in-vitess-a-step-by-step-approach.
- Apache Spark Catalyst — rule-based optimiser with the same
invariant: every rule takes a
LogicalPlanand returns aLogicalPlan, and the framework loops until fixed point. Spark's "transforms once" vs. "transforms until convergence" distinction is the explicit fixed-point switch. - Apache Calcite HepPlanner — rule engine over
RelNodetrees; same runnable-plan invariant between rule applications. - Generic compiler theory — nanopass compilers (Sarkar/Dybvig) are
the functional-programming analogue: every pass has
IR_n → IR_{n+1}type, every intermediate IR is executable.
Anti-patterns¶
- Passing partially-rewritten state via mutable arguments — breaks the invariant; intermediate "plans" are no longer runnable because they depend on caller-side state to be meaningful.
- Rewriters that need to undo a previous rewrite — forces ad-hoc rule ordering (do A before B or else) and breaks fixed-point termination. Use patterns/phase-gated-planner-rewriter instead.
- Conflating plan rewriting with execution-engine-specific annotations — mixing row-layout offsets into the rewriting phases couples the planner to the execution engine's row format. Separate out a post- rewrite stage like concepts/offset-planning.
Relationship to¶
- concepts/runnable-plan-at-every-step — the invariant this pattern embodies.
- concepts/fixed-point-tree-rewriting — convergence discipline within a phase.
- concepts/planner-phase-ordering — sequencing across phases.
- concepts/horizon-operator — placeholder-operator technique used to preserve the invariant for deferred planning decisions.
- concepts/offset-planning — the executor-specific post-rewrite stage that completes the pipeline.
- patterns/phase-gated-planner-rewriter — the companion pattern for resolving rewriter interference without breaking the invariant.
Seen in¶
- sources/2026-04-21-planetscale-optimizing-query-planning-in-vitess-a-step-by-step-approach — canonical source. Andrés Taylor's 2023-06-01 post is the canonical wiki disclosure of the new-model Vitess planner as an instance of this pattern.