Skip to content

PLANETSCALE 2024-04-09 Tier 3

Read original ↗

PlanetScale — Summer 2023: Fuzzing Vitess at PlanetScale

Summary

Intern retrospective by Arvind Murty (PlanetScale / Vitess, published 2024-04-09, covering work done May–July 2023 under Andrés Taylor) on building a random-query fuzzer for the VTGate query planner. The post is the first wiki ingest canonicalising Vitess's planner-level fuzzing strategy — companion to the evalengine fuzzer (AST-vs-VM-vs-MySQL three-way oracle, canonicalised by sources/2025-04-05-planetscale-faster-interpreters-in-go-catching-up-with-cpp) but operating at a different altitude: evalengine fuzzing tests scalar expression evaluation, planner fuzzing tests whole-query planning + execution against sharded storage. Core architectural disclosures: (1) SQLancer is wrong substrate for MySQL-compatible DBMSes"Vitess ideally should perfectly mimic MySQL, quirks included. SQLancer on the other hand compares queries to an oracle, which determines if queries are logically correct." The distinction is load- bearing: SQLancer's three techniques (PQS, NoREC, TLP) detect logic bugs by testing whether the DBMS returns mathematically correct results; a MySQL-compatible engine like Vitess must instead return results identical to MySQL's, quirks and all. (2) VSchema adds a planner- layer axis SQLancer can't exercise"Vitess has the added layer of the VSchema. The VSchema has many added considerations, such as sharding keys, which changes how Vitess plans the query." Sharding strategy is a first-class planner input, so any fuzzer must generate queries against a known VSchema and route the oracle (MySQL) against an equivalent unsharded schema. (3) Bespoke random query generator wins on low-hanging fruit — "We decided to go for the low-hanging fruit and build our own random query generator. Which turned out to not be that low-hanging since it yielded a bunch of failing queries." Shipped as vitessio/vitess #13260 on top of Andrés Taylor's earlier aggregate/GROUP BY/ ORDER BY/LIMIT fuzzer. (4) Differential oracle: "Once the query is generated, it's run on both Vitess and MySQL, and the results and errors are compared. If there is a mismatch, it is reported." This is the canonical MySQL- compatible differential fuzzing shape — target DBMS vs reference DBMS, identical schema, query passed through both. (5) Query simplifier as AST-rewriting delta-debugger — Andrés Taylor's prior query simplifier (see Andres' 2022 post) is a brute-force AST-node-removal loop: "It uses a brute- force approach, removing or modifying nodes in the AST and checking if the new, simpler query still exhibits the same error. If it does, the simplifier is called on the new query." Murty's contribution (vitessio/vitess #13636) adapted it to supply VSchema information for end-to-end tests and refactored. (6) Random-expression generator semantic constraints — the generator had to learn where aggregations are legal in a SQL parse tree: "Because aggregation can only exist in the SELECT statement or the GROUP BY, ORDER BY, and HAVING clauses, I had to make sure the generator only produced aggregations for the statements and clauses in which they are allowed." Murty also added derived tables (subqueries as tables in the FROM clause) and columns from derived tables to the expression generator's column pool.

Key takeaways

  • Fuzzer oracle = reference DBMS, not logical-correctness oracle"Vitess ideally should perfectly mimic MySQL, quirks included." SQLancer-style logic-bug testing is the wrong substrate for MySQL- compatible engines; the oracle must be MySQL itself. Canonicalised as concepts/sqlancer-logic-bug + patterns/mysql-compatible-differential-fuzzing.
  • Planner-layer fuzzing is a different altitude from expression-evaluator fuzzing. The [[systems/vitess- evalengine|evalengine]] fuzzer tests scalar expression evaluation (AST interpreter vs VM vs MySQL C++ engine); this fuzzer tests whole-query planning + execution (Vitess plan-then-execute vs MySQL direct execute) against sharded tables. Both are load-bearing; the combination canonicalises a two-altitude Vitess fuzzing posture.
  • VSchema is a planner-layer axis. Vitess's sharding metadata (sharding keys, vindexes, tablet routing) is a first-class input to the planner. The fuzzer must (a) generate queries against a known VSchema and (b) route the oracle (MySQL) against an equivalent unsharded schema so that mismatches indicate planner bugs, not schema divergences.
  • Query simplifier is a brute-force AST-node-removal delta-debugger. "Removing or modifying nodes in the AST and checking if the new, simpler query still exhibits the same error" is the delta-debugging technique applied to SQL ASTs. Originally designed for unit tests with known schemas; Murty extended it to Vitess end-to-end tests by threading VSchema through. Canonicalised as concepts/query-simplifier.
  • Random SQL expression generation has semantic-role constraints. Aggregations can appear in SELECT, GROUP BY, ORDER BY, HAVING but not WHERE or other expression contexts. Derived-table column generation requires tracking the derived table's schema. These constraints mean a random expression generator is not a pure context-free random-tree walker — it must be semantically aware of where each AST node type is legal.
  • Test corpus is EMP/DEPT sharded on EMPNO/DEPTNO. Industry-standard Oracle-era employee/department sample schema, sharded by the natural integer primary keys. Future work called out: randomise schema + VSchema to surface planner bugs specific to richer sharding layouts (non-integer vindexes, composite vindexes, lookup vindexes).
  • Operational flag testFailingQueries documents known-failing query shapes the fuzzer skips. "With the query planner being improved since I completed my work on the fuzzer, this flag can either be deleted altogether, or at the very least be removed from many spots." — typical fuzzer-hygiene debt: known-failure skip lists calcify as the target evolves.

Systems mentioned

  • systems/vitess — the target DBMS under fuzz. VTGate query planner is the component being exercised; systems/vitess-evalengine is the sibling fuzzer's target (scalar expressions, not whole plans).
  • systems/mysql — the reference DBMS + differential- oracle anchor. "It's run on both Vitess and MySQL, and the results and errors are compared."
  • systems/planetscale — the internship host and the Vitess-management vendor whose blog published the post.
  • systems/sqlancer — canonical DBMS-logic-bug fuzzer (SQLancer GitHub) ruled out for Vitess on two grounds: oracle-based logic-correctness testing vs. MySQL-bug-for-bug compatibility, and VSchema sharding not modelled. Minimal-viable system page added for cross-ref anchoring.

Concepts extracted

  • concepts/query-simplifier — AST-node-removal delta- debugging for SQL queries: brute-force remove or modify nodes, check if error persists, recurse. Originally built for Vitess planner unit tests; extended by Murty for end-to-end tests via VSchema injection.
  • concepts/sqlancer-logic-bug — the SQLancer-style bug class: a query returns logically incorrect results (missing rows, wrong aggregates) relative to a mathematical oracle (PQS, NoREC, TLP). Distinct from MySQL-compatibility bugs where the target DBMS returns results that differ from MySQL's (quirks-and-all) even if MySQL's own result is arguably logically incorrect.
  • concepts/vtgate-query-plannerextended: new Seen-in canonicalising whole-query planner fuzzing as correctness-probing mechanism; the 2023 summer fuzzer is the first wiki ingest disclosing planner-altitude random-query differential testing.
  • concepts/abstract-syntax-treeextended: new Seen-in canonicalising AST-node-removal as the delta- debugging primitive beneath concepts/query-simplifier.

Patterns extracted

  • patterns/mysql-compatible-differential-fuzzing — generate random queries, run on target DBMS and reference DBMS (MySQL) against equivalent schemas, compare results and errors byte-for-byte; any mismatch is a target-DBMS bug (or, rarely, a MySQL bug worth upstreaming). Distinct from patterns/fuzz-ast-vs-vm-oracle (two in-process interpreters of the same language) and from SQLancer- style logic-bug testing (target DBMS vs. mathematical oracle).
  • patterns/fuzz-ast-vs-vm-oraclecross-ref only: evalengine's sibling fuzzer uses this shape at scalar- expression altitude with MySQL as a third oracle.
  • patterns/ast-based-query-generationcross-ref only: shares the parse-tree-IR shape with the fuzzer's random-query emitter, but goes query-document → AST → backend-query, whereas this fuzzer goes (schema + VSchema) → AST → serialised SQL. Distinct shapes, same substrate.

Operational numbers

Caveats

  • Intern-retrospective voice, not a deep architectural post. Author wrote the post ~9 months after leaving the internship; the post is narrative-first, not a spec. No benchmark numbers (queries fuzzed per hour, bugs found, bug-severity breakdown, coverage metrics).
  • No enumeration of specific bugs found. The post says "it yielded a bunch of failing queries" without naming the shapes. The PRs linked give some signal but the retrospective elides the bug-taxonomy detail that would make this a load-bearing architectural post.
  • Short post (~1,200 words) — borderline Tier-3 scope case. Architecture density ~55% (decent for the length). Passes on the substance of the SQLancer-is-wrong- substrate framing + VSchema-as-planner-axis disclosure + query-simplifier-as-AST-delta-debugger canonicalisation — three specific architectural claims the wiki did not previously have canonical homes for.
  • SQLancer critique is incomplete. SQLancer does have a TiDB dialect and a MySQL dialect; the claim "it would take a lot of work to properly integrate Vitess with SQLancer" is specifically about Vitess-as-new-DBMS-tester, not about SQLancer's DBMS coverage broadly. The VSchema-not-modelled critique is load-bearing and stands.
  • EMP/DEPT-only corpus is explicitly acknowledged as a weakness: "I think the first step that should be taken is complicating or randomizing the schema and VSchema." The fuzzer as shipped only exercises one sharding layout.
  • testFailingQueries flag calcification — typical fuzzer-maintenance debt. Author calls for removal but the post doesn't disclose how many query shapes the flag currently skips.
  • No comparison to evalengine fuzzer — the post doesn't reference the sibling expression-evaluator fuzzer Andrés Taylor and Vicent Martí built in the same era. The two altitudes are complementary; the wiki canonicalises the composition via this ingest.
  • Author attribution: "Thank you to Andrés for spearheading this project and mentoring me along the way. Thanks to Harshit, Florent, and Manan." — Vitess query- serving team (Andrés Taylor, Harshit Gangal, Florent Poinsard, Manan Gupta) is the organisational substrate.

Source

Last updated · 470 distilled / 1,213 read