Skip to content

PATTERN Cited by 1 source

MySQL-compatible differential fuzzing

Problem

You're building a SQL engine that must be bug-for-bug compatible with MySQL — a MySQL-protocol proxy, a MySQL-wire-compatible clone, or a sharding layer like Vitess that sits in front of real MySQL shards. Your engine has its own query planner, executor, and expression evaluator, but clients expect their SQL to return exactly the same result MySQL would — including MySQL's quirks, edge cases, and questionable coercions.

How do you continuously probe correctness across the hundreds of thousands of SQL query shapes real workloads emit, including the shapes your regression suite never thought to include?

Why logic-bug fuzzers don't work

SQLancer and similar logic-bug fuzzers use a mathematical oracle — PQS, NoREC, TLP, etc. — to detect queries whose results are logically incorrect per the SQL standard. This approach works great on standards-compliant DBMSes but fails on MySQL-compatible engines for two reasons:

  1. MySQL has quirks. A MySQL-compatible engine that correctly mimics a MySQL quirk will be flagged as a logic bug; an engine that doesn't mimic the quirk will be flagged as correct even though it fails the compatibility contract its users care about. The oracle is wrong.
  2. Sharding changes the query. If your engine shards tables across backends, the oracle must account for sharding layout. SQLancer doesn't model sharding, so sharding-specific planner bugs go undetected.

See concepts/sqlancer-logic-bug for the full taxonomy.

Solution: use MySQL itself as the oracle

Fuzz shape:

  1. Known fixture schema. Define a fixed schema (tables
  2. columns + types) and, for sharded engines, a fixed VSchema (shard keys, vindexes). Load equivalent data into the target engine and into a reference MySQL instance.
  3. Random query generator. Produce random queries against the schema. The generator walks a query template grammar emitting random clauses (SELECT, FROM, WHERE, GROUP BY, ORDER BY, HAVING, LIMIT) with random tables, columns, predicates, expressions, joins, derived tables. Aggregations are constrained to their semantically-legal positions (SELECT, GROUP BY, ORDER BY, HAVING — not WHERE or other expression contexts).
  4. Differential execution. Send each generated query to both the target engine and MySQL. Collect results and any errors.
  5. Byte-for-byte comparison. Compare row sets, column types, ordering (when the query specifies it), and error classes. Any divergence is a candidate bug.
  6. Minimal reproducer. Feed divergent queries to a query simplifier that reduces the query to its minimal form preserving the divergence. File / triage the minimal repro.

Canonical instance: Vitess planner fuzzer

Arvind Murty's summer 2023 PlanetScale internship built this pattern for Vitess (vitessio/vitess #13260):

"From a given set of tables, the fuzzer randomly selects a multiset of the tables, then chooses a random multiset of columns to provide to the clauses (SELECT, GROUP BY, WHERE, etc.) and the random expression generator. Once the query is generated, it's run on both Vitess and MySQL, and the results and errors are compared. If there is a mismatch, it is reported." (Source: sources/2026-04-21-planetscale-summer-2023-fuzzing-vitess-at-planetscale)

Fixture schema: EMP (employee) + DEPT (department) tables from the Oracle-era sample schema. Sharded in Vitess by EMPNO / DEPTNO; unsharded in the reference MySQL instance so the target result set is identical.

The fuzzer pairs with Andrés Taylor's pre-existing query simplifier to produce minimal repros for every divergence found.

Distinct from adjacent patterns

vs. patterns/fuzz-ast-vs-vm-oracle

Both patterns differentially fuzz two implementations of the same SQL semantics. The AST-vs-VM pattern uses two in-process interpreters of the same language — the old tree-walking interpreter and the new bytecode VM in Vitess's evalengine. This pattern uses two separate DBMSes — Vitess's full query-planning-and-execution pipeline vs. MySQL's.

Altitude differs: AST-vs-VM tests scalar expression evaluation (does SUBSTRING('foo', -10) return the same thing on both?); MySQL-compat testing tests whole-query planning + execution (does SELECT … GROUP BY … ORDER BY … LIMIT … with this schema and these joins return the same rows?).

Vitess runs both fuzzers simultaneously at different layers — evalengine's AST-vs-VM oracle for expression-level semantics, planner fuzzer's Vitess-vs-MySQL oracle for whole-query behaviour.

vs. SQLancer logic-bug oracle

SQLancer's oracle is mathematical — derived from the SQL standard's semantics. This pattern's oracle is operational — derived from what reference MySQL actually does. A standards-compliant bug in reference MySQL propagates into the oracle here but would be flagged by SQLancer. Converse: a MySQL quirk the target fails to replicate is invisible to SQLancer but caught here.

vs. patterns/ast-based-query-generation

AST-based-query-generation is the user-input-to-backend- query direction (parse a user's query DSL, lower to AST, emit a backend query). This pattern goes the other direction: random-AST-to-serialised-SQL (generate a random AST against a known schema, serialise to SQL, send to both engines). Same AST substrate, opposite directionality.

When this pattern applies

  • The target engine is protocol-compatible with an existing reference DBMS (MySQL, Postgres, DynamoDB, Redis) and the compatibility contract requires bug-for-bug equivalence.
  • The target engine has its own query planner / executor / expression evaluator independent from the reference DBMS.
  • Regression tests alone are insufficient to cover the real-workload query surface.

Examples beyond Vitess: - Postgres-wire-compatible engines (CockroachDB, Materialize, Yugabyte) vs. reference Postgres. - DynamoDB clones (LocalStack DynamoDB Local) vs. real DynamoDB. Here the "reference" is a cloud service, so the differential must be recorded rather than live. - Redis-protocol clones (KeyDB, DragonflyDB) vs. reference Redis.

Requirements

  • Fixed equivalent schemas on both engines; for sharded targets, an unsharded schema on the reference.
  • Query generator that respects SQL grammar and semantic-role constraints (aggregation only in legal positions, etc.).
  • Deterministic comparison logic for result sets: row ordering rules, NULL handling, column type coercion rules.
  • Error-class equivalence: both engines errored / both succeeded / both returned same rows is OK; any other shape is a bug.
  • A query simplifier to convert noisy fuzzer-output repros into minimal filable bugs.

Seen in

Last updated · 470 distilled / 1,213 read