SQLsmith

SQLsmith is an open-source random SQL query generator designed to find bugs in SQL database engines by emitting syntactically valid (and frequently semantically interesting) SQL that exercises a wide cross-section of the engine's features. Unlike oracle-based testers that check query results for correctness, SQLsmith is primarily a stress, crash, and assertion-violation finder: it runs queries against the engine and watches for crashes, internal errors, sanitiser hits, and other unexpected behaviour.

GitHub: github.com/anse1/sqlsmith (standalone repository; SQLancer is a separate project, hosted at github.com/sqlancer/sqlancer).

What SQLsmith generates

  • Random SELECT queries with joins, subqueries, CTEs, set operations, window functions, aggregates.
  • Type-aware expressions — generates expressions whose types match expected operator signatures.
  • Catalog-aware — uses the database's actual schema (tables, columns, types) so generated queries reference real objects.
  • Configurable feature subset — generators can be limited to features the target supports.
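The catalog-aware and type-aware ideas above can be illustrated with a small sketch. This is not SQLsmith's implementation (SQLsmith is written in C++ and has a far richer grammar); it is a toy generator that assumes an in-memory SQLite database, and the `PREDICATES` table, the `users` / `orders` schema, and all function names are hypothetical.

```python
import random
import sqlite3

# Hypothetical operator table: maps a column type to predicate templates
# that are well-typed for it. Not SQLsmith's actual grammar.
PREDICATES = {
    "INTEGER": ["{col} > {lit}", "{col} = {lit}", "{col} IS NOT NULL"],
    "TEXT": ["{col} LIKE {lit}", "{col} = {lit}", "{col} IS NULL"],
}


def load_catalog(conn):
    """Read table and column metadata from the live database."""
    catalog = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        catalog[table] = [(c[1], c[2].upper()) for c in cols]
    return catalog


def random_literal(rng, col_type):
    """Produce a literal whose type matches the column type."""
    if col_type == "INTEGER":
        return str(rng.randint(-100, 100))
    return "'" + rng.choice(["a", "b", "%a%", ""]) + "'"


def random_query(rng, catalog):
    """Build one SELECT that only references real tables and columns."""
    table = rng.choice(sorted(catalog))
    columns = catalog[table]
    # Only pick predicate columns whose type has known templates.
    typed = [(n, t) for n, t in columns if t in PREDICATES]
    name, col_type = rng.choice(typed)
    template = rng.choice(PREDICATES[col_type])
    predicate = template.format(col=name, lit=random_literal(rng, col_type))
    picked = rng.sample(columns, rng.randint(1, len(columns)))
    projected = ", ".join(n for n, _ in picked)
    return f"SELECT {projected} FROM {table} WHERE {predicate}"


def demo(seed=0, count=20):
    conn = sqlite3.connect(":memory:")
    # Illustrative schema, assumed for this sketch only.
    conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE orders (id INTEGER, user_id INTEGER, note TEXT)")
    rng = random.Random(seed)
    catalog = load_catalog(conn)
    queries = [random_query(rng, catalog) for _ in range(count)]
    for q in queries:
        conn.execute(q).fetchall()  # raises if a generated query is invalid
    return queries
```

Seeding the random number generator makes a failing query reproducible, which matters when a crash has to be reported with the statement that triggered it.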

Where SQLsmith is used

SQLsmith has been highly effective on Postgres, SQLite, and MonetDB, the engines it supports out of the box, and it has been adapted to other engines. The project reports more than a hundred bugs found across these products, many of them in upstream Postgres (planner crashes, executor bugs, failed assertions).

In the Lakebase release-gate regime

Verbatim from the systems/lakebase reliability roadmap (Source: sources/2026-05-27-databricks-how-the-lakebase-architecture-stays-resilient-to-cloud-failures):

"We utilize open source tools like SqlLancer and SqlSmith, along with similar internal tools, to verify correct Postgres behavior. While failure injection is running, we validate internal data consistency, that no committed transaction is lost, and that every component recovers to a consistent state on its own."

The pairing with SQLancer is structural: SQLancer finds logic bugs (incorrect result sets, detected by its test oracles) while SQLsmith finds crashes and assertion violations under random query loads. Run together, they cover both dimensions, the correctness of results and the robustness of the engine:

Tool     | Bug class                                       | Mechanism
SQLsmith | Crashes, internal errors, assertion violations  | Random query generation, run-and-watch
SQLancer | Logic bugs (wrong result set vs oracle)         | PQS / NoREC / TLP test oracles

Both run while the chaos / fault-injection regime is active, so the workload exercises engine behaviour during process kills, network partitions, disk wipes, and failpoint-driven errors. The combined harness validates that the database returns correct results and that the engine does not crash while faults are being injected.
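The run-and-watch mechanism can be sketched as a loop that executes each query and buckets the outcome without ever inspecting the rows. The sketch below is an illustration, not the Lakebase harness: it assumes Python's built-in `sqlite3` driver, and the bucket names and the mapping from exception classes to buckets are assumptions. A real harness would run the engine in a separate process so that a hard crash is observed rather than taking the harness down with it, and under fault injection it would also need a bucket for connections lost to an injected fault.

```python
import sqlite3


def classify(conn, query):
    """Run one query and bucket the outcome; result rows are ignored."""
    try:
        conn.execute(query).fetchall()
        return "ok"
    except sqlite3.OperationalError:
        # Ordinary SQL errors such as syntax errors or missing tables.
        # Random input is expected to trigger these (assumed mapping).
        return "rejected"
    except sqlite3.Error:
        # Any other driver error (for example internal or corruption
        # errors) is flagged for a human to look at.
        return "suspicious"


def run_and_watch(conn, queries):
    """Execute every query and group the statements by outcome."""
    report = {"ok": [], "rejected": [], "suspicious": []}
    for q in queries:
        report[classify(conn, q)].append(q)
    return report
```

Only the "suspicious" bucket is a finding; a high "rejected" rate mostly indicates that the generator is producing statements the target does not support.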

Caveats

  • No oracle. SQLsmith doesn't know what the right answer is — only that the engine crashed or violated an assertion. Logic bugs that don't crash are invisible; that's SQLancer's job.
  • Coverage is heuristic. Random generation gives broad but not exhaustive coverage; corner cases that only fire under specific query shapes can be missed.
  • Schema dependence. Test schemas are usually small; bugs that only manifest under industry-realistic schemas may need workload-specific generators.
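The "no oracle" caveat can be made concrete by putting a run-and-watch check next to a NoREC-style consistency check. This is a simplified sketch of the idea, not SQLancer's code: it assumes SQLite through Python's `sqlite3` module, and the function names are hypothetical. The first check passes whenever the query executes, whatever rows come back; the second evaluates the same predicate once in `WHERE` and once per row, and flags the engine if the two counts disagree.

```python
import sqlite3


def run_and_watch_passes(conn, table, predicate):
    """SQLsmith-style check: passes as long as the query executes."""
    conn.execute(f"SELECT * FROM {table} WHERE {predicate}").fetchall()
    return True


def norec_style_consistent(conn, table, predicate):
    """NoREC-style check: two formulations of one count must agree."""
    filtered = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {predicate}"
    ).fetchone()[0]
    # Evaluate the predicate per row instead of in WHERE, then count the
    # rows where it is true. COALESCE covers the empty-table case.
    per_row = conn.execute(
        "SELECT COALESCE(SUM(CASE WHEN " + predicate
        + f" THEN 1 ELSE 0 END), 0) FROM {table}"
    ).fetchone()[0]
    return filtered == per_row
```

On a correct engine both checks pass. An engine that silently dropped a matching row in the filtered query would still pass the first check and would fail only the second.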
