CONCEPT Cited by 1 source
Test Case Minimization¶
Definition¶
Test case minimization (also: shrinking) is the step in a property-based testing workflow where, once a random input causes a test failure, the framework iteratively reduces the input — by removing elements, lowering numeric values, simplifying structures — and re-runs, keeping the input simple enough to still reproduce the failure. The result is a minimally-complex reproducer that isolates the bug from the random noise it was buried in.
Named and originated by QuickCheck (Haskell, Claessen & Hughes, 2000); now a standard feature of modern property-based testing libraries (Hypothesis, proptest, fast-check, test.check).
Canonical motivation: a random-generated test that fails on an input of "37 nested nodes across three trees with a specific move+rename interaction" becomes — after minimization — "3 nodes: a local move combined with a remote move creating a cycle." The latter is a diagnosable bug report; the former is a grep through a log file.
Why this matters¶
Tracking dozens of nodes across three trees is simply not easy for a human to do, as the real bug is often hidden alongside many red herrings. This minimization removes such distractions, often making it much easier to diagnose at a glance.
Minimization is the feature that turns property-based testing from a "bug-finding machine with painful output" into a "bug-finding machine with diagnosable output." Its absence is what has historically prevented end-to-end fuzzing/stochastic testing from displacing hand-written unit tests: a 3-hour flaky-test failure with a 500-line log is worse than no test at all, because engineers learn to ignore it.
The shrinking procedure¶
The exact procedure depends on input shape, but the general form:
- Test failed on input
X. Save (seed, input). - Generate a "smaller" candidate
X'— remove one element, halve a numeric range, drop one branch of a tree, etc. - Re-run the test with
X'. If it still fails,X := X', go to step 2. If it passes, try a different smaller candidate. - Stop when no smaller candidate reproduces the failure.
The result X is a local minimum — often, but not always, a
global minimum.
Canonical realization: CanopyCheck (Dropbox)¶
systems/canopycheck's inputs are three trees. When a random seed fails an invariant:
- Iteratively remove nodes from the initial trees.
- Re-run (same planner, same seed-driven operation ordering).
- Keep shrinking if the failure persists.
The simple-typed input shape (three trees, nothing else) is what makes this work — every candidate "smaller" input is obviously a valid test case.
Why it fails end-to-end (Trinity)¶
systems/trinity — Dropbox's end-to-end concurrency-testing framework — cannot easily minimize, and the post names this as an explicit limitation:
Even a small change to the three trees' initial state might change the order of network request scheduling down the line, and thereby invalidate a failing random seed — hard-won at the cost of many CPU-days!
The problem: Trinity's coverage depends on mocks that inject failures and reorder RPCs throughout execution. The scheduling of those failures is a function of the PRNG state, which is a function of every decision made so far. Perturbing the initial state changes the chain of PRNG consumption, which changes the schedule, which almost always makes the failure go away — but not because the bug is fixed, just because a different schedule is being tested.
Candidate fix noted in the post: "decouple the global PRNG into several independent ones." A per-subsystem PRNG (one for filesystem faults, one for network reordering, one for time, one for input generation) would let you perturb the initial-state PRNG without invalidating the schedule PRNG, making shrinking on input feasible again. As of the post, this was being considered; no outcome captured.
Seen in¶
- sources/2024-05-31-dropbox-testing-sync-at-dropbox-2020 — canonical industry reference for why minimization works great on narrow property tests (CanopyCheck) and doesn't scale to end-to-end deterministic simulation (Trinity), with the per-subsystem-PRNG escape hatch.