CONCEPT Cited by 1 source
False-positive / false-negative asymmetry
Definition
False-positive / false-negative asymmetry is the design property of a classification or membership system where the cost of a false positive and the cost of a false negative are structurally different. Choosing a substrate that can only be wrong in the cheaper direction then buys correctness in the expensive direction.
Concretely, for a binary oracle answering "is x in the set?":
- False positive (oracle says yes, truth says no): the caller proceeds to a slower authoritative check that correctly returns "no". Cost = one extra check per false positive. Usually bounded, predictable, and small.
- False negative (oracle says no, truth says yes): the caller treats x as genuinely absent. Cost = whatever the system does for "absent", applied to a genuinely present element. Often unbounded, user-visible, or structurally wrong.
When the two costs are dramatically asymmetric, the right substrate is one whose error mode matches the cheap direction. A Bloom filter has false positives but no false negatives — correct choice when false negatives are expensive and false positives are cheap.
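The one-sided error mode can be seen in a minimal sketch. The `BloomFilter` class below is illustrative only: the class name, the salted SHA-256 hashing scheme, the sizes, and the example paths are all assumptions made for this page, not taken from any source.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: may wrongly answer 'maybe', never wrongly answers 'no'."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, item: str):
        # Derive k positions by salting a SHA-256 hash with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True means "maybe present".
        return all(self.bits[pos] for pos in self._positions(item))


# Hypothetical data: 200 inserted paths, 2000 paths that were never inserted.
members = [f"/docs/page-{i}" for i in range(200)]
non_members = [f"/missing/{i}" for i in range(2000)]

bloom = BloomFilter(num_bits=2048, num_hashes=4)
for path in members:
    bloom.add(path)

# Inserted items are always reported as "maybe present": no false negatives.
false_negatives = [p for p in members if not bloom.might_contain(p)]

# Non-members can slip through as "maybe present": these are the false positives.
false_positives = sum(bloom.might_contain(p) for p in non_members)
```

Bits are only ever set, never cleared, so every position an inserted item hashes to stays set. That is why `false_negatives` is empty by construction, while `false_positives` depends on how full the bit array is.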
Canonical Vercel instance
Vercel's 2026-04-21 blog post surfaces the asymmetry explicitly in the build-output path lookup:
"Bloom filters can return false positives, but never false negatives. For path lookups, this property is valuable. If the Bloom filter says a path does not exist, we can safely return a 404; if it says a path might exist, we fall back to checking the build outputs."
And the cost model:
"We can't afford false negatives (returning 404 for valid pages), and Bloom filters guarantee this won't happen. False positives just trigger an extra storage request to find the file doesn't exist."
(Source: sources/2026-04-21-vercel-how-we-made-global-routing-faster-with-bloom-filters.)
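The lookup flow described in the quotes can be sketched as follows. This is not Vercel's code: `resolve_path`, `might_exist`, `fetch_from_storage`, and the example paths are hypothetical names chosen for illustration, and the filter is replaced by a stand-in with one deliberate false positive.

```python
from typing import Callable, List, Optional, Tuple


def resolve_path(
    path: str,
    might_exist: Callable[[str], bool],
    fetch_from_storage: Callable[[str], Optional[bytes]],
) -> Tuple[int, Optional[bytes]]:
    """Two-stage lookup: cheap filter first, authoritative storage only on 'maybe'."""
    if not might_exist(path):
        # The filter has no false negatives, so "no" is safe to turn into a 404.
        return 404, None
    body = fetch_from_storage(path)
    if body is None:
        # False positive: the extra fetch confirms the path does not exist.
        return 404, None
    return 200, body


# Hypothetical build outputs and a log of which paths reached storage.
build_outputs = {"/index.html": b"<h1>home</h1>"}
storage_calls: List[str] = []


def fetch_from_storage(path: str) -> Optional[bytes]:
    storage_calls.append(path)
    return build_outputs.get(path)


def might_exist(path: str) -> bool:
    # Stand-in for a Bloom filter: never wrong on "no", wrong on one "maybe".
    return path in build_outputs or path == "/ghost.html"


hit = resolve_path("/index.html", might_exist, fetch_from_storage)
miss = resolve_path("/nope.html", might_exist, fetch_from_storage)
ghost = resolve_path("/ghost.html", might_exist, fetch_from_storage)
```

Only `/index.html` and `/ghost.html` reach storage. The definite "no" for `/nope.html` never leaves the filter, and the false positive for `/ghost.html` still ends in a correct 404.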
The cost ratio:
- False positive → one extra (successful) storage fetch that confirms the 404. Routine operation; milliseconds of added latency for one request.
- False negative → an indexed, linked, user-requested real page returns 404. SEO damage, broken links, user trust erosion, availability incident.
The asymmetry spans orders of magnitude, which is why a Bloom filter is the right substrate here: giving up the exact JSON tree's always-definitive answer costs only an occasional fallback fetch, and in return the lookup gets the Bloom filter's parse-time latency win.
The design move
Once the asymmetry is named:
- Classify the failure modes of the current authoritative structure: where are its false positives vs false negatives? If it's exact, what's the cost model of its queries?
- Find the cost-dominant path — is the common case (negative lookups, the "no" answer) fast enough? Or is the slow "no" blocking a hot path?
- Substitute a probabilistic fast-negative whose error mode lies in the cheap direction. Compose with a fallback authoritative check on "maybe".
- Size the filter for its false-positive cost budget: choose the false-positive rate p so that FP-caused fallback lookups are a small fraction of total request traffic.
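The sizing step can be sketched with the standard Bloom filter approximations, m = -n ln p / (ln 2)^2 bits and k = (m / n) ln 2 hash functions. The function names and the example numbers (one million paths, a 1% target rate, 20% of requests being true negatives) are assumptions for illustration, not figures from the source.

```python
import math
from typing import Tuple


def size_bloom_filter(expected_items: int, target_fp_rate: float) -> Tuple[int, int]:
    """Return (num_bits, num_hashes) for a target false-positive rate."""
    if expected_items <= 0 or not 0.0 < target_fp_rate < 1.0:
        raise ValueError("expected_items must be > 0 and 0 < target_fp_rate < 1")
    # m = -n * ln(p) / (ln 2)^2
    num_bits = math.ceil(-expected_items * math.log(target_fp_rate) / (math.log(2) ** 2))
    # k = (m / n) * ln 2, rounded to a whole number of hash functions
    num_hashes = max(1, round(num_bits / expected_items * math.log(2)))
    return num_bits, num_hashes


def fallback_fraction(target_fp_rate: float, negative_share: float) -> float:
    """Share of all requests that reach the fallback only because of a false positive.

    negative_share is the fraction of requests whose key is truly absent;
    only those requests can produce a false positive.
    """
    return target_fp_rate * negative_share


# Hypothetical workload: 1,000,000 paths, p = 1%, 20% of requests are true negatives.
num_bits, num_hashes = size_bloom_filter(1_000_000, 0.01)
extra_load = fallback_fraction(0.01, 0.20)
```

Under these assumed numbers the filter needs roughly 9.6 bits per item and 7 hash functions, and false positives add about 0.2% of total traffic to the fallback path. Whether that fits the budget depends on what one fallback lookup costs.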
Examples in the corpus
| System | Cheap direction | Expensive direction | Substrate |
|---|---|---|---|
| Vercel routing | false positive (extra storage fetch → 404) | false negative (wrongly 404 a valid page) | concepts/bloom-filter |
| Chrome malicious-URL filter | false positive (extra server check) | false negative (visit a known-bad URL) | concepts/bloom-filter |
| Column-store pruning | false positive (scan one more block) | false negative (miss matching rows) | concepts/bloom-filter + zone maps |
| Fraud detection | false positive (review a legit tx) | false negative (approve a fraud tx) | ML classifier, tuned precision/recall |
| Spam detection | false positive (quarantine a real email) | false negative (deliver spam) | depends on cost model |
| Medical screening | false positive (extra test, stress) | false negative (missed diagnosis) | depends; often symmetric |
| Content moderation | false positive (block legit content) | false negative (miss harmful content) | depends; often symmetric |
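The first three have strongly asymmetric costs and admit simple probabilistic solutions. Fraud detection is also asymmetric, but reviewing a legitimate transaction has a real cost, so it is handled by tuning a classifier's precision/recall trade-off rather than by a structure that rules out one error mode. The bottom three have contested cost models, where false positives aren't cheap, and so can't be reduced to a single-shape decision.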
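For the classifier rows of the table, the cost asymmetry shows up as a decision threshold rather than as a choice of data structure. A minimal sketch, assuming the classifier outputs a calibrated probability and that each error has a known, constant cost: flagging is worthwhile when the expected false-negative cost exceeds the expected false-positive cost, which gives a threshold of c_FP / (c_FP + c_FN). The function names and cost figures are hypothetical.

```python
def flag_threshold(fp_cost: float, fn_cost: float) -> float:
    """Probability above which flagging has the lower expected cost.

    Flag when p * fn_cost > (1 - p) * fp_cost, i.e. p > fp_cost / (fp_cost + fn_cost).
    Assumes p is a calibrated probability that the item is truly positive.
    """
    if fp_cost <= 0 or fn_cost <= 0:
        raise ValueError("costs must be positive")
    return fp_cost / (fp_cost + fn_cost)


def should_flag(probability: float, fp_cost: float, fn_cost: float) -> bool:
    return probability > flag_threshold(fp_cost, fn_cost)


# Hypothetical costs: a manual review costs 5 units, an approved fraud costs 500.
asymmetric = flag_threshold(fp_cost=5.0, fn_cost=500.0)

# With equal costs the threshold falls back to the familiar 0.5.
symmetric = flag_threshold(fp_cost=1.0, fn_cost=1.0)

# A transaction scored at 2% is flagged under the asymmetric costs only.
flag_when_asymmetric = should_flag(0.02, fp_cost=5.0, fn_cost=500.0)
flag_when_symmetric = should_flag(0.02, fp_cost=1.0, fn_cost=1.0)
```

When the costs are contested, as in the bottom rows of the table, the inputs to `flag_threshold` are themselves the disputed quantity, which is why those systems do not collapse to a single-shape decision.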
Distinct from false-positive management
concepts/false-positive-management names the operational discipline of keeping false positives tolerable when they're the error mode you accepted: allowlisting, triage workflows, measuring FP rate. This concept names the design-time choice of which error mode to accept in the first place.
The two compose: you pick the substrate whose error mode matches your cost asymmetry; then you run false-positive management on the error mode you accepted.
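Measuring the accepted error mode is cheap in a two-stage design, because every false positive is revealed by the fallback check. The sketch below is a hypothetical monitor (the class and field names are assumptions) that relies on the filter having no false negatives, so every "no" answer counts as a true negative.

```python
from dataclasses import dataclass


@dataclass
class FalsePositiveMonitor:
    """Tracks the observed false-positive rate of a no-false-negative filter."""

    definite_no: int = 0  # filter said "no": a true negative by construction
    maybe_hit: int = 0    # filter said "maybe" and the fallback found the item
    maybe_miss: int = 0   # filter said "maybe" but the fallback found nothing

    def record(self, filter_said_maybe: bool, fallback_found: bool = False) -> None:
        if not filter_said_maybe:
            self.definite_no += 1
        elif fallback_found:
            self.maybe_hit += 1
        else:
            self.maybe_miss += 1

    def observed_fp_rate(self) -> float:
        # False positives divided by all truly absent lookups.
        negatives = self.definite_no + self.maybe_miss
        return self.maybe_miss / negatives if negatives else 0.0


# Hypothetical traffic: 97 definite "no", 3 false positives, 50 real hits.
monitor = FalsePositiveMonitor()
for _ in range(97):
    monitor.record(filter_said_maybe=False)
for _ in range(3):
    monitor.record(filter_said_maybe=True, fallback_found=False)
for _ in range(50):
    monitor.record(filter_said_maybe=True, fallback_found=True)

rate = monitor.observed_fp_rate()
```

Comparing `rate` against the rate the filter was sized for is one way to notice that the set has outgrown the filter.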
Anti-patterns
- Treating the two errors as symmetric when they aren't. Leads to over-engineered exact structures (the Vercel JSON tree) or over-eager probabilistic filters (Chrome's early phishing-URL filter was too coarse, caused legitimate-site FPs, required allowlist expansion).
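- Choosing a probabilistic structure whose error mode is the wrong direction. A counting Bloom filter permits false positives and, once deletion is allowed, false negatives as well: removing an element that was never inserted (or any other decrement bug) can zero a counter that a genuinely present element depends on. That makes it inappropriate for 404 filters.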
- Not sizing p to the fallback cost budget. A 1% false-positive rate might be fine for a disk-cache filter (40 extra disk seeks per 4,000 queries) but disastrous for a fraud-detection filter (400 false-fraud review flags per 40,000 transactions).
- Conflating error mode with error magnitude. A structure with higher precision but symmetric error isn't strictly better than one with lower precision but asymmetric error aligned with cost.
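The counting Bloom filter anti-pattern can be reproduced in a few lines. This is a toy sketch with deliberately tiny, assumed parameters (16 counters, 2 hashes) and hypothetical paths. It removes a key that was never added but shares a counter with a real one.

```python
import hashlib
from typing import Set


class CountingBloomFilter:
    """Toy counting Bloom filter that supports deletion."""

    def __init__(self, num_counters: int, num_hashes: int) -> None:
        self.counters = [0] * num_counters
        self.num_hashes = num_hashes

    def _slots(self, item: str) -> Set[int]:
        return {
            int.from_bytes(hashlib.sha256(f"{i}:{item}".encode()).digest()[:8], "big")
            % len(self.counters)
            for i in range(self.num_hashes)
        }

    def add(self, item: str) -> None:
        for slot in self._slots(item):
            self.counters[slot] += 1

    def remove(self, item: str) -> None:
        # Nothing here can verify that `item` was ever added.
        for slot in self._slots(item):
            if self.counters[slot] > 0:
                self.counters[slot] -= 1

    def might_contain(self, item: str) -> bool:
        return all(self.counters[slot] > 0 for slot in self._slots(item))


cbf = CountingBloomFilter(num_counters=16, num_hashes=2)
cbf.add("/pricing")
present_before = cbf.might_contain("/pricing")

# Search for a never-added path that shares at least one counter with "/pricing".
ghost = next(
    f"/ghost-{i}"
    for i in range(10_000)
    if cbf._slots(f"/ghost-{i}") & cbf._slots("/pricing")
)

cbf.remove(ghost)  # erroneous delete of a key that was never inserted
present_after = cbf.might_contain("/pricing")
```

After the stray `remove`, one of the counters backing `/pricing` drops to zero and the filter reports a genuinely present path as absent. In a 404 filter that is exactly the expensive error the design was meant to rule out.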
Seen in
- sources/2026-04-21-vercel-how-we-made-global-routing-faster-with-bloom-filters: Canonical wiki introduction. Vercel's routing-service Bloom-filter substitution states the asymmetry verbatim: false negative = "return 404 for valid pages" (unbounded damage), false positive = "extra storage request to find the file doesn't exist" (bounded, roughly one storage round trip).
- concepts/bloom-filter: The canonical data structure whose error mode is false-positive-only; the Vercel case is the canonical design application.
- concepts/false-positive-management: Operational companion: manage the rate of whichever error mode the substrate commits to.
- patterns/two-stage-evaluation: Composes with this asymmetry: stage 1 is the probabilistic oracle whose cheap-direction error is tolerated; stage 2 is the authoritative fallback invoked only on "maybe".