FBDetect
Definition
FBDetect is Meta's in-house performance-regression-detection tool that can catch regressions as small as 0.005 % in noisy production environments and surfaces "thousands of regressions weekly" across Meta's fleet (Source: sources/2026-04-16-meta-capacity-efficiency-at-meta-how-unified-ai-agents-optimize-performance-at-hyperscale). It is the defense arm of Meta's Capacity Efficiency program — the system that decides "a regression happened" and attributes it to a root-cause pull request, handing off to downstream resolution (traditionally a human; since 2026 also the AI Regression Solver).
Architectural detail lives primarily in the SOSP 2024 paper: tangchq74.github.io/FBDetect-SOSP24.pdf.
What's disclosed in the 2026-04-16 Engineering post
- Regression sensitivity: "as small as 0.005 % in noisy production environments." At Meta's 3 B+ user scale, 0.1 % regressions already translate to "significant additional power consumption," so micro-regression sensitivity is load-bearing.
- Throughput: "thousands of regressions weekly."
- Input domain: time-series data from production.
- Root-cause attribution: "primarily traditional techniques such as correlating regression functions with recent pull requests." Hand-rolled correlation, not ML — the ML component sits downstream in the AI Regression Solver, not in the detector itself.
- Output contract: a regression event + a candidate root-cause PR → engineer notification (traditionally) or the AI Regression Solver (since 2026).
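The post only says that attribution works by "correlating regression functions with recent pull requests"; it does not give the algorithm. As a minimal sketch of that general idea, and not FBDetect's actual method, the snippet below ranks recently landed PRs by how many regressed functions they touched. The types `PullRequest` and `RegressionEvent`, the function `rank_candidate_prs`, and the two-day lookback window are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class PullRequest:
    pr_id: str                          # hypothetical identifier
    landed_at: datetime
    touched_functions: frozenset[str]   # functions the PR modified


@dataclass(frozen=True)
class RegressionEvent:
    detected_at: datetime
    regressed_functions: frozenset[str]  # functions whose cost went up


def rank_candidate_prs(event, recent_prs, lookback=timedelta(days=2)):
    """Rank PRs that landed shortly before the regression by function overlap.

    Assumption: a PR is a candidate only if it landed within `lookback`
    before detection and touched at least one regressed function.
    """
    candidates = []
    for pr in recent_prs:
        age = event.detected_at - pr.landed_at
        if timedelta(0) <= age <= lookback:
            overlap = len(pr.touched_functions & event.regressed_functions)
            if overlap:
                candidates.append((overlap, pr.landed_at, pr))
    # Most overlapping functions first; ties go to the most recently landed PR.
    candidates.sort(key=lambda c: (c[0], c[1]), reverse=True)
    return [pr for _, _, pr in candidates]
```

The first element of the returned list plays the role of the "candidate root-cause PR" in the output contract above; an empty list would mean no recent PR overlaps the regressed functions.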
Position in Meta's operational-AI stack
FBDetect is the detector; the AI Regression Solver is the responder. Meta explicitly names this separation: "After a root cause is determined, engineers are typically notified and expected to take action, such as optimizing the recent code change. We've added an additional feature to make this faster: AI Regression Solver."
Before 2026:
- FBDetect → detect regression + attribute root-cause PR → notify engineer → engineer investigates and writes a mitigation (hours), or the PR is rolled back (velocity cost), or the regression is ignored (capacity cost).
After 2026:
- FBDetect → detect + attribute → AI Regression Solver → fix-forward PR sent to the original root-cause author for review → engineer reviews in minutes.
Why it matters at program scale
Meta ties FBDetect's throughput to the megawatt metric directly: "Meta's in-house regression detection tool, catches thousands of regressions weekly; faster automated resolution means fewer megawatts wasted compounding across the fleet." The compounding framing is the load-bearing economic argument: each undetected or slowly resolved regression keeps consuming fleet capacity until it is fixed, rather than incurring a one-time cost.
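The "ongoing, not one-time" point can be made concrete with simple arithmetic. The sketch below assumes, purely for illustration, that a regression raises the power draw of the affected part of the fleet by a fixed fraction for as long as it stays unresolved. The function name `wasted_energy_mwh` and every input value in the usage note are hypothetical; the post discloses no such figures.

```python
def wasted_energy_mwh(affected_power_mw: float,
                      regression_fraction: float,
                      hours_unresolved: float) -> float:
    """Energy wasted by one regression while it stays live, in MWh.

    Assumption: extra power scales linearly with the regression size,
    so the cost grows with every hour the regression is left unresolved.
    """
    return affected_power_mw * regression_fraction * hours_unresolved


def total_wasted_energy_mwh(regressions) -> float:
    """Sum the waste over many regressions.

    `regressions` is an iterable of
    (affected_power_mw, regression_fraction, hours_unresolved) tuples.
    """
    return sum(wasted_energy_mwh(*r) for r in regressions)
```

With a hypothetical 100 MW of affected capacity and a 0.1 % regression (`regression_fraction=0.001`), leaving the regression live for 24 hours wastes 48 times as much energy as resolving it in 30 minutes, because the cost is linear in time to resolution. That linear term is what faster automated resolution shrinks.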
What's NOT disclosed in the post
Most FBDetect internals live in the SOSP 2024 paper, not this post:
- Time-series decomposition + statistical model underlying the 0.005 % sensitivity claim.
- Change-point detection algorithms used.
- False-positive rate on the regression stream.
- Storage / indexing for the time-series substrate.
- Signal ingestion path (how function-level metrics reach FBDetect — presumably Strobelight-class profiling plus service-level counters).
- Root-cause-PR correlation algorithm.
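Since the statistical model is not disclosed here, the following is only a textbook illustration of why 0.005 % sensitivity is demanding, not a description of FBDetect. It uses the standard sample-size formula for a one-sided two-sample z-test, n = 2 ((z_α + z_β) σ / δ)², to estimate how many samples each of a "before" and "after" window would need. The function name `samples_per_window`, the default `alpha` and `power`, and the 1 % per-sample noise level in the usage note are assumptions.

```python
from statistics import NormalDist


def samples_per_window(relative_shift: float,
                       relative_noise: float,
                       alpha: float = 0.05,
                       power: float = 0.8) -> float:
    """Samples needed in each of two windows to detect a mean shift.

    relative_shift: regression size as a fraction of the baseline mean
                    (0.005 % is 0.00005).
    relative_noise: per-sample standard deviation as a fraction of the
                    baseline mean (hypothetical input).
    Uses the one-sided two-sample z-test formula with equal variances.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)  # false-positive control
    z_beta = std_normal.inv_cdf(power)       # detection-power control
    return 2 * ((z_alpha + z_beta) * relative_noise / relative_shift) ** 2
```

Under these assumptions, `samples_per_window(0.00005, 0.01)` comes out to roughly half a million samples per window, and halving the detectable shift quadruples that requirement. A simple before/after comparison therefore needs either very large sample counts or substantial noise reduction to reach this sensitivity; how FBDetect actually achieves it is covered in the SOSP 2024 paper.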
Seen in
- sources/2026-04-16-meta-capacity-efficiency-at-meta-how-unified-ai-agents-optimize-performance-at-hyperscale — canonical Meta Engineering disclosure.
- SOSP 2024 paper (linked, not ingested on the wiki): tangchq74.github.io/FBDetect-SOSP24.pdf.
Related
- companies/meta
- systems/meta-ai-regression-solver — the responder agent on top
- systems/meta-capacity-efficiency-platform — the unified platform of which FBDetect is the defense arm
- systems/strobelight — Meta's profiling orchestrator; likely data source
- systems/meta-rca-system — sibling Meta detection/investigation system (web-monorepo incidents vs performance-regression)
- concepts/capacity-efficiency
- concepts/offense-defense-performance-engineering
- patterns/ai-generated-fix-forward-pr