CONCEPT Cited by 1 source
Capacity efficiency¶
Definition¶
Capacity efficiency is the engineering discipline of reducing compute, memory, power, and capacity demand per unit of product value at hyperscale. At Meta's scale — "more than 3 billion people" — "even a 0.1% performance regression can translate to significant additional power consumption," so capacity efficiency is a first-class engineering function, not a periodic optimization pass (Source: sources/2026-04-16-meta-capacity-efficiency-at-meta-how-unified-ai-agents-optimize-performance-at-hyperscale).
The two sides¶
Meta's Capacity Efficiency program is explicitly two-sided:
- Offense: "searching for opportunities (proactive code changes) to make our existing systems more efficient, and deploying them."
- Defense: "monitoring resource usage in production to detect regressions, root-cause them to a pull request, and deploy mitigations."
Both sides pay in the same unit, megawatts of fleet power, but neither side is the whole picture: defense without offense only protects what you already have, and offense without defense leaks its gains back through regressions.
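As a rough illustration of why both sides share one ledger, the sketch below nets offensive wins against regressions that were never mitigated. The `Event` type, the `net_megawatts_saved` helper, and every number are hypothetical and chosen only for the example; nothing here reflects Meta's actual accounting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str                # "win" (offense) or "regression" (defense)
    megawatts: float         # size of the change in fleet power, always positive
    mitigated: bool = False  # only meaningful for regressions


def net_megawatts_saved(events):
    """Offensive wins minus regressions that were never mitigated."""
    saved = sum(e.megawatts for e in events if e.kind == "win")
    leaked = sum(
        e.megawatts
        for e in events
        if e.kind == "regression" and not e.mitigated
    )
    return saved - leaked


# Hypothetical quarter: two wins, one regression caught, one missed.
events = [
    Event("win", 2.0),
    Event("win", 1.5),
    Event("regression", 1.0, mitigated=True),
    Event("regression", 0.5, mitigated=False),
]
print(net_megawatts_saved(events))  # 3.0
```

With no defense at all, both regressions would leak and the same wins would net only 2.0 MW in this toy ledger.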
Why it's a program, not a project¶
The named constraint is human engineering time. Engineers have to:
- Query profiling data to find optimization candidates.
- Review opportunity descriptions, documentation, and past examples.
- Check recent code / configuration deployments for step-changes in resource usage.
- Check recent internal discussions for launch-correlated regressions.
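The third check in the list above, looking for a step-change in resource usage around a deployment, can be sketched as a simple before/after comparison of means. This is a minimal sketch under stated assumptions, not FBDetect's algorithm: the `step_change` function, its `min_relative_shift` threshold, and the sample data are all hypothetical.

```python
from statistics import mean


def step_change(samples, deploy_index, min_relative_shift=0.01):
    """Return the relative shift in mean usage across a deploy.

    samples: resource usage readings in time order (e.g. CPU percent).
    deploy_index: index of the first sample taken after the deployment.
    Returns None when the shift is smaller than min_relative_shift.
    """
    before = samples[:deploy_index]
    after = samples[deploy_index:]
    if not before or not after:
        raise ValueError("need samples on both sides of the deploy")
    baseline = mean(before)
    if baseline == 0:
        raise ValueError("baseline usage is zero; relative shift is undefined")
    shift = (mean(after) - baseline) / baseline
    return shift if abs(shift) >= min_relative_shift else None


# Hypothetical CPU readings; the deploy lands before the fifth sample.
cpu = [50.0, 50.2, 49.8, 50.0, 52.0, 52.1, 51.9, 52.0]
shift = step_change(cpu, deploy_index=4)
print(f"{shift:.1%}")  # 4.0%
```

A production detector would also have to handle noise, seasonality, and overlapping deploys, which is part of why this step costs engineers time when done by hand.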
"Many engineers at Meta use our efficiency tools to work on these problems every day. But no matter how high-quality the tooling is, engineers have limited time to address performance issues when innovating on new products is our top priority." Capacity efficiency as a program competes for the same engineer-hours as product work, so scaling megawatt delivery without proportionally scaling headcount becomes the operating constraint.
How Meta measures it¶
- Absolute: "hundreds of megawatts of power" recovered — "enough to power hundreds of thousands of American homes for a year." Program-level metric.
- Regression-rate: "thousands of regressions weekly" caught by FBDetect; "fewer megawatts wasted compounding across the fleet" is the economic model.
- Investigation-time compression: "~10 hours of manual investigation into ~30 minutes" — ~20× — as the direct bottleneck-lifting metric.
- Service-level (from the 2025-03-07 Strobelight post): "up to 20 % reduction in CPU cycles" for top-200-services FDO rollouts, "15,000 servers/year" for a single hot-path fix.
Relationship to other wiki concepts¶
- Distinct from: cost attribution / chargeback (chargeback) — capacity efficiency reduces the absolute cost, whereas chargeback assigns the current cost to its driver.
- Complementary to: rack-level power density (concepts/rack-level-power-density) — one is software-layer efficiency, the other is infrastructure-layer density. Both bear on total-fleet-power.
- Enabled by: the profiling + regression-detection + code-index stack — Strobelight, systems/fbdetect, Glean. Without these, neither offense nor defense has input.
Why AI matters here specifically¶
Capacity efficiency is the textbook long-tail problem:
- Each individual optimization is small (0.1 %, one hot function, one service).
- The fleet has millions of such opportunities.
- The cost of a human investigating any one of them exceeds the per-opportunity payoff.
AI that compresses investigation time by roughly 20× (and, where investigation is the bottleneck, multiplies per-engineer throughput by a similar factor) converts previously uneconomical optimizations into shipped fixes. In Meta's words: "AI-assisted opportunity resolution is expanding to more product areas every half, handling a growing volume of wins that engineers would never get to manually." The self-sustaining engine is the target end-state.
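The break-even argument can be made concrete with a toy model. The two investigation times come from the figures quoted on this page (~10 hours manual, ~30 minutes assisted). The hourly cost, the payoff values, and the `worth_fixing` helper are hypothetical, and payoff and cost are assumed to share one arbitrary unit.

```python
def worth_fixing(payoff, hours_to_investigate, cost_per_hour):
    """An opportunity is economical when its payoff exceeds the investigation cost."""
    return payoff > hours_to_investigate * cost_per_hour


MANUAL_HOURS = 10.0    # "~10 hours of manual investigation"
ASSISTED_HOURS = 0.5   # "~30 minutes"
COST_PER_HOUR = 100.0  # hypothetical engineer cost, arbitrary units

# Hypothetical long tail: one large opportunity, many small ones.
payoffs = [5000.0, 800.0, 300.0, 120.0, 40.0]

manual = [p for p in payoffs if worth_fixing(p, MANUAL_HOURS, COST_PER_HOUR)]
assisted = [p for p in payoffs if worth_fixing(p, ASSISTED_HOURS, COST_PER_HOUR)]

print(MANUAL_HOURS / ASSISTED_HOURS)  # 20.0
print(len(manual), len(assisted))     # 1 4
```

Under these made-up numbers, only the largest opportunity clears the manual threshold, while the shorter investigation makes four of the five worth pursuing. The smallest one remains uneconomical even at the lower cost.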
Seen in¶
- sources/2026-04-16-meta-capacity-efficiency-at-meta-how-unified-ai-agents-optimize-performance-at-hyperscale — canonical wiki disclosure of the program-level frame.
- sources/2025-03-07-meta-strobelight-a-profiling-service-built-on-open-source-technology — load-bearing tool for the offense side of the program.
- sources/2024-08-23-meta-leveraging-ai-for-efficient-incident-response — the operational-AI predecessor with the same closed-feedback-loop and confidence-thresholding discipline.
Related¶
- concepts/offense-defense-performance-engineering — the two-sided frame
- concepts/encoded-domain-expertise — the skill primitive AI efficiency agents consume
- concepts/rack-level-power-density — infrastructure-layer sibling
- concepts/hyperscale-compute-workload — the workload class capacity efficiency applies to
- systems/meta-capacity-efficiency-platform — Meta's program infrastructure
- systems/fbdetect — the defensive detector
- patterns/feedback-directed-optimization-fleet-pipeline — a specific offensive pipeline (FDO via Strobelight + BOLT)