

Capacity efficiency

Definition

Capacity efficiency is the engineering discipline of reducing compute, memory, power, and capacity demand per unit of product value at hyperscale. At Meta's scale (serving "more than 3 billion people"), "even a 0.1% performance regression can translate to significant additional power consumption," so capacity efficiency is a first-class engineering function, not a periodic optimization pass (Source: sources/2026-04-16-meta-capacity-efficiency-at-meta-how-unified-ai-agents-optimize-performance-at-hyperscale).
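To make the 0.1% figure concrete, a back-of-envelope sketch: multiplying a fleet-wide power figure by the regression fraction gives the wasted megawatts. The fleet power number below is a hypothetical placeholder for illustration, not a figure from the source.

```python
# Back-of-envelope: what a small relative regression costs at fleet scale.
FLEET_POWER_MW = 1_000        # hypothetical total fleet compute power, in MW
REGRESSION_FRACTION = 0.001   # a 0.1% performance regression

extra_mw = FLEET_POWER_MW * REGRESSION_FRACTION
print(f"A 0.1% regression on a {FLEET_POWER_MW} MW fleet wastes ~{extra_mw:.1f} MW")
```

At this assumed scale, a single 0.1% regression costs about a megawatt, which is why regressions are tracked in power terms rather than percentage points.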

The two sides

Meta's Capacity Efficiency program is explicitly two-sided:

  • Offense: "searching for opportunities (proactive code changes) to make our existing systems more efficient, and deploying them."
  • Defense: "monitoring resource usage in production to detect regressions, root-cause them to a pull request, and deploy mitigations."

Both sides pay in the same unit — megawatts of fleet power — but neither side is the whole picture: defense without offense protects what you have, offense without defense leaks gains back through regressions.

Why it's a program, not a project

The named constraint is human engineering time. Engineers have to:

  • Query profiling data to find optimization candidates.
  • Review opportunity descriptions, documentation, and past examples.
  • Check recent code / configuration deployments for step-changes in resource usage.
  • Check recent internal discussions for launch-correlated regressions.

"Many engineers at Meta use our efficiency tools to work on these problems every day. But no matter how high-quality the tooling is, engineers have limited time to address performance issues when innovating on new products is our top priority." Capacity efficiency as a program competes for the same engineer-hours as product work, so scaling megawatt delivery without proportionally scaling headcount becomes the operating constraint.

How Meta measures it

  • Absolute: "hundreds of megawatts of power" recovered — "enough to power hundreds of thousands of American homes for a year." Program-level metric.
  • Regression-rate: "thousands of regressions weekly" caught by FBDetect; "fewer megawatts wasted compounding across the fleet" is the economic model.
  • Investigation-time compression: "~10 hours of manual investigation into ~30 minutes" (≈20×) — the direct bottleneck-lifting metric.
  • Service-level (from the 2025-03-07 Strobelight post): "up to 20% reduction in CPU cycles" for top-200-services FDO rollouts, "15,000 servers/year" for a single hot-path fix.
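The investigation-time metric above reduces to simple arithmetic, sketched here; only the 10-hour and 30-minute figures come from the source, and the 40-hour work week is an illustrative assumption.

```python
manual_hours = 10.0    # ~10 hours of manual investigation (source figure)
assisted_hours = 0.5   # ~30 minutes with AI assistance (source figure)

compression = manual_hours / assisted_hours
print(f"Compression factor: ~{compression:.0f}x")  # ~20x

# Equivalent view: investigations one engineer can run per 40-hour week.
per_week_manual = 40 / manual_hours      # 4 investigations/week
per_week_assisted = 40 / assisted_hours  # 80 investigations/week
```

The same 20× shows up either way: as time saved per investigation, or as per-engineer throughput.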

Relationship to other wiki concepts

  • Distinct from: cost attribution / chargeback (chargeback) — capacity efficiency is about reducing the absolute cost, chargeback is about assigning the current cost to its driver.
  • Complementary to: rack-level power density (concepts/rack-level-power-density) — one is software-layer efficiency, the other is infrastructure-layer density. Both bear on total-fleet-power.
  • Enabled by: the profiling + regression-detection + code-index stack — Strobelight, systems/fbdetect, Glean. Without these, neither offense nor defense has input.

Why AI matters here specifically

Capacity efficiency is the textbook long-tail problem:

  • Each individual optimization is small (0.1%, one hot function, one service).
  • The fleet has millions of such opportunities.
  • The cost of a human investigating any one of them exceeds the per-opportunity payoff.
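The long-tail economics can be sketched as a threshold model: an optimization ships only when its payoff exceeds the cost of investigating it. All dollar figures below are hypothetical; only the ~10-hour and ~20× numbers echo the source.

```python
# Threshold model: cutting investigation cost 20x moves the economic cutoff.
ENGINEER_HOUR_COST = 150.0                 # hypothetical $/engineer-hour
manual_cost = 10 * ENGINEER_HOUR_COST      # ~10 h manual investigation
assisted_cost = manual_cost / 20           # ~20x compression

def economical(payoff_dollars: float, investigation_cost: float) -> bool:
    """An opportunity is worth pursuing only if its payoff exceeds
    the cost of finding and fixing it."""
    return payoff_dollars > investigation_cost

# A small long-tail win worth $500 in recovered capacity:
small_win = 500.0
print(economical(small_win, manual_cost))    # False: uneconomical by hand
print(economical(small_win, assisted_cost))  # True: economical with AI
```

The same $500 opportunity flips from uneconomical to economical once the investigation cost drops below the payoff, which is the mechanism behind the next paragraph.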

AI that compresses investigation time by 20× (equivalently, multiplying per-engineer throughput by 20×) converts previously uneconomical optimizations into shipped fixes. As Meta puts it: "AI-assisted opportunity resolution is expanding to more product areas every half, handling a growing volume of wins that engineers would never get to manually." The self-sustaining engine is the target end-state.
